8/3/2019 01 Linear Algebra

Linear Algebra
1 Vector Spaces
The multiplication of a vector by a constant and the addition of two
vectors are familiar ideas; their abstraction and generalization lead to
the concept of vector spaces.
[Figure: a vector x, its scaling 2x, and the sum x + y of two vectors x and y.]
Definition 1. A vector space is a set V of elements called vectors satisfying the following axioms.
1. For every x, y, z ∈ V, there is an operation called vector addition, such that
(a) x + y = y + x (commutative);
(b) x + (y + z) = (x + y) + z (associative);
(c) there exists in V a unique vector 0 (called the zero vector) such that x + 0 = x for every x ∈ V;
(d) for every x ∈ V there exists a unique vector -x such that x + (-x) = 0.
2. For every x, y ∈ V and every α, β ∈ F, where F is a field, there is an operation called scalar multiplication, such that
(a) α(βx) = (αβ)x (associative);
(b) 1x = x for every x ∈ V, where 1 is the unit element of F under multiplication;
C.P. Kwong
(c) α(x + y) = αx + αy (distributive with respect to vector addition);
(d) (α + β)x = αx + βx (distributive with respect to scalar addition).
Remark. Without giving a formal definition of a field, we simply note that the set of all real numbers, denoted R, is a field equipped with the usual arithmetic of addition/subtraction and multiplication/division. The unit element of this field is exactly the number "1". The set of all complex numbers is another example of a field. Sometimes we say "V is a vector space over F" to emphasize the relationship between V and its underlying field F.
Example 1. (The n-tuple space, Fn.) Let F be any field and let V be the set of all n-tuples x = (u1, u2, . . . , un) of scalars ui ∈ F. If y = (w1, w2, . . . , wn) with wi ∈ F, define the addition of x and y as
x + y = (u1 + w1, u2 + w2, . . . , un + wn)
and the multiplication of x by a scalar α ∈ F as
αx = (αu1, αu2, . . . , αun).
It can be proved that the defined operations satisfy the axioms of a vector space and hence V is a vector space over F.
Example 2. (The space of m × n matrices, Fm×n.) Let F be any field and m, n be positive integers. The set of all m × n matrices with elements in F is a vector space under the usual matrix addition and multiplication of a matrix by a scalar.
Example 3. (The space of continuous functions, C[a, b].) Let V be the set of all real-valued, continuous functions of t, t ∈ [a, b].
[Figure: two continuous functions x(t) and y(t) plotted over the interval [a, b].]
Define, for x, y ∈ V and α ∈ R, the following operations:
(x + y)(t) = x(t) + y(t),
(αx)(t) = αx(t),
where addition of two continuous functions and multiplication of a continuous function by a real scalar are defined in the usual point-wise manner. Then V is a vector space over R. Note that if x and y are continuous real-valued functions over [a, b] and α is real, so are x + y and αx.
Definition 2. A vector x ∈ V is said to be a linear combination of the vectors y1, y2, . . . , yn ∈ V provided that there exist scalars α1, α2, . . . , αn ∈ F such that
x = α1y1 + α2y2 + ⋯ + αnyn = ∑_{i=1}^{n} αiyi.  (1)
Definition 3. Let V be a vector space over F. The distinct vectors x1, x2, . . . , xn ∈ V are said to be linearly dependent if there exist scalars α1, α2, . . . , αn ∈ F, not all of which are zero, such that
∑_{i=1}^{n} αixi = 0.  (2)
Vectors that are not linearly dependent are linearly independent.
Remark. If x1, x2, . . . , xn ∈ V are linearly dependent, then at least one xi in the set {x1, x2, . . . , xn} (any xi whose coefficient αi is nonzero) can be expressed as a linear combination of the remaining vectors in the set.
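The remark above can be checked numerically: stacking the vectors as columns of a matrix, dependence shows up as a rank deficiency. A minimal sketch (the function name and the use of NumPy are my own, not part of the notes):

```python
import numpy as np

# Real vectors stacked as the columns of a matrix are linearly dependent
# exactly when that matrix has rank smaller than the number of vectors.
def linearly_dependent(vectors):
    M = np.column_stack([np.asarray(v, dtype=float) for v in vectors])
    return bool(np.linalg.matrix_rank(M) < len(vectors))

print(linearly_dependent([(1, 0), (0, 1)]))          # False
print(linearly_dependent([(1, 0), (0, 1), (1, 1)]))  # True: (1,1) = (1,0) + (0,1)
```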
Given the following two vectors ε1 and ε2 on a plane:

[Figure: two non-collinear vectors ε1 and ε2.]

It seems that any vector x on this plane can be written as x = α1ε1 + α2ε2 for some α1, α2 ∈ R. For example, in the following diagram, x = 0.7ε1 - 0.5ε2.

[Figure: x expressed as the combination 0.7ε1 - 0.5ε2 of ε1 and ε2.]

However, the following ε1, ε2 cannot perform the same function:

[Figure: two collinear vectors ε1 and ε2.]
Definition 4. A basis in a vector space V is a set of linearly independent vectors such that every vector in V is a linear combination of this set of vectors. The number of vectors that constitutes a basis is called the dimension of V, denoted dim V. If dim V is finite, V is a finite-dimensional vector space.
Definition 5. A set of vectors {εi} is said to span a vector space V if every vector in V can be written as a linear combination of {εi}. (Note that {εi} may not be linearly independent and hence may not be a basis.)
Given a basis {ε1, ε2, . . . , εn} for an n-dimensional vector space V, a vector x ∈ V can be written as
x = α1ε1 + α2ε2 + ⋯ + αnεn,  αi ∈ F.  (3)
[α1; α2; . . . ; αn] is called the coordinate vector of x in the basis formed by {ε1, ε2, . . . , εn}.

A vector is "free", "floating", if no basis is specified:

[Figure: a lone vector x with no basis attached.]

The vector is "fixed" whenever a basis is given:

[Figure: the same vector x resolved against the basis vectors ε1 and ε2.]

It is obvious that there exist many bases for a given vector space, and different coordinate vectors result from different bases for the same x. What is the relationship between these coordinate vectors? The answer is: two coordinate vectors are related by a coordinate transformation effected by an n × n matrix with elements in F.

Let {ε1, ε2, . . . , εn} be a basis for V and {ε̄1, ε̄2, . . . , ε̄n} be another basis for the same V. Thus a vector x ∈ V can be written either as
x = α1ε1 + α2ε2 + ⋯ + αnεn,  αi ∈ F,  (4)
or as
x = ᾱ1ε̄1 + ᾱ2ε̄2 + ⋯ + ᾱnε̄n,  ᾱi ∈ F.  (5)
Since ε̄1, ε̄2, . . . , ε̄n are vectors in V, we can write
ε̄1 = a11ε1 + a21ε2 + ⋯ + an1εn,
⋮
ε̄n = a1nε1 + a2nε2 + ⋯ + annεn,  aij ∈ F,  (6)
i.e.,
[ε̄1 ⋯ ε̄n] = [ε1 ⋯ εn][a11 a12 ⋯ a1n; a21 a22 ⋯ a2n; ⋮; an1 an2 ⋯ ann] = [ε1 ⋯ εn]A,  (7)
where A is an n × n matrix of scalars in F. However, since
x = [ε1 ⋯ εn][α1; ⋮; αn] = [ε̄1 ⋯ ε̄n][ᾱ1; ⋮; ᾱn],  (8)
therefore
[ε1 ⋯ εn]A[ᾱ1; ⋮; ᾱn] = [ε1 ⋯ εn][α1; ⋮; αn].  (9)
This last equation holds because both sides represent the same x in the basis {ε1, ε2, . . . , εn}, whose vectors are linearly independent; hence
[α1; ⋮; αn] = A[ᾱ1; ⋮; ᾱn].  (10)
Thus the matrix A acts as a coordinate transformation that relates the coordinates in the two bases for the same vector.
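The derivation above can be played out with concrete numbers. In the sketch below (both bases are my own choices, not from the notes), the columns of E_old and E_new hold the basis vectors, equation (7) determines A, and equation (10) relates the two coordinate vectors:

```python
import numpy as np

# Columns of E_old and E_new are the old and new basis vectors of R^2.
E_old = np.array([[1.0, 1.0],
                  [0.0, 1.0]])
E_new = np.array([[2.0, 1.0],
                  [1.0, 1.0]])

# Equation (7): [new basis] = [old basis] A, hence A = E_old^{-1} E_new.
A = np.linalg.solve(E_old, E_new)

alpha_bar = np.array([3.0, -1.0])   # coordinates of some x in the new basis
alpha = A @ alpha_bar               # equation (10): coordinates in the old basis

# Both coordinate vectors reconstruct the same vector x.
print(np.allclose(E_old @ alpha, E_new @ alpha_bar))  # True
```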
Definition 6. A subset W of a vector space V is a subspace of V if for every pair x and y of vectors contained in W, every linear combination αx + βy, α, β ∈ F, is also contained in W.
Example 4. In the following figure, W1 is a 1-dimensional subspace and W2 is a 2-dimensional subspace, of R3.

[Figure: a line W1 and a plane W2 through the origin of R3.]

It can be shown, using its definition, that a subspace is itself a vector space. Moreover, any subspace must contain the zero vector, since for any x ∈ W, x - x is by definition in W.
Theorem 1. Let V be a vector space over the field F. The intersection of
any collection of subspaces of V is a subspace of V.
Proof. Let {Wi} be a collection of subspaces of V and W = ∩i Wi their intersection. Since each Wi is a subspace containing the zero vector 0, 0 ∈ W. Let x, y ∈ W and α, β ∈ F. By the definition of W, x, y ∈ Wi for all i. Because each Wi is a subspace, (αx + βy) ∈ Wi for all i. Therefore αx + βy is again in W and W is a subspace by definition.
Definition 7. Let U and W be subspaces of a vector space V over F. The sum of U and W, denoted by U + W, is the subset of V consisting of all sums u + w with u ∈ U and w ∈ W.
Remark. It is easy to show that U + W is a subspace of V.
Definition 8. A vector space V is said to be the direct sum of two subspaces U and W, written as
V = U ⊕ W,  (11)
if each v ∈ V has a unique representation
v = u + w,  u ∈ U, w ∈ W.  (12)
Moreover, W is called the algebraic complement of U in V and vice versa.
Example 5. In the following figure V = R2 and U, W1, and W2 are subspaces of V.

[Figure: three distinct lines U, W1, and W2 through the origin of R2, with W1 perpendicular to U.]

We have
V = U ⊕ W1 = U ⊕ W2,  (13)
and W1 is the "orthogonal" complement of U.
Theorem 2. Let V be a vector space over F, and U and W be subspaces of V. If U + W = V and U ∩ W = {0}, then V is the direct sum of U and W.
Proof. Given any v ∈ V. Since U + W = V, there exist u ∈ U and w ∈ W such that v = u + w. Suppose there also exist u′ ∈ U and w′ ∈ W such that v = u′ + w′. Then
u + w = u′ + w′.
It follows that
u - u′ = w′ - w.
But u - u′ ∈ U and w′ - w ∈ W, and therefore u - u′ = w′ - w ∈ U ∩ W. Since U ∩ W = {0}, u - u′ = 0 and w′ - w = 0, and hence u = u′ and w = w′. That means the representation v = u + w is unique. The theorem is proved.
Theorem 3. Let V be a finite-dimensional vector space over F. If V is the direct sum of U and W, then
dim V = dim U + dim W.  (14)
Proof. Let {u1, u2, . . . , ur} be a basis of U and {w1, w2, . . . , ws} be a basis of W. Then every u ∈ U has a unique representation
u = α1u1 + α2u2 + ⋯ + αrur,  αi ∈ F,
and every w ∈ W has a unique representation
w = β1w1 + β2w2 + ⋯ + βsws,  βi ∈ F.
Since V = U ⊕ W, every v ∈ V has a unique representation
v = α1u1 + α2u2 + ⋯ + αrur + β1w1 + β2w2 + ⋯ + βsws.
Therefore {u1, u2, . . . , ur, w1, w2, . . . , ws} is a basis of V with dimension r + s.
2 Mappings
Let X and Y be sets and A ⊆ X a subset (of X). A mapping T from A into Y associates with each x ∈ A a single y ∈ Y called the image of x under T. We write y = Tx. The set A is called the domain of definition of T, or simply the domain of T, denoted by D(T). Notationally, we write
T : D(T) → Y.
The range of T, denoted by R(T), is the set of all images:
R(T) = {y ∈ Y | y = Tx for some x ∈ D(T)}.
Note that D(T) is not necessarily the whole X, and R(T) is not necessarily the whole Y.
If R(T) is the whole Y, then T is said to be onto, or surjective.
There may be more than one element in D(T) that is mapped to a single element in R(T):

[Figure: two elements x1 and x2 both mapped by T to the same image y = Tx1 = Tx2.]

If x1 ≠ x2 implies Tx1 ≠ Tx2 for every x1, x2 ∈ D(T), then T is called one-to-one or injective.
Given a y ∈ R(T), the inverse image of y is the set of all x ∈ D(T) such that Tx = y. For an injective mapping T : D(T) → Y, we can define a mapping T⁻¹ : R(T) → D(T), called the inverse of T, such that y ∈ R(T) is mapped (by T⁻¹) to that x ∈ D(T) for which Tx = y. However, an inverse mapping cannot be defined if T is not injective. (Why?)
3 Linear Operators
The following figure shows two functions y = f(x) = ax and y = g(x) = bx², where a and b are real constants:

[Figure: the line f(x) = ax and the parabola g(x) = bx² plotted against x.]
Let x = 1. Then f(x) = f(1) = a and g(x) = g(1) = b. Next, let x = 3. Then f(x) = f(3) = 3a and g(x) = g(3) = 9b. Suppose now
x = 1 + 3 = 4;
we have
f(x) = f(4) = 4a = f(1) + f(3).
However,
g(x) = g(4) = 16b ≠ g(1) + g(3).
The function f(x) is "linear" in this sense. "Linear operators" are mappings between vector spaces which possess a similar linear property.
Definition 9. Let T be a mapping with domain D(T) and range R(T). T is a linear operator if D(T) is a vector space over F and R(T) is a subset of a vector space also over F. Moreover, for any x, y ∈ D(T) and any α ∈ F,
T(x + y) = Tx + Ty  (15)
and
T(αx) = αTx.  (16)
Example 6. Let A be an m × n matrix with elements aij ∈ F, where F is a field. The mapping defined by Tx = Ax, x ∈ Fn, where Ax is the usual matrix multiplication, is a linear operator from Fn into Fm.
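Example 6 is easy to probe numerically. The sketch below (A and the test vectors are arbitrary choices of mine) confirms that x ↦ Ax satisfies the two linearity conditions (15) and (16):

```python
import numpy as np

# An arbitrary 2 x 3 real matrix defines T x = A x from R^3 to R^2.
A = np.array([[1.0, 2.0, 0.0],
              [0.0, -1.0, 3.0]])
rng = np.random.default_rng(0)
x, y = rng.standard_normal(3), rng.standard_normal(3)
alpha = 2.5

additive = np.allclose(A @ (x + y), A @ x + A @ y)           # T(x + y) = Tx + Ty
homogeneous = np.allclose(A @ (alpha * x), alpha * (A @ x))  # T(alpha x) = alpha Tx
print(additive, homogeneous)  # True True
```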
Example 7. Let x(t) ∈ C[a, b] be a continuous function from [a, b] into R. Define a mapping T : C[a, b] → Y as follows:
y(t) = Tx = ∫_a^t x(τ) dτ,  t ∈ [a, b].
From the theory of integration, y(t) is also a continuous function over t ∈ [a, b]. Moreover, for any x, y ∈ C[a, b] and α, β ∈ R,
∫_a^t [αx(τ) + βy(τ)] dτ = α ∫_a^t x(τ) dτ + β ∫_a^t y(τ) dτ,  t ∈ [a, b].
Therefore T is a linear operator from C[a, b] into itself, i.e., T : C[a, b] → C[a, b].
Definition 10. The null space of a linear operator T, denoted by N(T), is the set of all x ∈ D(T) such that Tx = 0, where 0 is the zero vector of R(T).
Theorem 4. Let T be a linear operator. Then R(T) and N(T) are vector spaces.
Proof. Let y1, y2 be any two vectors in R(T) and α, β be any two scalars in F. Then there must exist two vectors x1 and x2 in D(T) such that Tx1 = y1 and Tx2 = y2. Since T is linear,
T(αx1 + βx2) = αTx1 + βTx2 = αy1 + βy2.
Therefore αy1 + βy2 is in R(T). This shows that R(T) is a vector space.
For any x1, x2 ∈ N(T), Tx1 = Tx2 = 0 by definition. The linear combination of x1 and x2, being αx1 + βx2, is also in N(T) since T(αx1 + βx2) = αTx1 + βTx2 = 0. This shows that N(T) is a vector space.
Example 8. Given
[2 -2; 1 -1][x; y] = [0; 0],
which leads to the simultaneous equations
2x - 2y = 0;
x - y = 0.
The solution gives the null space represented by the line x = y:

[Figure: the line x = y in the (x, y)-plane is mapped by T to the origin.]
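The null space of Example 8 can be extracted numerically. One standard route (my own sketch, not from the notes) is the SVD: the rows of Vt whose singular values are zero span N(A).

```python
import numpy as np

# The null space of A = [[2, -2], [1, -1]] should be the line x = y.
A = np.array([[2.0, -2.0],
              [1.0, -1.0]])
U, s, Vt = np.linalg.svd(A)
rank = int(np.sum(s > 1e-10))
null_basis = Vt[rank:]                 # rows spanning N(A)
v = null_basis[0]
print(np.allclose(A @ v, 0))           # True: v is mapped to the zero vector
print(np.isclose(v[0], v[1]))          # True: v lies on the line x = y
```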
Theorem 5. Let T be a linear operator. If D(T) is finite-dimensional, then
dim D(T) = dim R(T) + dim N(T).  (17)
Proof. Suppose N(T) is k-dimensional. Then there are k vectors ε1, ε2, . . . , εk that form a basis for N(T). Suppose dim D(T) = n. Then there are linearly independent vectors εk+1, εk+2, . . . , εn in D(T) such that {ε1, ε2, . . . , εk, εk+1, εk+2, . . . , εn} is a basis for D(T). We shall prove that {Tεk+1, Tεk+2, . . . , Tεn} is a basis for R(T).
The vectors Tε1, Tε2, . . . , Tεn certainly span R(T), i.e., any vector in R(T) is a linear combination of the Tεi, i = 1, 2, . . . , n. However, since εi, 1 ≤ i ≤ k, are in N(T), Tεi = 0 for 1 ≤ i ≤ k. It follows that R(T) is indeed spanned by the smaller set of vectors Tεk+1, Tεk+2, . . . , Tεn. It remains to show that they are linearly independent.
Suppose there are scalars αi ∈ F such that
∑_{i=k+1}^{n} αi(Tεi) = 0.
Since T is linear, we have
T(∑_{i=k+1}^{n} αiεi) = 0,
and hence the vector x = ∑_{i=k+1}^{n} αiεi is in N(T). Since {ε1, ε2, . . . , εk} is a basis for N(T), there must exist scalars β1, β2, . . . , βk such that
x = ∑_{i=1}^{k} βiεi.
It follows that
∑_{i=1}^{k} βiεi - ∑_{i=k+1}^{n} αiεi = 0.
Since ε1, ε2, . . . , εn are linearly independent, we must have
β1 = β2 = ⋯ = βk = αk+1 = αk+2 = ⋯ = αn = 0.
This shows that only the trivial combination of Tεk+1, Tεk+2, . . . , Tεn vanishes, so they are linearly independent and hence form a basis for R(T). Since dim N(T) = k and dim D(T) = n, the theorem follows.
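Theorem 5 can be illustrated for the operator Tx = Ax. In the sketch below (the 4 × 3 matrix is an arbitrary example of mine), the SVD exposes dim R(T) as the number of nonzero singular values and N(T) as the right singular directions that A sends to zero:

```python
import numpy as np

# A has n = 3 columns; its rank is 2 (row 2 = 2 * row 1, row 4 = row 1 + row 3).
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [0.0, 1.0, 1.0],
              [1.0, 3.0, 4.0]])
U, s, Vt = np.linalg.svd(A)
dim_range = int(np.sum(s > 1e-10))                 # dim R(T)
dim_null = sum(np.allclose(A @ v, 0) for v in Vt)  # dim N(T), counted independently
print(dim_range, dim_null)                         # 2 1
print(dim_range + dim_null == A.shape[1])          # True: equation (17)
```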
The inverse of an operator is an important concept in both theory and application. Suppose we are given y and T; we ask: what is x such that Tx = y?

[Figure: T maps x to y and T⁻¹ maps y back to x.]

Definition 11. The inverse of a linear operator T : D(T) → R(T), if it exists, is the mapping T⁻¹ : R(T) → D(T) such that for every y ∈ R(T), T⁻¹y = x, where x ∈ D(T) and Tx = y.
The following is an existence theorem for the inverse.
Theorem 6. Given a linear operator T, the inverse T⁻¹ : R(T) → D(T) exists if and only if
Tx = 0 implies x = 0.  (18)
Moreover, T⁻¹ is linear.
Proof. It is easy to see that T⁻¹ exists if T is one-to-one, i.e., if for any x1, x2 ∈ D(T),
Tx1 = Tx2 ⟹ x1 = x2.
We first prove the "if" part, i.e.,
(Tx = 0 ⟹ x = 0) implies T⁻¹ exists.
Suppose there are x1 and x2 such that Tx1 = Tx2. Since T is linear,
T(x1 - x2) = Tx1 - Tx2 = 0.
But (x1 - x2) ∈ D(T) and T(x1 - x2) = 0 implies x1 - x2 = 0 (by the assumption), or x1 = x2. Therefore Tx1 = Tx2 implies x1 = x2 and T⁻¹ exists.
Next, we prove the "only if" part, i.e.,
T⁻¹ exists implies (Tx = 0 ⟹ x = 0).
Let x1 be the zero vector in D(T). Since T is linear, T0 = T(x + (-x)) = Tx - Tx = 0, where x ∈ D(T). It follows that Tx1 = 0. Suppose there is another vector x2 ≠ 0 such that Tx2 = 0. Then we can write
Tx1 = Tx2 = 0.
However, if T⁻¹ exists, then, for any x1, x2 ∈ D(T), Tx1 = Tx2 implies x1 = x2. Consequently, x1 = x2 = 0, i.e., the zero vector is the unique vector x such that Tx = 0.
Finally, let x1, x2 ∈ D(T) and write y1 = Tx1 and y2 = Tx2. Then y1, y2 ∈ R(T). If T⁻¹ exists, we have x1 = T⁻¹y1 and x2 = T⁻¹y2. Since T is linear, for any α, β ∈ F,
αy1 + βy2 = αTx1 + βTx2 = T(αx1 + βx2).
Thus
T⁻¹(αy1 + βy2) = T⁻¹T(αx1 + βx2) = αx1 + βx2,
or
T⁻¹(αy1 + βy2) = αT⁻¹y1 + βT⁻¹y2.
This proves that T⁻¹ is linear.
Corollary 1. If D(T) is finite-dimensional and T⁻¹ exists, then dim R(T) = dim D(T).
Proof. We have proved that
dim D(T) = dim R(T) + dim N(T)
and that the existence of T⁻¹ implies
Tx = 0 ⟹ x = 0.
Therefore N(T) contains only the zero vector and hence dim N(T) = 0. The corollary follows.
We show in the following that there is a matrix associated with a linear operator, which depends on the choice of bases of D(T) and R(T).
Let T : D(T) → R(T) be a linear operator. Suppose {ε1, ε2, . . . , εn} is a basis for D(T) and {η1, η2, . . . , ηm} is a basis for R(T). Then a vector x ∈ D(T) can be expressed as
x = ∑_{i=1}^{n} αiεi,  αi ∈ F.  (19)
Note that [α1; α2; . . . ; αn] is the coordinate vector of x relative to the basis {ε1, ε2, . . . , εn}. Since T is linear, we have
Tx = ∑_{i=1}^{n} αiTεi,  αi ∈ F.  (20)
Let y = Tx. Clearly y ∈ R(T) and we can write
y = ∑_{i=1}^{m} βiηi,  βi ∈ F,  (21)
where [β1; β2; . . . ; βm] is the coordinate vector of y relative to the basis {η1, η2, . . . , ηm}. A matrix A arises naturally as representing the operation of T on the vectors ε1, ε2, . . . , εn:
Tε1 = a11η1 + a21η2 + ⋯ + am1ηm,
⋮
Tεn = a1nη1 + a2nη2 + ⋯ + amnηm,  aij ∈ F,  (22)
or
[Tε1 Tε2 ⋯ Tεn] = [η1 η2 ⋯ ηm]A,  A = [a11 a12 ⋯ a1n; a21 a22 ⋯ a2n; ⋮; am1 am2 ⋯ amn].  (23)
Since y = Tx, we have
[η1 η2 ⋯ ηm][β1; β2; . . . ; βm] = [Tε1 Tε2 ⋯ Tεn][α1; α2; . . . ; αn].  (24)
Therefore
[η1 η2 ⋯ ηm][β1; β2; . . . ; βm] = [η1 η2 ⋯ ηm]A[α1; α2; . . . ; αn],  (25)
which gives
[β1; β2; . . . ; βm] = A[α1; α2; . . . ; αn].  (26)
We see that multiplying the coordinate vector of x ∈ D(T) by A gives the coordinate vector of y = Tx in R(T). A is called the matrix representation of T. Note however that A depends on the choice of the bases {εi} and {ηi}.
Example 9. A linear operator T effects the following mapping of two vectors in R2:

[Figure: T maps x1 = (2, 1) to (-1, -4) and x2 = (-2, 1) to (3, 6).]

We have, in the basis formed by (1, 0) and (0, 1),
x1 = 2(1, 0) + 1(0, 1).
Similarly,
Tx1 = -1(1, 0) - 4(0, 1)
in the same basis. We then have, for the mapping Tx1,
[a11 a12; a21 a22][2; 1] = [-1; -4],
and for the mapping Tx2,
[a11 a12; a21 a22][-2; 1] = [3; 6].
Hence
2a11 + a12 = -1,
2a21 + a22 = -4,
-2a11 + a12 = 3,
-2a21 + a22 = 6.
Solving gives
A = [a11 a12; a21 a22] = [-1 1; -2.5 1].
Suppose a new basis is chosen and the matrix which relates the coordinate vectors in the old and the new bases is
P = [2 -1; 1 -1].
Thus the new coordinate vectors for [2; 1] and [-1; -4] are
P[2; 1] = [3; 1] and P[-1; -4] = [2; 3].
We ask what is the matrix B such that
B[3; 1] = [2; 3],
i.e., what is the matrix representation of the linear operator T in the new basis?
Since P[2; 1] = [3; 1] and P must be nonsingular (why?), P⁻¹ exists, and we have
P⁻¹P[2; 1] = P⁻¹[3; 1], i.e., [2; 1] = P⁻¹[3; 1].
It follows that
A[2; 1] = AP⁻¹[3; 1].  (27)
Similarly,
P⁻¹[2; 3] = [-1; -4].  (28)
But
A[2; 1] = [-1; -4],
and then (27) can be rewritten as
AP⁻¹[3; 1] = [-1; -4].  (29)
Substituting (28) into (29) gives
AP⁻¹[3; 1] = P⁻¹[2; 3],
or
PAP⁻¹[3; 1] = [2; 3].
Therefore, we obtain
B = PAP⁻¹ = [2 -1; 1 -1][-1 1; -2.5 1][1 -1; 1 -2] = [1.5 -2.5; 1.5 -1.5].
It is easy to check that
[1.5 -2.5; 1.5 -1.5][3; 1] = [2; 3].
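The arithmetic of Example 9 can be verified numerically. The signs below (x1 = (2, 1) mapped to (-1, -4), x2 = (-2, 1) mapped to (3, 6)) are one consistent reading of the example:

```python
import numpy as np

A = np.array([[-1.0, 1.0],
              [-2.5, 1.0]])
P = np.array([[2.0, -1.0],
              [1.0, -1.0]])
B = P @ A @ np.linalg.inv(P)

print(np.allclose(A @ [2, 1], [-1, -4]))           # T x1
print(np.allclose(A @ [-2, 1], [3, 6]))            # T x2
print(np.allclose(P @ [2, 1], [3, 1]))             # new coordinates of x1
print(np.allclose(B, [[1.5, -2.5], [1.5, -1.5]]))  # B = P A P^{-1}
print(np.allclose(B @ [3, 1], [2, 3]))             # B acting on new coordinates
```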
Example 10. Let T : D(T) → R(T) be a linear operator where D(T) and R(T) are in the same vector space X over F. An eigenvalue of T is a scalar λ ∈ F such that there is a nonzero vector g ∈ D(T) with Tg = λg; g is called the eigenvector of T associated with λ.
Suppose dim D(T) = n is finite and suppose we can find n linearly independent eigenvectors g1, g2, . . . , gn of T. Then {gi} is a basis for D(T) and every x ∈ D(T) can be written as
x = ∑_{i=1}^{n} αigi,  αi ∈ F.
Then
Tx = ∑_{i=1}^{n} αiTgi = ∑_{i=1}^{n} αiλigi = ∑_{i=1}^{n} βigi,
where βi = λiαi ∈ F. That means, relative to the same basis {gi}, the coordinate vector of Tx, [β1; β2; . . . ; βn], is related to the coordinate vector of x, [α1; α2; . . . ; αn], by
[β1; β2; . . . ; βn] = A[α1; α2; . . . ; αn],
where A has a simple form in this basis of eigenvectors:
A = [λ1 ⋯ 0; ⋮ ⋱ ⋮; 0 ⋯ λn].
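Example 10 is the familiar diagonalization of a matrix. A minimal sketch (the matrix M is an arbitrary choice of mine): the columns of G are n = 2 independent eigenvectors, and the representation of x ↦ Mx in the basis {gi} is the diagonal matrix of eigenvalues.

```python
import numpy as np

M = np.array([[2.0, 1.0],
              [1.0, 2.0]])
eigvals, G = np.linalg.eig(M)          # columns of G are eigenvectors g_i
A_diag = np.linalg.inv(G) @ M @ G      # the operator's matrix in the eigenbasis
print(np.allclose(A_diag, np.diag(eigvals)))  # True
```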
4 Symmetric Matrices and Quadratic Forms
The transpose of a matrix A, denoted Aᵀ, is obtained by taking as its ith column the ith row of A.
Example 11.
1. A1 = [1 2 3; -1 0 4], A1ᵀ = [1 -1; 2 0; 3 4].
2. A2 = [1 -1; 4 2], A2ᵀ = [1 4; -1 2].
Let A1 and A2 be given by the above two examples. We construct
B1 = A1A1ᵀ = [1 2 3; -1 0 4][1 -1; 2 0; 3 4] = [14 11; 11 17],
B2 = A1ᵀA1 = [1 -1; 2 0; 3 4][1 2 3; -1 0 4] = [2 2 -1; 2 4 6; -1 6 25],
B3 = A2A2ᵀ = [2 2; 2 20],
B4 = A2ᵀA2 = [17 7; 7 5].
We observe two interesting properties of the matrices B1 to B4. First, they are all square. Second,
B1ᵀ = B1, B2ᵀ = B2, B3ᵀ = B3, B4ᵀ = B4.
In other words, every one of these matrices equals its respective transpose. We call such matrices symmetric.
The following properties of a transpose are easy to prove:
1. (Aᵀ)ᵀ = A;
2. (kA)ᵀ = kAᵀ, k a constant;
3. (A + B)ᵀ = Aᵀ + Bᵀ;
4. (AB)ᵀ = BᵀAᵀ;
5. (A⁻¹)ᵀ = (Aᵀ)⁻¹.
We prove in the following the last property.
Since AA⁻¹ = I and A⁻¹A = I, where I is the identity matrix, we have
(AA⁻¹)ᵀ = (A⁻¹)ᵀAᵀ = Iᵀ = I
and
(A⁻¹A)ᵀ = Aᵀ(A⁻¹)ᵀ = Iᵀ = I.
That means (A⁻¹)ᵀ is the inverse of Aᵀ.
Now it is easy to see why AAᵀ and AᵀA are necessarily symmetric, because
(AAᵀ)ᵀ = (Aᵀ)ᵀAᵀ = AAᵀ,
(AᵀA)ᵀ = Aᵀ(Aᵀ)ᵀ = AᵀA.
A symmetric matrix is necessarily square, since the requirement Aᵀ = A implies that A and Aᵀ must have the same dimension, and Aᵀ and A can have the same dimension only if A is square. It follows that we can talk about A⁻¹ if A is symmetric. The inverse may not exist. For example,
A = [1 2; 2 4]
is symmetric but has no inverse (since det A = 0). However, if the inverse of a symmetric matrix exists, it must also be symmetric. This is because (A⁻¹)ᵀ = (Aᵀ)⁻¹, and if A is symmetric, Aᵀ = A, so (A⁻¹)ᵀ = A⁻¹.
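These transpose facts are cheap to check numerically. A quick sketch (the matrices are arbitrary choices of mine) of properties 4 and 5 and of the symmetric-inverse claim:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [0.0, 3.0]])
B = np.array([[2.0, 1.0],
              [1.0, 4.0]])    # symmetric, det = 7, so invertible

print(np.allclose((A @ B).T, B.T @ A.T))                    # (AB)^T = B^T A^T
print(np.allclose(np.linalg.inv(A).T, np.linalg.inv(A.T)))  # (A^-1)^T = (A^T)^-1
Binv = np.linalg.inv(B)
print(np.allclose(Binv, Binv.T))                            # symmetric inverse
```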
Let x ∈ Rn×1 be a column vector and A ∈ Rn×n be a symmetric matrix; then xᵀAx is a scalar q given by
q = xᵀAx = [x1 x2 ⋯ xn][a11 a12 ⋯ a1n; a21 a22 ⋯ a2n; ⋮; an1 an2 ⋯ ann][x1; x2; . . . ; xn]
= a11x1² + a12x1x2 + a21x2x1 + ⋯ + annxn²
= ∑_{i=1}^{n} ∑_{j=1}^{n} aij xi xj.  (30)
We call q = xᵀAx a quadratic form.
Example 12.
q = [x1 x2][1 -2; -2 2][x1; x2] = x1² - 4x1x2 + 2x2².
Notice that, for x ≠ 0, q can be positive (e.g., when x = [1; -1]) or negative (e.g., when x = [1; 1]).
A symmetric matrix A is positive definite if
x ≠ 0 ⟹ xᵀAx > 0,  (31)
and positive semidefinite if
x ≠ 0 ⟹ xᵀAx ≥ 0.  (32)
A very broad class of matrices is automatically positive (semi)definite.
Theorem 7. Let B be an m × n matrix and let A be given by
A = BᵀB.
Then A is positive semidefinite. If furthermore the n columns of B are linearly independent, then A is positive definite.
C.P. Kwong
-
8/3/2019 01 Linear Algebra
24/27
5 SOLUTION OF ALGEBRAIC EQUATIONS 24
Proof. Let x ∈ Rn×1 be nonzero and let y = Bx. Thus y is an m × 1 vector. We have shown before that BᵀB is symmetric. Furthermore,
xᵀAx = xᵀBᵀBx = yᵀy = ∑_{i=1}^{m} yi² ≥ 0.
Hence A is positive semidefinite. Since y is the linear combination of the n columns of B with x1, x2, . . . , xn as scalar multipliers, if the n columns of B are linearly independent, then y = Bx = 0 if and only if x = 0. Now x ≠ 0 implies y ≠ 0. Therefore xᵀAx = yᵀy > 0, and A is positive definite.
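Theorem 7 can be seen at work on a small example. In the sketch below (B is an arbitrary 3 × 2 matrix of mine with independent columns), A = BᵀB turns out positive definite, and xᵀAx agrees with ||Bx||² as in the proof:

```python
import numpy as np

B = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
A = B.T @ B                     # A = [[2, 1], [1, 2]], symmetric

for x in ([1.0, 0.0], [1.0, -1.0], [-2.0, 3.0]):
    x = np.asarray(x)
    q = x @ A @ x
    print(q > 0, np.isclose(q, (B @ x) @ (B @ x)))  # True True on each line
```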
We learned that g ≠ 0 is an eigenvector of A associated with the eigenvalue λ if Ag = λg.
Lemma 1. Let λ be an eigenvalue of A and g its associated eigenvector. Then
λ = gᵀAg / gᵀg.  (33)
Proof. We have Ag = λg, g ≠ 0, and gᵀAg = λgᵀg. This gives
λ = gᵀAg / gᵀg.
Theorem 8. The eigenvalues of a positive definite matrix are positive and the eigenvalues of a positive semidefinite matrix are nonnegative.
Proof. Since λ = gᵀAg / gᵀg and gᵀg > 0, if gᵀAg > 0 then λ > 0. Similarly, λ ≥ 0 if gᵀAg ≥ 0. Note that gᵀg > 0 holds because we can prove that the eigenvectors of a symmetric matrix are all real.
5 Solution of Algebraic Equations
The solution of the system of algebraic equations
a11x1 + a12x2 + ⋯ + a1nxn = y1,
a21x1 + a22x2 + ⋯ + a2nxn = y2,
⋮
am1x1 + am2x2 + ⋯ + amnxn = ym  (34)
is identical to the solution of the matrix equation
Ax = y,  (35)
where
A = [a11 a12 ⋯ a1n; a21 a22 ⋯ a2n; ⋮; am1 am2 ⋯ amn]
is a given m × n matrix, y = [y1 y2 ⋯ ym]ᵀ is a given vector, and x = [x1 x2 ⋯ xn]ᵀ is an unknown to be found.
The solution of (35) is closely related to the "rank" of the matrix A, which is defined as follows.
Definition 12. For an m × n matrix A, its m rows generate a subspace, and the dimension of this subspace is called the row rank of A. Similarly, the dimension of the subspace generated by the columns of A is called the column rank of A.
It can be shown that the row rank and the column rank of a matrix are identical. Thus we can simply say A has rank r, or write rank(A) = r. Clearly r ≤ min(m, n). When r = min(m, n), A is said to have full rank.
Example 13. For the matrix
A = [1 1 1 1; 1 2 3 4; 2 4 6 8],
rank(A) = 2. This is because the number of independent rows is two (the third row is twice the second).
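Example 13 can be confirmed in one line with NumPy's rank routine:

```python
import numpy as np

# The third row is twice the second, so only two rows are independent.
A = np.array([[1.0, 1.0, 1.0, 1.0],
              [1.0, 2.0, 3.0, 4.0],
              [2.0, 4.0, 6.0, 8.0]])
print(np.linalg.matrix_rank(A))  # 2
```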
Returning to the solution of (35): if m = n and A is of full rank, det A ≠ 0. It follows that A⁻¹ exists and A⁻¹A = AA⁻¹ = I, where I is the identity matrix. In this case, x is simply given by
x = A⁻¹y,  (36)
since multiplying both sides of (35) by A⁻¹ gives
A⁻¹Ax = A⁻¹y,  (37)
of which the left-hand side is nothing but Ix = x.
When A is not of full rank, or m ≠ n, the solution of (35) is not so simple. First, consider its existence.
Theorem 9. The matrix equation (35) has a solution if and only if
rank([A|y]) = rank(A),
where [A|y] is the augmented matrix formed by A and y.
Proof. Suppose there exists an x such that Ax = y. Then y is a linear combination of the columns of A and hence lies in Im A (the image of A). It follows that Im [A|y] = Im A. Since the dimension of Im A (which is a vector space) equals the rank of A, we have rank([A|y]) = rank(A).
Conversely, we note that Im A ⊆ Im [A|y]. Hence if rank([A|y]) = rank(A), then Im A and Im [A|y] have the same dimension and are therefore equal, i.e., they are the same vector space. Hence y ∈ Im A, i.e., y is a linear combination of the columns of A. That means y = Ax for some x ∈ Rn.
Corollary 2. If rank(A) = m, then (35) always has a solution.
Proof. Since the columns of A and [A|y] are m-dimensional vectors, rank(A) ≤ rank([A|y]) ≤ m. If rank(A) = m, then m = rank(A) ≤ rank([A|y]) ≤ m implies rank(A) = rank([A|y]). Hence (35) has a solution.
The following theorem answers the question of uniqueness of a solution.
Theorem 10. Let x0 be a solution of (35). Then the set of all solutions is
x0 + N(A),
where N(A) is the null space of A.
[Footnote to Theorem 9: e.g., if A = [1 2; 3 4] and y = [5; 6], then [A|y] = [1 2 5; 3 4 6].]
Proof. If u is a solution of (35), then Au = Ax0, or A(u - x0) = 0, where 0 is the zero vector. Hence (u - x0) ∈ N(A) and u = x0 + (u - x0) ∈ x0 + N(A). Conversely, if u ∈ x0 + N(A), then u = x0 + z for some z satisfying Az = 0. Hence Au = Ax0 + Az = Ax0 = y.
Thus, given a solution of (35), adding to this solution any vector in the null space of A results in another solution. This proves the following result.
Corollary 3. A solution of (35) is unique if and only if N(A) contains only the zero vector.
The homogeneous equation Ax = 0 always has the trivial solution x = 0. By the uniqueness Theorem 10, the set of all solutions is the subspace 0 + N(A) = N(A). Thus Ax = 0 has a nontrivial solution if and only if N(A) contains a nonzero vector, which is the case if n > m (prove it!). It follows that Ax = 0 always has a nontrivial solution if n > m.
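Theorem 10 and Corollary 3 can be illustrated on an underdetermined system (m = 2 < n = 3; the matrices below are my own example): x0 is one particular solution, z spans N(A), and x0 + cz solves Ax = y for every scalar c.

```python
import numpy as np

A = np.array([[1.0, 2.0, 1.0],
              [0.0, 1.0, 1.0]])
y = np.array([4.0, 2.0])

x0, *_ = np.linalg.lstsq(A, y, rcond=None)  # one particular solution
U, s, Vt = np.linalg.svd(A)
z = Vt[-1]                                  # right singular vector with sigma = 0

print(np.allclose(A @ x0, y))               # x0 solves the system
print(np.allclose(A @ z, 0))                # z is a nontrivial null-space vector
print(np.allclose(A @ (x0 + 5.0 * z), y))   # so x0 + 5z is another solution
```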