
• 8/3/2019 01 Linear Algebra

    Linear Algebra

    1 Vector Spaces

    The multiplication of a vector by a constant and the addition of two

    vectors are familiar ideas; their abstraction and generalization lead to

    the concept of vector spaces.

[Figure: scaling a vector (x and 2x) and adding two vectors (x, y, and x + y).]

Definition 1. A vector space is a set V of elements called vectors satisfying the following axioms.

1. For every x, y, z ∈ V, there is an operation called vector addition, such that

(a) x + y = y + x (commutative);

(b) x + (y + z) = (x + y) + z (associative);

(c) there exists in V a unique vector 0 (called the zero vector) such that x + 0 = x for every x ∈ V;

(d) for every x ∈ V there exists a unique vector −x such that x + (−x) = 0.

2. For every x, y ∈ V and every α, β ∈ F, where F is a field, there is an operation called scalar multiplication, such that

(a) α(βx) = (αβ)x (associative);

(b) 1x = x for every x ∈ V, where 1 is the unit element of F under multiplication;

    C.P. Kwong


(c) α(x + y) = αx + αy (distributive with respect to vector addition);

(d) (α + β)x = αx + βx (distributive with respect to scalar addition).

Remark. Without giving a formal definition of a field, we simply note that the set of all real numbers, denoted R, is a field equipped with the usual arithmetic of addition/subtraction and multiplication/division. The unit element of this field is exactly the number "1". The set of all complex numbers is another example of a field. Sometimes we say "V is a vector space over F" to emphasize the relationship between V and its underlying field F.

Example 1. (The n-tuple space, Fⁿ.) Let F be any field and let V be the set of all n-tuples x = (u₁, u₂, …, uₙ) of scalars uᵢ ∈ F. If y = (w₁, w₂, …, wₙ) with wᵢ ∈ F, define the addition of x and y as

x + y = (u₁ + w₁, u₂ + w₂, …, uₙ + wₙ)

and the multiplication of x by a scalar α ∈ F as

αx = (αu₁, αu₂, …, αuₙ).

It can be proved that the defined operations satisfy the axioms of a vector space and hence V is a vector space over F.
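As a sketch, the two operations of Example 1 can be written out for F = R and n = 3; the helper names `add` and `scale` are ours, not from the text:

```python
# Sketch of the n-tuple space R^3 from Example 1: componentwise
# addition and scalar multiplication (helper names are illustrative).

def add(x, y):
    """Vector addition: (u1 + w1, ..., un + wn)."""
    return tuple(u + w for u, w in zip(x, y))

def scale(a, x):
    """Scalar multiplication: (a*u1, ..., a*un)."""
    return tuple(a * u for u in x)

x = (1.0, 2.0, 3.0)
y = (4.0, 5.0, 6.0)

# A few of the vector-space axioms, checked on these sample vectors:
assert add(x, y) == add(y, x)                       # commutativity
assert add(x, (0.0, 0.0, 0.0)) == x                 # zero vector
assert scale(2.0, add(x, y)) == add(scale(2.0, x), scale(2.0, y))  # distributivity
```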

Example 2. (The space of m × n matrices, Fᵐˣⁿ.) Let F be any field and m, n be positive integers. The set of all m × n matrices with elements in F is a vector space under the usual matrix addition and multiplication of a matrix by a scalar.

Example 3. (The space of continuous functions, C[a,b].) Let V be the set of all real-valued, continuous functions of t, t ∈ [a, b].


[Figure: two continuous functions x(t) and y(t) on the interval [a, b].]

Define, for x, y ∈ V and α ∈ R, the following operations:

(x + y)(t) = x(t) + y(t),

(αx)(t) = αx(t),

where addition of two continuous functions and multiplication of a continuous function by a real scalar are defined in the usual point-wise manner. Then V is a vector space over R. Note that if x and y are continuous real-valued functions over [a, b] and α is real, so are x + y and αx.

Definition 2. A vector x ∈ V is said to be a linear combination of the vectors y₁, y₂, …, yₙ ∈ V provided that there exist scalars α₁, α₂, …, αₙ ∈ F such that

x = α₁y₁ + α₂y₂ + ⋯ + αₙyₙ = ∑ᵢ₌₁ⁿ αᵢyᵢ. (1)

Definition 3. Let V be a vector space over F. The distinct vectors x₁, x₂, …, xₙ ∈ V are said to be linearly dependent if there exist scalars α₁, α₂, …, αₙ ∈ F, not all of which are zero, such that

∑ᵢ₌₁ⁿ αᵢxᵢ = 0. (2)

Vectors that are not linearly dependent are linearly independent.

Remark. If x₁, x₂, …, xₙ ∈ V are linearly dependent, then at least one xᵢ in the set {x₁, x₂, …, xₙ} (any one whose coefficient in (2) is nonzero) can be expressed as a linear combination of the remaining vectors in the set.
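In Rⁿ, linear dependence can be tested numerically by comparing the rank of the matrix whose rows are the given vectors against the number of vectors; a minimal sketch with NumPy (the helper name is ours):

```python
import numpy as np

def linearly_independent(vectors):
    """Vectors in R^n are linearly independent iff the matrix having
    them as rows has rank equal to the number of vectors."""
    M = np.array(vectors, dtype=float)
    return np.linalg.matrix_rank(M) == len(vectors)

assert linearly_independent([(1, 0), (0, 1)])        # a basis of R^2
assert not linearly_independent([(1, 2), (2, 4)])    # second = 2 * first
```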


Given the following two vectors e₁, e₂ on a plane:

[Figure: two non-collinear vectors e₁ and e₂.]

It seems that any vector x on this plane can be written as x = α₁e₁ + α₂e₂ for some α₁, α₂ ∈ R. For example, in the following diagram, x = 0.7e₁ − 0.5e₂.

[Figure: x decomposed as 0.7e₁ − 0.5e₂.]

However, the following e₁, e₂ cannot perform the same function:

[Figure: two collinear vectors e₁ and e₂.]

Definition 4. A basis in a vector space V is a set of linearly independent vectors such that every vector in V is a linear combination of this set of vectors. The number of vectors constituting a basis is called the dimension of V, denoted dim V. If dim V is finite, V is a finite-dimensional vector space.

Definition 5. A set of vectors {eᵢ} is said to span a vector space V if every vector in V can be written as a linear combination of {eᵢ}. (Note that {eᵢ} may not be linearly independent and hence may not be a basis.)


Given a basis {e₁, e₂, …, eₙ} for an n-dimensional vector space V, a vector x ∈ V can be written as

x = α₁e₁ + α₂e₂ + ⋯ + αₙeₙ, αᵢ ∈ F. (3)

[α₁ α₂ ⋯ αₙ] is called the coordinate vector of x in the basis formed by {e₁, e₂, …, eₙ}.

A vector is "free", "floating", if no basis is specified:

[Figure: a lone vector x.]

The vector is "fixed" whenever a basis is given:

[Figure: the vector x together with basis vectors e₁ and e₂.]

It is obvious that there exist many bases for a given vector space, and different coordinate vectors result from different bases for the same x. What are the relationships between these coordinate vectors? The answer is: two coordinate vectors are related by a coordinate transformation effected by an n × n matrix with elements in F.

Let {e₁, e₂, …, eₙ} be a basis for V and {ē₁, ē₂, …, ēₙ} be another basis for the same V. Thus a vector x ∈ V can be written either as

x = α₁e₁ + α₂e₂ + ⋯ + αₙeₙ, αᵢ ∈ F, (4)

or

x = ᾱ₁ē₁ + ᾱ₂ē₂ + ⋯ + ᾱₙēₙ, ᾱᵢ ∈ F. (5)


Since ē₁, ē₂, …, ēₙ are vectors in V, we can write

ē₁ = a₁₁e₁ + a₂₁e₂ + ⋯ + aₙ₁eₙ,
⋮
ēₙ = a₁ₙe₁ + a₂ₙe₂ + ⋯ + aₙₙeₙ,   aᵢⱼ ∈ F, (6)

i.e.,

[ē₁ ⋯ ēₙ] = [e₁ ⋯ eₙ] [ a₁₁ a₁₂ ⋯ a₁ₙ ]
                      [ a₂₁ a₂₂ ⋯ a₂ₙ ]
                      [  ⋮             ]
                      [ aₙ₁ aₙ₂ ⋯ aₙₙ ]  = [e₁ ⋯ eₙ]A, (7)

where A is an n × n matrix of scalars in F. However, since

x = [e₁ ⋯ eₙ] (α₁, …, αₙ)ᵀ = [ē₁ ⋯ ēₙ] (ᾱ₁, …, ᾱₙ)ᵀ, (8)

therefore

[e₁ ⋯ eₙ] A (ᾱ₁, …, ᾱₙ)ᵀ = [e₁ ⋯ eₙ] (α₁, …, αₙ)ᵀ. (9)

This last equation holds for any set of basis vectors {e₁, e₂, …, eₙ} and hence

(α₁, …, αₙ)ᵀ = A (ᾱ₁, …, ᾱₙ)ᵀ. (10)

Thus the matrix A acts as a coordinate transformation that relates the coordinates in the two bases for the same vector.
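A small numerical sketch of this transformation, with an example pair of bases of R² chosen by us: the columns of A hold the old-basis coordinates of the new basis vectors, and A maps new-basis coordinates to old-basis coordinates.

```python
import numpy as np

# Change of basis in R^2: the columns of A hold the coordinates of the
# new basis vectors expressed in the old (here: standard) basis.
e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])    # old basis
f1, f2 = np.array([1.0, 1.0]), np.array([1.0, -1.0])   # new basis (our example)

A = np.column_stack([f1, f2])      # column j = coordinates of the j-th new vector

alpha_new = np.array([2.0, 3.0])               # coordinates in the new basis
x = alpha_new[0] * f1 + alpha_new[1] * f2      # the vector itself
alpha_old = A @ alpha_new                      # old-basis coordinates

# In the standard basis the coordinates coincide with the vector itself:
assert np.allclose(alpha_old, x)
```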

Definition 6. A subset W of a vector space V is a subspace of V if for every pair x and y of vectors contained in W, every linear combination αx + βy, α, β ∈ F, is also contained in W.


Example 4. In the following figure, W₁ is a 1-dimensional subspace and W₂ is a 2-dimensional subspace, of R³.

[Figure: a line W₁ and a plane W₂ through the origin in R³.]

It can be shown, using its definition, that a subspace is itself a vector space. Moreover, any subspace must contain the zero vector since for any x ∈ W, x − x = 0 is by definition in W.

Theorem 1. Let V be a vector space over the field F. The intersection of any collection of subspaces of V is a subspace of V.

Proof. Let {Wᵢ} be a collection of subspaces of V and W = ∩ᵢ Wᵢ their intersection. Since each Wᵢ is a subspace containing the zero vector 0, 0 ∈ W. Let x, y ∈ W and α, β ∈ F. By the definition of W, x, y ∈ Wᵢ for all i. Because each Wᵢ is a subspace, (αx + βy) ∈ Wᵢ for all i. Therefore αx + βy is again in W and W is a subspace by definition.

Definition 7. Let U and W be subspaces of a vector space V over F. The sum of U and W, denoted by U + W, is the subset of V consisting of all sums u + w with u ∈ U and w ∈ W.

    Remark. It is easy to show that U + W is a subspace of V.

Definition 8. A vector space V is said to be the direct sum of two subspaces U and W, written as

V = U ⊕ W, (11)


if each v ∈ V has a unique representation

v = u + w, u ∈ U, w ∈ W. (12)

Moreover, W is called the algebraic complement of U in V and vice versa.

Example 5. In the following figure V = R² and U, W₁, and W₂ are subspaces of V.

[Figure: three distinct lines U, W₁, and W₂ through the origin of R², with W₁ perpendicular to U.]

We have

V = U ⊕ W₁ = U ⊕ W₂, (13)

and W₁ is the "orthogonal" complement of U.

Theorem 2. Let V be a vector space over F, and U and W be subspaces of V. If U + W = V and U ∩ W = {0}, then V is the direct sum of U and W.

Proof. Given any v ∈ V. Since U + W = V, there exist u ∈ U and w ∈ W such that v = u + w. Suppose there exist u′ ∈ U and w′ ∈ W such that v = u′ + w′. Then

u + w = u′ + w′.

It follows that

u − u′ = w′ − w.

But u − u′ ∈ U and w′ − w ∈ W, and therefore u − u′ = w′ − w ∈ U ∩ W. Since U ∩ W = {0}, u − u′ = 0 and w′ − w = 0, and hence u = u′ and w = w′. That means the representation v = u + w is unique. The theorem is proved.


Theorem 3. Let V be a finite-dimensional vector space over F. If V is the direct sum of U and W, then

dim V = dim U + dim W. (14)

Proof. Let {u₁, u₂, …, uᵣ} be a basis of U and {w₁, w₂, …, wₛ} be a basis of W. Then every u ∈ U has a unique representation

u = α₁u₁ + α₂u₂ + ⋯ + αᵣuᵣ, αᵢ ∈ F,

and every w ∈ W has a unique representation

w = β₁w₁ + β₂w₂ + ⋯ + βₛwₛ, βᵢ ∈ F.

Since V = U ⊕ W, every v ∈ V has a unique representation

v = α₁u₁ + α₂u₂ + ⋯ + αᵣuᵣ + β₁w₁ + β₂w₂ + ⋯ + βₛwₛ.

Therefore {u₁, u₂, …, uᵣ, w₁, w₂, …, wₛ} is a basis of V with dimension r + s.

    2 Mappings

Let X and Y be sets and A ⊆ X a subset (of X). A mapping T from A into Y associates with each x ∈ A a single y ∈ Y called the image of x under T. We write y = Tx. The set A is called the domain of definition of T, or simply the domain of T, denoted by D(T). Notationally, we write

T: D(T) → Y.

The range of T, denoted by R(T), is the set of all images:

R(T) = {y ∈ Y | y = Tx for some x ∈ D(T)}.

Note that D(T) is not necessarily the whole X and R(T) is not necessarily the whole Y.

If R(T) is the whole Y, then T is said to be onto, or surjective.


There may be more than one element in D(T) that is mapped to a single element in R(T):

[Figure: two points x₁ and x₂ both mapped by T to the same y = Tx₁ = Tx₂.]

If x₁ ≠ x₂ implies Tx₁ ≠ Tx₂ for every x₁, x₂ ∈ D(T), then T is called one-to-one or injective.

Given a y ∈ R(T), the inverse image of y is the set of all x ∈ D(T) such that Tx = y. For an injective mapping T: D(T) → Y, we can define a mapping T⁻¹: R(T) → D(T), called the inverse of T, such that y ∈ R(T) is mapped (by T⁻¹) to that x ∈ D(T) for which Tx = y. However, an inverse mapping cannot be defined if T is not injective. (Why?)

    3 Linear Operators

The following figure shows two functions y = f(x) = ax and y = g(x) = bx² where a and b are real constants:

[Figure: the line f(x) = ax and the parabola g(x) = bx² in the (x, y)-plane.]


Let x = 1. Then f(x) = f(1) = a and g(x) = g(1) = b. Next, let x = 3. Then f(x) = f(3) = 3a and g(x) = g(3) = 9b. Suppose now

x = 1 + 3 = 4;

we have

f(x) = f(4) = 4a = f(1) + f(3).

However,

g(x) = g(4) = 16b ≠ g(1) + g(3).

The function f(x) is "linear" in this sense. "Linear operators" are mappings between vector spaces, which possess a similar linear property.

Definition 9. Let T be a mapping with domain D(T) and range R(T). T is a linear operator if D(T) is a vector space over F and R(T) is a subset of a vector space also over F. Moreover, for any x, y ∈ D(T) and any α ∈ F,

T(x + y) = Tx + Ty (15)

and

T(αx) = αTx. (16)

Example 6. Let A be an m × n matrix with elements aᵢⱼ ∈ F where F is a field. The mapping defined by Tx = Ax, x ∈ Fⁿ, where Ax is the usual matrix multiplication, is a linear operator from Fⁿ into Fᵐ.

Example 7. Let x(t) ∈ C[a,b] be a continuous function from [a, b] into R. Define a mapping T: C[a,b] → Y as follows:

y(t) = Tx = ∫ₐᵗ x(τ) dτ, t ∈ [a, b].

From the theory of integration, y(t) is also a continuous function over t ∈ [a, b]. Moreover, for any x, y ∈ C[a,b] and α, β ∈ F,

∫ₐᵗ [αx(τ) + βy(τ)] dτ = α ∫ₐᵗ x(τ) dτ + β ∫ₐᵗ y(τ) dτ, t ∈ [a, b].

Therefore T is a linear operator from C[a,b] into itself, i.e., T: C[a,b] → C[a,b].
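The linearity of the integration operator in Example 7 can be illustrated numerically; here we discretize [0, 1] and approximate the running integral with the trapezoid rule (the discretization is our own device, not part of the text):

```python
import numpy as np

# Numerical sketch of Example 7: T maps x(t) to its running integral.
# We sample [0, 1] on a grid and use the trapezoid rule.
t = np.linspace(0.0, 1.0, 1001)

def T(x):
    """Running integral of the sampled function x at the grid points t."""
    return np.concatenate([[0.0], np.cumsum((x[1:] + x[:-1]) / 2 * np.diff(t))])

x1, x2 = np.sin(t), t**2
a, b = 2.0, -3.0

# Linearity: T(a*x1 + b*x2) equals a*T(x1) + b*T(x2) up to rounding.
assert np.allclose(T(a * x1 + b * x2), a * T(x1) + b * T(x2))
```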


Definition 10. The null space of a linear operator T, denoted by N(T), is the set of all x ∈ D(T) such that Tx = 0, where 0 is the zero vector of R(T).

    Theorem 4. Let T be a linear operator. Then R(T ) and N(T ) are vector

    spaces.

Proof. Let y₁, y₂ be any two vectors in R(T) and α, β be any two scalars in F. Then there must exist two vectors x₁ and x₂ in D(T) such that Tx₁ = y₁ and Tx₂ = y₂. Since T is linear,

T(αx₁ + βx₂) = αTx₁ + βTx₂ = αy₁ + βy₂.

Therefore αy₁ + βy₂ is in R(T). This shows that R(T) is a vector space.

For any x₁, x₂ ∈ N(T), Tx₁ = Tx₂ = 0 by definition. The linear combination of x₁ and x₂, being αx₁ + βx₂, is also in N(T) since T(αx₁ + βx₂) = αTx₁ + βTx₂ = 0. This shows that N(T) is a vector space.

Example 8. Given

[ 2 −2 ] [ x ]   [ 0 ]
[ 1 −1 ] [ y ] = [ 0 ],

which leads to the simultaneous equations

2x − 2y = 0;
x − y = 0.

The solution gives the null space represented by the line x = y:

[Figure: the line x = y in the (x, y)-plane; T maps every point of it to the origin.]
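The null space in Example 8 can also be recovered numerically; one standard route is the SVD, whose right-singular vectors for (near-)zero singular values span N(T):

```python
import numpy as np

# Null space of the matrix in Example 8, computed from the SVD.
A = np.array([[2.0, -2.0],
              [1.0, -1.0]])

_, s, Vt = np.linalg.svd(A)
rank = int(np.sum(s > 1e-12))
null_basis = Vt[rank:]             # rows spanning N(T)

v = null_basis[0]
assert np.allclose(A @ v, 0.0)     # v is mapped to the zero vector
assert np.isclose(v[0], v[1])      # the null space is the line x = y
```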


Theorem 5. Let T be a linear operator. If D(T) is finite-dimensional, then

dim D(T) = dim R(T) + dim N(T). (17)

Proof. Suppose N(T) is k-dimensional. Then there are k vectors v₁, v₂, …, vₖ that form a basis for N(T). Suppose dim D(T) = n. Then there are linearly independent vectors vₖ₊₁, vₖ₊₂, …, vₙ in D(T) such that {v₁, v₂, …, vₖ, vₖ₊₁, vₖ₊₂, …, vₙ} is a basis for D(T). We shall prove that {Tvₖ₊₁, Tvₖ₊₂, …, Tvₙ} is a basis for R(T).

The vectors Tv₁, Tv₂, …, Tvₙ certainly span R(T), i.e., any vector in R(T) is a linear combination of Tvᵢ, i = 1, 2, …, n. However, since vᵢ, 1 ≤ i ≤ k, are in N(T), Tvᵢ = 0 for 1 ≤ i ≤ k. It follows that R(T) is indeed spanned by the smaller set of vectors Tvₖ₊₁, Tvₖ₊₂, …, Tvₙ. It remains to show that they are linearly independent.

Suppose there are scalars αᵢ ∈ F, not all zero, such that

∑ᵢ₌ₖ₊₁ⁿ αᵢ(Tvᵢ) = 0,

i.e., Tvᵢ, i = k + 1, k + 2, …, n, are linearly dependent. Since T is linear, we have

T(∑ᵢ₌ₖ₊₁ⁿ αᵢvᵢ) = 0

and hence the vector x = ∑ᵢ₌ₖ₊₁ⁿ αᵢvᵢ is in N(T). Since {v₁, v₂, …, vₖ} is a basis for N(T), there must exist scalars β₁, β₂, …, βₖ such that

x = ∑ᵢ₌₁ᵏ βᵢvᵢ.

It follows that

∑ᵢ₌₁ᵏ βᵢvᵢ − ∑ᵢ₌ₖ₊₁ⁿ αᵢvᵢ = 0.

Since v₁, v₂, …, vₙ are linearly independent, we must have

β₁ = β₂ = ⋯ = βₖ = αₖ₊₁ = αₖ₊₂ = ⋯ = αₙ = 0.

This contradiction shows that Tvₖ₊₁, Tvₖ₊₂, …, Tvₙ are linearly independent and hence form a basis for R(T). Since dim N(T) = k and dim D(T) = n, the theorem follows.
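Relation (17) is easy to verify numerically for an operator given by a matrix: the domain dimension is the number of columns, dim R(T) is the rank, and an orthonormal basis of N(T) falls out of the SVD. A sketch with a sample matrix of our choosing:

```python
import numpy as np

# Numerical check of (17) for the operator T x = A x given by a sample
# 2 x 3 matrix: dim D(T) is the number of columns, dim R(T) is the rank,
# and the trailing rows of Vt form an orthonormal basis of N(T).
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])

_, s, Vt = np.linalg.svd(A)            # full SVD: Vt is 3 x 3
rank = int(np.sum(s > 1e-12))          # dim R(T)
null_basis = Vt[rank:]                 # rows form a basis of N(T)

assert np.allclose(A @ null_basis.T, 0.0)          # each basis vector maps to 0
assert A.shape[1] == rank + null_basis.shape[0]    # (17): dim D = dim R + dim N
```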


The inverse of an operator is an important concept in both theory and application. Suppose we are given y and T; we ask: what is x such that Tx = y?

[Figure: T maps x to y, and T⁻¹ maps y back to x.]

Definition 11. The inverse of a linear operator T: D(T) → R(T), if it exists, is the mapping T⁻¹: R(T) → D(T) such that for every y ∈ R(T), T⁻¹y = x, where x ∈ D(T) and Tx = y.

The following is an existence theorem for the inverse.

Theorem 6. Given a linear operator T. T⁻¹: R(T) → D(T) exists if and only if

Tx = 0 implies x = 0. (18)

Moreover, T⁻¹ is linear.

Proof. It is easy to see that T⁻¹ exists if T is one-to-one, i.e., for any x₁, x₂ ∈ D(T),

Tx₁ = Tx₂ ⇒ x₁ = x₂.

We first prove the "if" part, i.e.,

(Tx = 0 ⇒ x = 0) ⇒ T⁻¹ exists.

Suppose there are x₁ and x₂ such that Tx₁ = Tx₂. Since T is linear,

T(x₁ − x₂) = Tx₁ − Tx₂ = 0.

But (x₁ − x₂) ∈ D(T) and T(x₁ − x₂) = 0 implies x₁ − x₂ = 0 (i.e., the assumption), or x₁ = x₂. Therefore Tx₁ = Tx₂ implies x₁ = x₂ and T⁻¹ exists.


Next, we prove the "only if" part, i.e.,

T⁻¹ exists ⇒ (Tx = 0 ⇒ x = 0).

Let x₁ be the zero vector in D(T). Since T is linear, T0 = T(x + (−x)) = Tx − Tx = 0 where x ∈ D(T). It follows that Tx₁ = 0. Suppose there is another vector x₂ ≠ 0 such that Tx₂ = 0. Then we can write

Tx₁ = Tx₂ = 0.

However, if T⁻¹ exists, then, for any x₁, x₂ ∈ D(T), Tx₁ = Tx₂ implies x₁ = x₂. Consequently, x₁ = x₂ = 0, i.e., the zero vector is the unique vector x such that Tx = 0.

Finally, let x₁, x₂ ∈ D(T) and write y₁ = Tx₁ and y₂ = Tx₂. Then y₁, y₂ ∈ R(T). If T⁻¹ exists, we have x₁ = T⁻¹y₁ and x₂ = T⁻¹y₂. Since T is linear, for any α, β ∈ F,

αy₁ + βy₂ = αTx₁ + βTx₂ = T(αx₁ + βx₂).

Thus

T⁻¹(αy₁ + βy₂) = T⁻¹T(αx₁ + βx₂) = αx₁ + βx₂,

or

T⁻¹(αy₁ + βy₂) = αT⁻¹y₁ + βT⁻¹y₂.

This proves that T⁻¹ is linear.

Corollary 1. If D(T) is finite-dimensional and T⁻¹ exists, dim R(T) = dim D(T).

Proof. We have proved that

dim D(T) = dim R(T) + dim N(T)

and that T⁻¹ exists implies

Tx = 0 ⇒ x = 0.

Therefore N(T) contains only the zero vector and hence dim N(T) = 0. The corollary follows.


We show in the following that there is a matrix associated with a linear operator, which depends on the choice of bases for D(T) and R(T).

Let T: D(T) → R(T) be a linear operator. Suppose {e₁, e₂, …, eₙ} is a basis for D(T) and {f₁, f₂, …, fₘ} is a basis of R(T). Then a vector x ∈ D(T) can be expressed as

x = ∑ᵢ₌₁ⁿ αᵢeᵢ, αᵢ ∈ F. (19)

Note that [α₁, α₂, …, αₙ] is the coordinate vector of x relative to the basis {e₁, e₂, …, eₙ}. Since T is linear, we have

Tx = ∑ᵢ₌₁ⁿ αᵢTeᵢ, αᵢ ∈ F. (20)

Let y = Tx. Clearly y ∈ R(T) and we can write

y = ∑ᵢ₌₁ᵐ βᵢfᵢ, βᵢ ∈ F, (21)

where [β₁, β₂, …, βₘ] is the coordinate vector of y relative to the basis {f₁, f₂, …, fₘ}. A matrix A arises naturally as representing the operation of T on the vectors e₁, e₂, …, eₙ:

Te₁ = a₁₁f₁ + a₂₁f₂ + ⋯ + aₘ₁fₘ,
⋮
Teₙ = a₁ₙf₁ + a₂ₙf₂ + ⋯ + aₘₙfₘ,   aᵢⱼ ∈ F, (22)

or

[Te₁ Te₂ ⋯ Teₙ] = [f₁ f₂ ⋯ fₘ] [ a₁₁ a₁₂ ⋯ a₁ₙ ]
                               [ a₂₁ a₂₂ ⋯ a₂ₙ ]
                               [  ⋮             ]
                               [ aₘ₁ aₘ₂ ⋯ aₘₙ ]  = [f₁ f₂ ⋯ fₘ] A. (23)

Since y = Tx, we have

[f₁ f₂ ⋯ fₘ] (β₁, β₂, …, βₘ)ᵀ = [Te₁ Te₂ ⋯ Teₙ] (α₁, α₂, …, αₙ)ᵀ. (24)


Therefore

[f₁ f₂ ⋯ fₘ] (β₁, β₂, …, βₘ)ᵀ = [f₁ f₂ ⋯ fₘ] A (α₁, α₂, …, αₙ)ᵀ, (25)

which gives

(β₁, β₂, …, βₘ)ᵀ = A (α₁, α₂, …, αₙ)ᵀ. (26)

We see that multiplying the coordinate vector of x ∈ D(T) by A gives the coordinate vector of y = Tx in R(T). A is called the matrix representation of T. Note however that A depends on the choice of the bases {eᵢ} and {fᵢ}.

Example 9. A linear operator T effects the following mapping of two vectors in R²:

[Figure: x₁ = (2, 1) is mapped to Tx₁ = (−1, −4), and x₂ = (−2, 1) is mapped to Tx₂ = (3, 6).]

We have, in the basis formed by (1, 0) and (0, 1),

x₁ = 2(1, 0) + 1(0, 1).


Similarly,

Tx₁ = −1(1, 0) − 4(0, 1)

in the same basis. We then have, for the mapping Tx₁,

[ a₁₁ a₁₂ ] [ 2 ]   [ −1 ]
[ a₂₁ a₂₂ ] [ 1 ] = [ −4 ],

and for the mapping Tx₂,

[ a₁₁ a₁₂ ] [ −2 ]   [ 3 ]
[ a₂₁ a₂₂ ] [  1 ] = [ 6 ].

Hence

2a₁₁ + a₁₂ = −1,
2a₂₁ + a₂₂ = −4,
−2a₁₁ + a₁₂ = 3,
−2a₂₁ + a₂₂ = 6.

Solving gives

A = [ a₁₁ a₁₂ ]   [ −1   1 ]
    [ a₂₁ a₂₂ ] = [ −2.5 1 ].

Suppose a new basis is chosen and the matrix which relates the coordinate vectors in the old and the new bases is

P = [  2 −1 ]
    [ −1  1 ].

Thus the new coordinate vectors for (2, 1)ᵀ and (−1, −4)ᵀ are

P (2, 1)ᵀ = (3, −1)ᵀ and P (−1, −4)ᵀ = (2, −3)ᵀ.

We ask what is the matrix B such that

B (3, −1)ᵀ = (2, −3)ᵀ,


i.e., what is the matrix representation of the linear operator T in the new basis?

Since P (2, 1)ᵀ = (3, −1)ᵀ, and P must be nonsingular (why?) and hence P⁻¹ exists, we have

P⁻¹P (2, 1)ᵀ = P⁻¹ (3, −1)ᵀ ⇒ (2, 1)ᵀ = P⁻¹ (3, −1)ᵀ.

It follows that

A (2, 1)ᵀ = A P⁻¹ (3, −1)ᵀ. (27)

Similarly,

P⁻¹ (2, −3)ᵀ = (−1, −4)ᵀ. (28)

But

A (2, 1)ᵀ = (−1, −4)ᵀ

and then (27) can be rewritten as

A P⁻¹ (3, −1)ᵀ = (−1, −4)ᵀ. (29)

Substituting (28) into (29) gives

A P⁻¹ (3, −1)ᵀ = P⁻¹ (2, −3)ᵀ,

or

P A P⁻¹ (3, −1)ᵀ = (2, −3)ᵀ.

Therefore, we obtain

B = P A P⁻¹ = [  2 −1 ] [ −1   1 ] [ 1 1 ]   [  1.5  2.5 ]
              [ −1  1 ] [ −2.5 1 ] [ 1 2 ] = [ −1.5 −1.5 ].


It is easy to check that

[  1.5  2.5 ] [  3 ]   [  2 ]
[ −1.5 −1.5 ] [ −1 ] = [ −3 ].
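The whole computation of Example 9 can be replayed with NumPy, using the matrices A and P of that example (signs as we read them; the extraction dropped minus signs):

```python
import numpy as np

# Replaying Example 9: A is the matrix of T in the old basis, P relates
# old to new coordinates, and B = P A P^{-1} represents T in the new basis.
A = np.array([[-1.0, 1.0], [-2.5, 1.0]])
P = np.array([[2.0, -1.0], [-1.0, 1.0]])

B = P @ A @ np.linalg.inv(P)
assert np.allclose(B, [[1.5, 2.5], [-1.5, -1.5]])

# B must send the new coordinates of x1 to the new coordinates of T x1:
assert np.allclose(P @ [2.0, 1.0], [3.0, -1.0])
assert np.allclose(B @ [3.0, -1.0], [2.0, -3.0])
```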

Example 10. Let T: D(T) → R(T) be a linear operator where D(T) and R(T) are in the same vector space X over F. An eigenvalue of T is a scalar λ ∈ F such that there is a nonzero vector g ∈ D(T) with Tg = λg; g is called an eigenvector of T associated with λ.

Suppose dim D(T) = n is finite and suppose we can find n linearly independent eigenvectors of T, g₁, g₂, …, gₙ. Then {gᵢ} is a basis for D(T) and every x ∈ D(T) can be written as

x = ∑ᵢ₌₁ⁿ αᵢgᵢ, αᵢ ∈ F.

Then

Tx = ∑ᵢ₌₁ⁿ αᵢTgᵢ = ∑ᵢ₌₁ⁿ αᵢλᵢgᵢ = ∑ᵢ₌₁ⁿ βᵢgᵢ,

where βᵢ = λᵢαᵢ ∈ F. That means, relative to the same basis {gᵢ}, the coordinate vector of Tx, [β₁ β₂ ⋯ βₙ], is related to the coordinate vector of x, [α₁ α₂ ⋯ αₙ], by

(β₁, β₂, …, βₙ)ᵀ = A (α₁, α₂, …, αₙ)ᵀ,

where A has a simple diagonal form in this basis of eigenvectors:

A = [ λ₁      0 ]
    [    ⋱     ]
    [ 0      λₙ ].


    4 Symmetric Matrices and Quadratic Forms

The transpose of a matrix A, denoted by Aᵀ, is obtained by taking as its ith column the ith row of A.

Example 11.

1. A₁ = [  1 2 3 ]
        [ −1 0 4 ],   A₁ᵀ = [ 1 −1 ]
                            [ 2  0 ]
                            [ 3  4 ].

2. A₂ = [ 1 −1 ]
        [ 4  2 ],   A₂ᵀ = [  1 4 ]
                          [ −1 2 ].

Let A₁ and A₂ be given by the above two examples. We construct

B₁ = A₁A₁ᵀ = [ 14 11 ]
             [ 11 17 ],

B₂ = A₁ᵀA₁ = [  2 2 −1 ]
             [  2 4  6 ]
             [ −1 6 25 ],

B₃ = A₂A₂ᵀ = [ 2  2 ]
             [ 2 20 ],

B₄ = A₂ᵀA₂ = [ 17 7 ]
             [  7 5 ].

We observe two interesting properties of all the matrices B₁ to B₄. First, they are all square. Second,

B₁ᵀ = B₁, B₂ᵀ = B₂, B₃ᵀ = B₃, B₄ᵀ = B₄.

In other words, every one of these matrices is equal to its own transpose. We call such matrices symmetric.

    The following properties of a transpose are easy to prove:


1. (Aᵀ)ᵀ = A;

2. (kA)ᵀ = kAᵀ, where k is a constant;

3. (A + B)ᵀ = Aᵀ + Bᵀ;

4. (AB)ᵀ = BᵀAᵀ;

5. (A⁻¹)ᵀ = (Aᵀ)⁻¹.

We prove in the following the last property.

Since AA⁻¹ = I and A⁻¹A = I, where I is the identity matrix, we have

(AA⁻¹)ᵀ = (A⁻¹)ᵀAᵀ = Iᵀ = I

and

(A⁻¹A)ᵀ = Aᵀ(A⁻¹)ᵀ = Iᵀ = I.

That means (A⁻¹)ᵀ is the inverse of Aᵀ.

Now it is easy to see why AAᵀ and AᵀA are necessarily symmetric, because

(AAᵀ)ᵀ = (Aᵀ)ᵀAᵀ = AAᵀ,
(AᵀA)ᵀ = Aᵀ(Aᵀ)ᵀ = AᵀA.

A symmetric matrix is necessarily square, since the requirement Aᵀ = A implies that A and Aᵀ must have the same dimensions, and Aᵀ and A can have the same dimensions only if A is square. It follows that we can talk about A⁻¹ if A is symmetric. The inverse may not exist. For example,

A = [ 1 2 ]
    [ 2 4 ]

is symmetric but has no inverse (since det A = 0). However, if the inverse of a symmetric matrix exists, it must also be symmetric. This is because (A⁻¹)ᵀ = (Aᵀ)⁻¹, and if A is symmetric, Aᵀ = A, so (A⁻¹)ᵀ = A⁻¹.


Let x ∈ Rⁿˣ¹ be a column vector and A ∈ Rⁿˣⁿ be a symmetric matrix. Then xᵀAx is a scalar q given by

q = xᵀAx = [x₁ x₂ ⋯ xₙ] [ a₁₁ a₁₂ ⋯ a₁ₙ ] [ x₁ ]
                        [ a₂₁ a₂₂ ⋯ a₂ₙ ] [ x₂ ]
                        [  ⋮             ] [  ⋮ ]
                        [ aₙ₁ aₙ₂ ⋯ aₙₙ ] [ xₙ ]

  = a₁₁x₁² + a₁₂x₁x₂ + a₂₁x₂x₁ + ⋯ + aₙₙxₙ² = ∑ᵢ₌₁ⁿ ∑ⱼ₌₁ⁿ aᵢⱼxᵢxⱼ. (30)

We call q = xᵀAx a quadratic form.

Example 12.

q = [x₁ x₂] [  1 −2 ] [ x₁ ]
            [ −2  2 ] [ x₂ ]  = x₁² − 4x₁x₂ + 2x₂².

Notice that, for x ≠ 0, q can be positive (e.g., when x = (1, −1)ᵀ) or negative (e.g., when x = (1, 1)ᵀ).

A symmetric matrix A is positive definite if

x ≠ 0 ⇒ xᵀAx > 0, (31)

and positive semidefinite if

x ≠ 0 ⇒ xᵀAx ≥ 0. (32)

A very broad class of matrices are automatically positive semidefinite.

Theorem 7. Let B be an m × n matrix and let A be given by

A = BᵀB.

Then A is positive semidefinite. If furthermore the n columns of B are linearly independent, then A is positive definite.


Proof. Let x ∈ Rⁿˣ¹ be nonzero and y = Bx. Thus y is an m × 1 vector. We have shown before that BᵀB is symmetric. Furthermore,

xᵀAx = xᵀBᵀBx = yᵀy = ∑ᵢ₌₁ᵐ yᵢ² ≥ 0.

Hence A is positive semidefinite. Since y is the linear combination of the n columns of B with x₁, x₂, …, xₙ as scalar multipliers, if the n columns of B are linearly independent, then y = Bx = 0 if and only if x = 0. Now x ≠ 0 implies y ≠ 0. Therefore xᵀAx = yᵀy > 0, and A is positive definite.
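Theorem 7 can be illustrated numerically with two sample matrices, one whose columns are necessarily dependent (A₁ of Example 11, with 3 columns in R²) and one of our choosing with independent columns:

```python
import numpy as np

# Illustrating Theorem 7: A = B^T B is positive semidefinite; if the
# columns of B are linearly independent, A is positive definite.
B1 = np.array([[1.0, 2.0, 3.0],
               [-1.0, 0.0, 4.0]])      # 3 columns in R^2: necessarily dependent
B2 = np.array([[1.0, 0.0],
               [0.0, 1.0],
               [1.0, 1.0]])            # 2 independent columns in R^3

eig1 = np.linalg.eigvalsh(B1.T @ B1)   # eigenvalues of a symmetric matrix
eig2 = np.linalg.eigvalsh(B2.T @ B2)

assert np.all(eig1 >= -1e-12)          # semidefinite: eigenvalues >= 0
assert np.min(eig1) < 1e-12            # ...with a zero eigenvalue here
assert np.min(eig2) > 0                # definite: eigenvalues strictly positive
```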

We learned that g ≠ 0 is an eigenvector of A associated with the eigenvalue λ if Ag = λg.

Lemma 1. Let λ be an eigenvalue of A and g its associated eigenvector. Then

λ = gᵀAg / gᵀg. (33)

Proof. We have Ag = λg, g ≠ 0, and gᵀAg = λgᵀg. This gives

λ = gᵀAg / gᵀg.

    Theorem 8. The eigenvalues of a positive definite matrix are positive and

    the eigenvalues of a positive semidefinite matrix are nonnegative.

Proof. Since λ = gᵀAg / gᵀg and gᵀg > 0, if gᵀAg > 0 then λ > 0. Similarly, λ ≥ 0 if gᵀAg ≥ 0. Note that gᵀg > 0 is true because we can prove that the eigenvectors of a symmetric matrix are all real.

    5 Solution of Algebraic Equations

The solution of the system of algebraic equations

a₁₁x₁ + a₁₂x₂ + ⋯ + a₁ₙxₙ = y₁,
a₂₁x₁ + a₂₂x₂ + ⋯ + a₂ₙxₙ = y₂,
⋮
aₘ₁x₁ + aₘ₂x₂ + ⋯ + aₘₙxₙ = yₘ (34)


is identical to the solution of the matrix equation

Ax = y, (35)

where

A = [ a₁₁ a₁₂ ⋯ a₁ₙ ]
    [ a₂₁ a₂₂ ⋯ a₂ₙ ]
    [  ⋮             ]
    [ aₘ₁ aₘ₂ ⋯ aₘₙ ]

is a given m × n matrix, y = (y₁ y₂ ⋯ yₘ)ᵀ is a given vector, and x = (x₁ x₂ ⋯ xₙ)ᵀ is an unknown to be found.

    The solution of (35) is closely related to the "rank" of the matrix A,

    which is defined as follows.

Definition 12. For an m × n matrix A, its m rows generate a subspace, and the dimension of this subspace is called the row rank of A. Similarly, the dimension of the subspace generated by the columns of A is called the column rank of A.

It can be shown that the row rank and the column rank of a matrix are identical. Thus we can simply say A has the rank r, or write rank(A) = r. Clearly r ≤ min(m, n). When r = min(m, n), A is said to have full rank.

Example 13. For the matrix

A = [ 1 1 1 1 ]
    [ 1 2 3 4 ]
    [ 2 4 6 8 ],

rank(A) = 2. This is because the number of independent rows is two (the third row is twice the second).
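The rank of the matrix in Example 13 can be confirmed numerically, along with the equality of row rank and column rank:

```python
import numpy as np

# The matrix of Example 13: its third row is twice the second, so only
# two rows are independent and the rank is 2.
A = np.array([[1.0, 1.0, 1.0, 1.0],
              [1.0, 2.0, 3.0, 4.0],
              [2.0, 4.0, 6.0, 8.0]])

assert np.linalg.matrix_rank(A) == 2
# Row rank equals column rank: the transpose has the same rank.
assert np.linalg.matrix_rank(A.T) == 2
```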

Returning to the solution of (35): if m = n and A is of full rank, det A ≠ 0. It follows that A⁻¹ exists and A⁻¹A = AA⁻¹ = I, where I is the identity matrix. In this case, x is simply given by

x = A⁻¹y (36)


since multiplying both sides of (35) by A⁻¹ gives

A⁻¹Ax = A⁻¹y, (37)

of which the left-hand side is nothing but Ix = x.

When A is not of full rank, or m ≠ n, the solution of (35) is not so simple. First, consider its existence.

    Theorem 9. The matrix equation (35) has a solution if and only if

    rank([A|y]) = rank(A)

    where [A|y] is the augmented matrix formed by A and y.

Proof. Suppose there exists an x such that Ax = y. Then y is a linear combination of the columns of A and hence lies in Im A (the image of A). It follows that Im [A|y] = Im A. Since the dimension of Im A (which is a vector space) equals the rank of A, we have rank([A|y]) = rank(A).

Conversely, we note that Im A ⊆ Im [A|y]. Hence if rank([A|y]) = rank(A), Im A and Im [A|y] have the same dimension and are therefore equal, i.e., they are the same vector space. Hence y ∈ Im A, i.e., y is a linear combination of the columns of A. That means y = Ax for some x ∈ Rⁿ.

Corollary 2. If rank(A) = m, then (35) always has a solution.

Proof. Since the columns of A and [A|y] are m-dimensional vectors, rank(A) ≤ rank([A|y]) ≤ m. If rank(A) = m, then m = rank(A) ≤ rank([A|y]) ≤ m implies rank(A) = rank([A|y]). Hence (35) has a solution.

The following theorem answers the question of uniqueness of a solution.

Theorem 10. Let x₀ be a solution of (35). Then the set of all solutions is

x₀ + N(A),

where N(A) is the null space of A.

(As an illustration of the augmented matrix notation: if

A = [ 1 2 ]     and y = [ 5 ]     then [A|y] = [ 1 2 5 ]
    [ 3 4 ]             [ 6 ],                 [ 3 4 6 ].)


Proof. If u is a solution of (35), then Au = Ax₀, or A(u − x₀) = 0, where 0 is the zero vector. Hence (u − x₀) ∈ N(A) and u = x₀ + (u − x₀) ∈ x₀ + N(A). Conversely, if u ∈ x₀ + N(A), then u = x₀ + z for some z satisfying Az = 0. Hence Au = Ax₀ + Az = Ax₀ = y.

Thus, given a solution of (35), adding to this solution any vector in the null space of A results in another solution. This proves the following result.

    Corollary 3. A solution of (35) is unique if and only if N(A) contains

    only the zero vector.

The homogeneous equation Ax = 0 always has the trivial solution x = 0. From the uniqueness Theorem 10, the set of all solutions is the subspace 0 + N(A) = N(A). Thus Ax = 0 has a nontrivial solution if and only if N(A) contains a nonzero vector; the latter is true if n > m (prove it!). It follows that Ax = 0 always has a nontrivial solution if n > m.
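The solution set x₀ + N(A) of Theorem 10 can be exhibited numerically for an underdetermined system (our example, with n = 2 unknowns and m = 1 equation): a particular solution from least squares, a null-space vector from the SVD, and their sum still solves the system.

```python
import numpy as np

# The solution set x0 + N(A) for the underdetermined system [1 2] x = 3.
A = np.array([[1.0, 2.0]])
y = np.array([3.0])

x0, *_ = np.linalg.lstsq(A, y, rcond=None)   # one particular solution
_, s, Vt = np.linalg.svd(A)                  # full SVD: Vt is 2 x 2
rank = int(np.sum(s > 1e-12))
z = Vt[rank]                                  # a nonzero vector in N(A)

assert np.allclose(A @ x0, y)                # x0 solves the system
assert np.allclose(A @ z, 0.0)               # z is a nontrivial null vector
assert np.allclose(A @ (x0 + 5.0 * z), y)    # every x0 + N(A) element solves it
```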