discrete mathematics notes - 2012

6301 Discrete Mathematics for Computer Scientists, Spring 2012. Alexander V. Sobolev, Department of Mathematics, University College London, Gower Street, London, WC1E 6BT. E-mail address: [email protected]. Homepage: http://www.homepages.ucl.ac.uk/ucahaso


  • 6301 Discrete Mathematics for

    Computer Scientists, Spring 2012

    Alexander V. Sobolev1

    Department of Mathematics, University College London, Gower Street, London, WC1E 6BT

    E-mail address : [email protected]

    1http://www.homepages.ucl.ac.uk/ucahaso

  • 2000 Mathematics Subject Classification.

    .

    Abstract. This is a draft of a one-term course for 1st year Computer Science students.

  • Contents

    Chapter 1. Set Theory, Permutations, Groups
      1. Basic definitions and concepts
      2. Advanced operations on sets
      3. Functions
      4. Permutations and Groups
      5. Binary operations and Groups
      6. Equivalence relations and Lagrange's Theorem

    Chapter 2. Elementary Number Theory
      1. Basic definitions and concepts
      2. Congruences

    Chapter 3. Linear Algebra
      1. Basic definitions and concepts
      2. Matrices and linear maps
      3. Linear systems
      4. Inverting square matrices
      5. Determinants
      6. Eigenvalues and eigenvectors

  • CHAPTER 1

    Set Theory, Permutations, Groups

    1. Basic definitions and concepts

    1.1. Basic definitions. A set is a collection of elements. Every element is either in the set or not in the set. Sets are written with the elements separated by commas and enclosed in curly brackets: e.g., $\{3, 10, 5\}$ is a set with three elements, which are 3, 10 and 5. We use letters to denote sets: $A$, $S$ etc. If $x$ is an element of the set $S$, we write $x \in S$. If $x$ is not an element of the set we write $x \notin S$. For example, denoting the above set by $B$, we can claim that $3 \in B$, but $4 \notin B$.

    If $A, B$ are two sets and every element of $A$ is also an element of $B$, we say that $A$ is a subset of $B$, and write $A \subset B$. This is sometimes expressed as follows: $A$ is a subset of $B$ if for every $x \in A$ one also has $x \in B$.

    In the beginning we shall be concerned with the properties of sets irrespective of their nature, but we shall use a very simple tool to picture them. Very often it is convenient to consider sets as parts of a bigger universal set, which we denote $E$, so that $A \subset E$, $B \subset E$ etc. Then it is convenient to use the so-called Venn diagrams to visualise the sets.

    Two sets are equal, i.e. $A = B$, if every element of $A$ is also an element of $B$, and every element of $B$ is an element of $A$. In other words, $A = B$ if $A \subset B$ and $B \subset A$. Note that when we want to prove that $A = B$ we always check these two inclusions. For example, the sets $\{3, 10, 5\}$ and $\{3, 5, 10\}$ are equal, i.e. the same.

    If $A \subset B$ and $A \ne B$, then $A$ is said to be a proper subset of $B$. In this case we write $A \subsetneq B$.

    Example. (1) $\emptyset$ is the empty set, i.e. the set with no elements. For any set $A$ we have $\emptyset \subset A$.

    (2) The set of natural numbers: $\mathbb{N} = \{1, 2, 3, \dots\}$.

    (3) The set of integer numbers: $\mathbb{Z} = \{\dots, -1, 0, 1, 2, \dots\}$.

    (4) The set of integers between 5 and 7: $A = \{m \in \mathbb{Z} : 5 \le m \le 7\}$. Note the notation, describing the property of the elements!

    (5) The set of rational numbers: $\mathbb{Q} = \{\frac{m}{n} : m \in \mathbb{Z}, n \in \mathbb{N}\}$.

    (6) Real numbers: $\mathbb{R}$. This set includes such numbers as $\sqrt{2}$, $\sqrt{3}$ etc., which cannot be described as rational numbers. Real numbers are depicted as points on the straight line.

    (7) Set of coins in my pocket. Empty set?

    1.2. Elementary operations on sets.

    Union: for two sets $A$ and $B$ the union $A \cup B$ is a new set whose elements are either in $A$ or in $B$. In other words, $x \in A \cup B$ if either $x \in A$ or $x \in B$. Note $A \cup B = B \cup A$. Venn diagram, truth table.

    Intersection: given sets $A$ and $B$ the intersection $A \cap B$ is the set of elements which are both in $A$ and $B$. In other words, $x \in A \cap B$ if $x \in A$ and $x \in B$. Note $A \cap B = B \cap A$. Venn diagram, truth table.

    Difference: $A \setminus B$ is the set which consists of elements which are in $A$, but not in $B$, i.e. $x \in A \setminus B$ if $x \in A$ and $x \notin B$. Note that $A \setminus B \ne B \setminus A$ in general! Venn diagram, truth table.

    Symmetric difference: $A \triangle B = (A \setminus B) \cup (B \setminus A)$, i.e. it is the set of elements which are either in $A$ or in $B$, but not in both. Venn diagram, truth table. Note $A \triangle B = B \triangle A$.

    The complement: the set $A^c$, whose elements are in the universal set, but not in $A$, i.e. $A^c = E \setminus A$. Venn diagram, truth table.

    Theorem 1.1 (De Morgan's Laws). Let $A, B$ be two subsets of a universal set $E$. Then

    (1.1) $A^c \cup B^c = (A \cap B)^c$,
    (1.2) $A^c \cap B^c = (A \cup B)^c$.

    Proof. Let us prove (1.1) using the definitions. To prove that two sets coincide we need to check that $A^c \cup B^c \subset (A \cap B)^c$ and $(A \cap B)^c \subset A^c \cup B^c$.

    To prove the first inclusion assume that $x \in A^c \cup B^c$, i.e. $x \notin A$ or $x \notin B$. This means that $x \notin A \cap B$, i.e. $x \in (A \cap B)^c$.

    Assume now that $x \in (A \cap B)^c$, that is $x \notin A$ or $x \notin B$, which means that $x \in A^c$ or $x \in B^c$, and hence $x \in A^c \cup B^c$.

    Similarly one proves (1.2).

    Alternative method 1: Venn diagrams. Alternative method 2: truth tables.

    Theorem 1.2. For any three sets $A, B, C$ we have

    $(A \triangle B) \triangle C = A \triangle (B \triangle C)$,
    $A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$,
    $A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$.
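    The identities of Theorems 1.1 and 1.2 can be spot-checked on small finite sets. The sketch below uses Python's built-in set type; the universal set and the sets A, B, C are arbitrary choices made for illustration, and a check on one example is of course not a proof.

```python
# Spot-check De Morgan's laws and Theorem 1.2 on small finite sets.
# E, A, B, C are arbitrary illustrative choices, not taken from the notes.
E = set(range(10))          # universal set
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}
C = {0, 4, 6, 8}

def complement(S):
    """Complement of S inside the universal set E."""
    return E - S

# (1.1): the union of complements equals the complement of the intersection
law_1_1 = (complement(A) | complement(B)) == complement(A & B)
# (1.2): the intersection of complements equals the complement of the union
law_1_2 = (complement(A) & complement(B)) == complement(A | B)

# Theorem 1.2: associativity of the symmetric difference (^ in Python)
# and the two distributive laws
assoc_symdiff = ((A ^ B) ^ C) == (A ^ (B ^ C))
distrib_1 = (A & (B | C)) == ((A & B) | (A & C))
distrib_2 = (A | (B & C)) == ((A | B) & (A | C))

print(law_1_1, law_1_2, assoc_symdiff, distrib_1, distrib_2)
```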


    2. Advanced operations on sets

    Infinite unions and intersections. If we have many (possibly infinitely many) sets it is convenient to label them using an index set $I$. For example, the set of natural numbers can be an index set. Then we write $A_1, A_2, \dots$ for the infinite collection of sets. In general, $A_i$, $i \in I$, denotes a collection of sets labelled by $i$ from the chosen index set $I$.

    Example. $A_n = [0, 1 - 1/n]$, $n = 1, 2, \dots$, so that $I = \mathbb{N}$. $B_\alpha = [0, \alpha]$, $\alpha \in [0, 1]$, so that $I = [0, 1]$.

    Then we define:

    $\bigcup_{i \in I} A_i = \{x : x \in A_i \text{ for some } i \in I\} = \{x : \exists i \in I : x \in A_i\}$;

    $\bigcap_{i \in I} A_i = \{x : \forall i \in I : x \in A_i\}$.

    Theorem 1.3 (Generalised De Morgan's Laws).

    $\left(\bigcup_{i \in I} A_i\right)^c = \bigcap_{i \in I} A_i^c$,

    $\left(\bigcap_{i \in I} A_i\right)^c = \bigcup_{i \in I} A_i^c$.

    Proof. Venn diagrams or truth tables wouldn't help! Suppose that $x \in \left(\bigcup_{i \in I} A_i\right)^c$, that is $x \notin \bigcup_{i \in I} A_i$. This means that for all $i \in I$ we have $x \notin A_i$, i.e. $x \in A_i^c$. Therefore $x \in \bigcap_{i \in I} A_i^c$.

    Conversely, assume that $x \in \bigcap_{i \in I} A_i^c$. This means that for every $i \in I$ we have $x \in A_i^c$, i.e. for no $i \in I$ we have $x \in A_i$, and hence $x$ is not in $\bigcup_{i \in I} A_i$, as required.

    Similarly one proves the second statement.

    Ordered pairs and products. We have already seen that the order of elements does not matter for a set. However, it is useful to introduce the so-called ordered pairs, i.e. objects of the form $(x, y)$, where $x, y$ are elements of some sets $X$ and $Y$. In this object the order of elements is important.

    Example. Points on the plane are represented by their coordinates: $(x, y)$. The points $(1, 2)$ and $(2, 1)$ are different!

    Given two sets $X$ and $Y$ we define the product set or simply product $X \times Y$ as the set of ordered pairs $(x, y)$, where $x \in X$ and $y \in Y$.

    Example. The set of points on the plane is the product $\mathbb{R} \times \mathbb{R}$.

    The product $[0, 1] \times [0, 1]$ is the unit square.

    $\{0, 1\} \times \{a, b, c\} = \{(0, a), (0, b), (0, c), (1, a), (1, b), (1, c)\}$.

    Proposition 1.4. $(A \cup B) \times C = (A \times C) \cup (B \times C)$.

    Proof. Let $(x, y) \in (A \cup B) \times C$, that is $x \in A \cup B$ and $y \in C$. Then either $x \in A$ or $x \in B$. Thus $(x, y) \in A \times C$ or $(x, y) \in B \times C$ respectively. In both cases $(x, y) \in (A \times C) \cup (B \times C)$.

    Conversely, if $(x, y) \in (A \times C) \cup (B \times C)$, then either $(x, y) \in A \times C$ or $(x, y) \in B \times C$, so that either $x \in A$ or $x \in B$ with $y \in C$, which means that $x \in A \cup B$ and $y \in C$, as required.

    Power sets. For any set $X$ the power set $P(X)$ is the set of all subsets of $X$.

    If $X$ has $n$ elements, then $P(X)$ will have $2^n$ elements.

    Example. If $X = \{0, 1\}$, then $P(X) = \{\emptyset, \{0\}, \{1\}, \{0, 1\}\}$.
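    The count of subsets can be checked by listing them. The following is a minimal sketch; the helper name power_set is ours, not part of the notes.

```python
from itertools import combinations

def power_set(X):
    """Return all subsets of X as a list of frozensets (hypothetical helper)."""
    items = list(X)
    return [frozenset(c)
            for r in range(len(items) + 1)      # subset sizes 0, 1, ..., n
            for c in combinations(items, r)]    # all subsets of size r

P = power_set({0, 1})
print(len(P))                        # number of subsets of {0, 1}
print(len(power_set(range(5))))      # a 5-element set has 2**5 subsets
```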

    3. Functions

    3.1. Preliminaries. Let X and Y be sets.

    Definition 1.5. A function (or mapping) from $X$ to $Y$ is a rule which assigns to each element $x \in X$ a unique element $y \in Y$. Notation: $y = f(x)$ and $f : X \to Y$.

    The element $y$ is called the image of $x$, and $x$ is called a pre-image of $y$. The set $R(f) = \{y \in Y : y = f(x), x \in X\}$ is called the image (or range) of the function.

    For any set $X$ the notation $\mathrm{id}_X$ stands for the function which does nothing to $x \in X$; it is called the identity on $X$.

    We have the following operations on functions:

    Definition 1.6. Let $f : X \to Y$ and $g : Y \to Z$. The function $(g \circ f)(x) = g(f(x))$ is called the composition of $f$ and $g$.

    If $f : X \to Y$ and $g : Y \to X$ are functions such that $g \circ f = \mathrm{id}_X$, then $g$ is called a left inverse of the function $f$. If $f \circ g = \mathrm{id}_Y$, then $g$ is a right inverse of $f$. A function $g$ which is both a left and a right inverse is said to be a two-sided inverse, or simply inverse.

    Note the obvious identities: $\mathrm{id}_Y \circ f = f$, $f \circ \mathrm{id}_X = f$, and $f \circ (g \circ h) = (f \circ g) \circ h$.

    Example. Let $X = \{1, 2, 3\}$, $Y = \{a, b\}$. Define $f$ by $f(1) = a$, $f(2) = f(3) = b$. Define $g$ by $g(a) = 1$, $g(b) = 3$. Then $(f \circ g)(a) = a$ and $(f \circ g)(b) = b$, so that $g$ is a right inverse.

    On the other hand $(g \circ f)(2) = g(b) = 3$, so $g$ is not a left inverse!

    Note that the right inverse is not unique. Indeed, let $h : Y \to X$ be defined by $h(a) = 1$, $h(b) = 2$. Then $(f \circ h)(a) = a$ and $(f \circ h)(b) = b$, so that $h$ is another right inverse.
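    The example with two different right inverses can be replayed with finite functions stored as dictionaries. This is only an illustrative sketch; the helper names compose and is_identity are ours.

```python
# Finite functions as dictionaries: f : X -> Y, and g, h : Y -> X from the example.
f = {1: "a", 2: "b", 3: "b"}
g = {"a": 1, "b": 3}
h = {"a": 1, "b": 2}

def compose(outer, inner):
    """Return outer ∘ inner as a dictionary defined on the domain of inner."""
    return {x: outer[inner[x]] for x in inner}

def is_identity(m):
    """True if the dictionary maps every key to itself."""
    return all(key == value for key, value in m.items())

print(is_identity(compose(f, g)))   # f ∘ g = id_Y, so g is a right inverse
print(is_identity(compose(f, h)))   # h is another right inverse
print(is_identity(compose(g, f)))   # g ∘ f sends 2 to 3, so g is not a left inverse
```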

    Example. Let $f(x) = e^x$, $f : \mathbb{R} \to \mathbb{R}$. To construct an inverse, define on $\mathbb{R}_+ = \{t \in \mathbb{R} : t > 0\}$ the function $h(t) = \log t$ as the unique number $x \in \mathbb{R}$ such that $e^x = t$. By this definition $(h \circ f)(x) = \log(e^x) = x$ for all $x \in \mathbb{R}$, so that $h$ is a left inverse of $e^x$. If we want to make this function the right inverse, we need to re-define $f$ as a function from $\mathbb{R}$ into $\mathbb{R}_+$. Then $(f \circ h)(t) = e^{\log t} = t$ for all $t > 0$.

    3.2. Types of functions. For applications it is crucial to be able to find inverses. This leads us to the following definition:

    Definition 1.7. A function $f : X \to Y$ is said to be injective (or an injection) if for all $a, b \in X$ with $a \ne b$ one has $f(a) \ne f(b)$. In other words, if $f(a) = f(b)$, then $a = b$, i.e. each $y \in Y$ has at most one pre-image.

    It is said to be surjective (or a surjection) if for every $y \in Y$ there is at least one pre-image, i.e. an element $x \in X$ such that $f(x) = y$.

    The function is said to be bijective (or a bijection) if it is both injective and surjective.

    Theorem 1.8 (Inverses Theorem). Let $f : X \to Y$ be a function between non-empty sets $X, Y$. Then

    (1) $f$ has a left inverse iff $f$ is injective;
    (2) $f$ has a right inverse iff $f$ is surjective;
    (3) $f$ has a two-sided inverse iff $f$ is bijective.

    Proof. (1) Let $f$ have a left inverse, that is $g \circ f = \mathrm{id}_X$. Then from $f(a) = f(b)$ we infer that $a = g(f(a)) = g(f(b)) = b$, i.e. $f$ is injective.

    Let $f$ be injective. For each $y \in R(f)$ define $g(y) = x$, where $x$ is the uniquely defined element such that $f(x) = y$; for $y \notin R(f)$ let $g(y)$ be any fixed element of $X$ (recall that $X$ is non-empty). Then $g \circ f = \mathrm{id}_X$.

    (2) Let $f$ have a right inverse, i.e. $f \circ h = \mathrm{id}_Y$. Thus for each $y \in Y$ we have $f(x) = y$ with $x = h(y)$, so that $y$ has a pre-image, as required.

    Suppose that $f$ is surjective, i.e. each $y$ has at least one pre-image. Choose one and denote it by $h(y)$. Then $f(h(y)) = y$ by construction.

    (3) If $f$ has a two-sided inverse, then it is injective and surjective by (1), (2). Conversely, if $f$ is bijective, then by (1) and (2) we know that $f$ has left and right inverses, $g$ and $h$. Thus it remains to show that $g = h$. Write:

    $g = g \circ \mathrm{id}_Y = g \circ (f \circ h) = (g \circ f) \circ h = \mathrm{id}_X \circ h = h$.

    3.3. Calculating inverses. Let $f(x) = x^2$, $f : \mathbb{R} \to \mathbb{R}_+$. Define $g(t) = \sqrt{t}$ for all $t \ge 0$. Then

    $(f \circ g)(t) = (\sqrt{t})^2 = t$,

    so that we have a right inverse. Left inverse? It does not exist, since the function is not injective! The rule: reflect the graph of the function in the line $y = x$ and throw away the part which prevents it from being a function!

    For an injective function we do what we did for the exponential.

    3.4. Countability.

    Definition 1.9. A set $X$ is countable if there exists a bijection between $X$ and $\mathbb{N}$.

    Sometimes finite sets are also called countable.

    Theorem 1.10. $\mathbb{Z}$ is countable.

    Proof. For each $n \in \mathbb{Z}$ define

    $F(n) = \begin{cases} 2n, & n > 0; \\ 2(-n) + 1, & n \le 0. \end{cases}$

    It is bijective, since it has a two-sided inverse:

    $g(m) = \begin{cases} m/2, & m \text{ even}; \\ (1 - m)/2, & m \text{ odd}. \end{cases}$
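    The maps $F$ and $g$ from the proof of Theorem 1.10 can be checked numerically on a finite range. The sketch below only tests finitely many values, so it illustrates the two-sided inverse property rather than proving it.

```python
def F(n):
    """The map Z -> N from the proof: positive n go to even numbers,
    non-positive n go to odd numbers."""
    return 2 * n if n > 0 else 2 * (-n) + 1

def g(m):
    """The claimed two-sided inverse N -> Z."""
    return m // 2 if m % 2 == 0 else (1 - m) // 2

# First few values: 0, 1, -1, 2, -2 are sent to 1, 2, 3, 4, 5
print([F(n) for n in (0, 1, -1, 2, -2)])

left_ok = all(g(F(n)) == n for n in range(-50, 51))    # g ∘ F = id on a sample of Z
right_ok = all(F(g(m)) == m for m in range(1, 101))    # F ∘ g = id on a sample of N
print(left_ok, right_ok)
```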

    Theorem 1.11. The set $\mathbb{N} \times \mathbb{N}$ is countable.

    Proof. Arrange the pairs $(m, n)$, $m, n \in \mathbb{N}$, in a table, and count them.

    As a corollary, the set $\mathbb{Q}$ is countable as well.

    However, the set of real numbers $\mathbb{R}$ is not countable!


    4. Permutations and Groups

    4.1. Permutations, cycles.

    Definition 1.12. Let $n \in \mathbb{N}$. A permutation of degree $n$ is a bijection $\sigma : \{1, 2, \dots, n\} \to \{1, 2, \dots, n\}$. Notation: $S_n$ is the set of all permutations of degree $n$.

    The set $S_n$ is called the symmetric group of degree $n$.

    Example. (1) The only permutation of degree 1 is the function that takes 1 into 1.

    (2) If $n = 2$, there are two permutations.

    (3) If $n = 3$, there are 6 permutations. In general, there are $n!$ permutations of degree $n$.

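    We often use the notation

    $\begin{pmatrix} 1 & 2 & \dots & n \\ \sigma(1) & \sigma(2) & \dots & \sigma(n) \end{pmatrix}$.

    For example, $S_3$ contains the element

    $\begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix}$.

    If $\sigma, \tau \in S_n$ are two permutations of the same degree $n$, we can construct their composition $\sigma\tau = \sigma \circ \tau$ (the right-hand factor is applied first). Since both $\sigma$ and $\tau$ are bijections, the result is again a permutation. For example:

    $\begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 1 & 3 & 2 & 5 & 4 \end{pmatrix} \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 1 & 4 & 5 & 3 \end{pmatrix} = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 3 & 1 & 5 & 4 & 2 \end{pmatrix}$.

    Note that in general $\sigma\tau \ne \tau\sigma$.

    Since permutations are bijections, they have inverses. The inversion is done by swapping the rows and re-arranging the columns so that the top row is back to $1, 2, \dots, n$. For example,

    $\begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 1 & 4 & 5 & 3 \end{pmatrix}^{-1} = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 1 & 5 & 3 & 4 \end{pmatrix}$.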
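    The composition and inversion rules above can be sketched in a few lines. As an assumption of this sketch, a permutation is stored as the tuple of its bottom row, so that sigma[i - 1] is the image of i; the function names are ours.

```python
def compose(sigma, tau):
    """Return sigma ∘ tau, i.e. the map i -> sigma(tau(i)).
    Permutations are bottom rows with 1-based values."""
    return tuple(sigma[tau[i] - 1] for i in range(len(tau)))

def inverse(sigma):
    """Swap the two rows and sort the columns by the new top row."""
    inv = [0] * len(sigma)
    for i, image in enumerate(sigma, start=1):
        inv[image - 1] = i          # sigma sends i to image, so the inverse sends image to i
    return tuple(inv)

sigma = (1, 3, 2, 5, 4)
tau = (2, 1, 4, 5, 3)
print(compose(sigma, tau))   # (3, 1, 5, 4, 2), as in the example above
print(compose(tau, sigma))   # a different permutation: composition is not commutative
print(inverse(tau))          # (2, 1, 5, 3, 4), as in the example above
```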

    Every permutation can be represented as a composition of cycles:

    Definition 1.13. Let $a_1, a_2, \dots, a_r \in \{1, 2, \dots, n\}$, $r \le n$, be distinct numbers. The permutation $\sigma$ defined by

    $\sigma(a_1) = a_2, \ \sigma(a_2) = a_3, \ \dots, \ \sigma(a_r) = a_1$,

    and $\sigma(m) = m$ if $m \notin \{a_1, a_2, \dots, a_r\}$, is called a cycle of length $r$ and it is denoted $(a_1 a_2 \dots a_r)$.

    Example. (1) In $S_6$:

    $(1435) = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 4 & 2 & 5 & 3 & 1 & 6 \end{pmatrix}$.

    (2) Not all permutations are cycles, but all of them can be represented as a composition of cycles:

    $\begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 1 & 5 & 3 & 4 \end{pmatrix} = (12)(354)$.

    (3) Another observation: cycles can be written in more than one way: $(123) = (231)$.

    (4) A cycle is a convenient object to take a power of. Indeed $(354)^3 = \mathrm{id}$, so that $(354)^{24} = \mathrm{id}$ and $(354)^{26} = (354)^2 = (345)$. Similarly, $(1534)^{246} = (1534)^2 = (13)(45)$.

    Definition 1.14. Two cycles $(a_1 a_2 \dots a_r)$ and $(b_1 b_2 \dots b_s)$ are called disjoint if they have no common elements.

    For example, $(12)$ and $(56)$ are disjoint, but $(12)$ and $(158)$ are not.

    Proposition 1.15. If $\sigma = (b_1 b_2 \dots b_s)$ and $\tau = (a_1 a_2 \dots a_r)$ are disjoint cycles, then $\sigma\tau = \tau\sigma$.

    Proof. We want to show that $(\sigma\tau)(m) = (\tau\sigma)(m)$ for each $m \in \{1, 2, \dots, n\}$.

    Suppose that $m \in \{a_1, a_2, \dots, a_r\}$, so that $m \notin \{b_1, b_2, \dots, b_s\}$. Thus $\sigma(m) = m$ and $\tau(m) \notin \{b_1, b_2, \dots, b_s\}$, so that $\sigma(\tau(m)) = \tau(m)$ and $\tau(\sigma(m)) = \tau(m)$.

    The same is true if $m \in \{b_1, b_2, \dots, b_s\}$.

    Now assume that $m$ is in neither of the two sets, so that $\sigma(m) = \tau(m) = m$, and the claimed identity is trivial.

    As a consequence, $(\sigma\tau)^k = \sigma^k \tau^k$ for any two disjoint cycles.

    Theorem 1.16. Any permutation can be represented as a composition of disjoint cycles.

    Proof. Start with 1 and write $(1, \sigma(1), \sigma^2(1), \dots, \sigma^k(1))$, where $k$ is the minimal number such that $\sigma^{k+1}(1) = 1$. Clearly, $k \le n$. Now take the minimal $m$ which is not in the above set and repeat the procedure: $(m, \sigma(m), \sigma^2(m), \dots, \sigma^l(m))$, where $l$ is the smallest number such that $\sigma^{l+1}(m) = m$. This cycle is disjoint from the previous one. Continue until all elements are exhausted.

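    Example. (1) Let

    $\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ 7 & 6 & 5 & 4 & 3 & 2 & 1 \end{pmatrix}$.

    Then $\sigma(1) = 7$, $\sigma(7) = 1$, so we have a cycle $(17)$. Next, $\sigma(2) = 6$, $\sigma(6) = 2$, so $(26)$. Then $\sigma(3) = 5$, $\sigma(5) = 3$, so $(35)$. Finally $\sigma(4) = 4$, and hence $\sigma = (17)(26)(35)$.

    (2)

    $\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ 3 & 2 & 5 & 8 & 1 & 7 & 6 & 9 & 4 \end{pmatrix} = (135)(489)(67)$.

    (3)

    $\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 \\ 5 & 1 & 3 & 2 & 4 & 6 \end{pmatrix} = (1542)$.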
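    The procedure from the proof of Theorem 1.16 translates directly into code. In this sketch a permutation is again stored as the tuple of its bottom row (an assumption of the sketch), and cycles of length 1 are omitted from the output, as in the examples.

```python
def cycles(sigma):
    """Decompose sigma (bottom row, 1-based values) into disjoint cycles.
    Follows the proof: start from the smallest unused element and iterate sigma."""
    seen = set()
    result = []
    for start in range(1, len(sigma) + 1):
        if start in seen:
            continue
        cycle = [start]
        seen.add(start)
        nxt = sigma[start - 1]
        while nxt != start:             # stop when the orbit returns to its start
            cycle.append(nxt)
            seen.add(nxt)
            nxt = sigma[nxt - 1]
        if len(cycle) > 1:              # fixed points are not written down
            result.append(tuple(cycle))
    return result

print(cycles((7, 6, 5, 4, 3, 2, 1)))          # [(1, 7), (2, 6), (3, 5)]
print(cycles((3, 2, 5, 8, 1, 7, 6, 9, 4)))    # [(1, 3, 5), (4, 8, 9), (6, 7)]
print(cycles((5, 1, 3, 2, 4, 6)))             # [(1, 5, 4, 2)]
```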

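    Now to raise a permutation $\sigma$ to power $k$, represent it as a product (composition) of disjoint cycles $\sigma = \sigma_1 \sigma_2 \dots \sigma_p$, and then write

    $\sigma^k = \sigma_1^k \sigma_2^k \dots \sigma_p^k$.

    Example.

    $\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ 2 & 4 & 5 & 9 & 3 & 8 & 6 & 7 & 1 \end{pmatrix}^{1234} = (1249)^{1234} (35)^{1234} (687)^{1234} = (1249)^2 \, \mathrm{id} \, (687) = (14)(29)(687)$.

    Here we used that $1234$ leaves remainders 2, 0 and 1 when divided by the cycle lengths 4, 2 and 3 respectively.

    Definition 1.17. For a permutation $\sigma$ the order of $\sigma$ is the smallest number $k \ge 1$ such that $\sigma^k = \mathrm{id}$.

    For a cycle the order is its length. Clearly, such a number exists, since every permutation is a composition of cycles.

    Theorem 1.18. If $\sigma = \sigma_1 \dots \sigma_p$ with disjoint cycles of lengths $l_1, \dots, l_p$, then the order of the permutation $\sigma$ equals the least common multiple of the lengths.

    Proof. Let $l = \mathrm{lcm}(l_1, l_2, \dots, l_p)$. Then $\sigma^l = \sigma_1^l \dots \sigma_p^l = \mathrm{id}$, since $\sigma_j^l = \mathrm{id}$ for all $j = 1, 2, \dots, p$.

    Let us prove the converse, i.e. that $\sigma^k \ne \mathrm{id}$ for all $1 \le k < l$. Suppose that $\sigma^k = \mathrm{id}$ for some $k < l$. Since $k$ is not a common multiple of the lengths, one of the $\sigma_j^k$ is not $\mathrm{id}$. Assume that $\sigma_1^k \ne \mathrm{id}$. Then there is an $x \in \{1, 2, \dots, n\}$ such that $\sigma_1^k(x) \ne x$ and, since the cycles are disjoint, $\sigma_2^k \dots \sigma_p^k(x) = x$, and hence $\sigma^k(x) \ne x$, which gives a contradiction.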
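    Theorem 1.18 and the power rule can be checked on the example with exponent 1234. The sketch below stores a permutation as the tuple of its bottom row (an assumption of the sketch); the function names are ours, and the power is computed by brute-force iteration after reducing the exponent modulo the order.

```python
from math import gcd

def cycle_lengths(sigma):
    """Lengths of the disjoint cycles of sigma (bottom row, 1-based values),
    including cycles of length 1."""
    seen, lengths = set(), []
    for start in range(1, len(sigma) + 1):
        if start not in seen:
            length, x = 0, start
            while x not in seen:
                seen.add(x)
                x = sigma[x - 1]
                length += 1
            lengths.append(length)
    return lengths

def order(sigma):
    """Order of sigma as the least common multiple of its cycle lengths."""
    result = 1
    for length in cycle_lengths(sigma):
        result = result * length // gcd(result, length)
    return result

def power(sigma, k):
    """sigma^k for k >= 0, using sigma^k = sigma^(k mod order)."""
    result = tuple(range(1, len(sigma) + 1))        # identity
    for _ in range(k % order(sigma)):
        result = tuple(sigma[i - 1] for i in result)  # sigma ∘ result
    return result

sigma = (2, 4, 5, 9, 3, 8, 6, 7, 1)    # (1249)(35)(687)
print(order(sigma))                    # lcm(4, 2, 3) = 12
print(power(sigma, 1234))              # bottom row of (14)(29)(687)
```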


    4.2. Transpositions.

    Definition 1.19. A transposition is a cycle of length 2.

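    Example: $(12)$. Every permutation can be written as a product of transpositions. To see this, decompose $\sigma$ as a product of disjoint cycles: $\sigma = \sigma_1 \sigma_2 \dots \sigma_p$. So, it suffices to represent each cycle as a product of transpositions:

    $(a_1 a_2 \dots a_r) = (a_1 a_2)(a_2 a_3) \dots (a_{r-1} a_r)$.

    The number of transpositions in this representation is $r - 1$.

    Example.

    $\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ 3 & 2 & 5 & 8 & 1 & 7 & 6 & 9 & 4 \end{pmatrix} = (135)(489)(67) = (13)(35)(48)(89)(67)$.

    $\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ 2 & 4 & 5 & 9 & 3 & 8 & 6 & 7 & 1 \end{pmatrix} = (1249)(35)(687) = (12)(24)(49)(35)(68)(87)$.

    Definition 1.20. Let $\sigma$ be a cycle of length $l$. Then the signature of $\sigma$ is defined to be $\varepsilon(\sigma) = (-1)^{l-1}$.

    For a permutation $\sigma$ written as $\sigma = \sigma_1 \sigma_2 \dots \sigma_r$ with disjoint cycles $\sigma_j$ of lengths $l_j$, $j = 1, 2, \dots, r$, the signature is

    $\varepsilon(\sigma) = \prod_{j=1}^{r} \varepsilon(\sigma_j) = (-1)^{\sum_j (l_j - 1)}$.

    A permutation is called even if $\varepsilon(\sigma) = 1$, and odd if $\varepsilon(\sigma) = -1$.

    From the construction it is clear that for any two permutations $\sigma, \tau$

    $\varepsilon(\sigma\tau) = \varepsilon(\sigma)\varepsilon(\tau)$.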
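    The signature can be computed straight from Definition 1.20 by summing (length minus 1) over the disjoint cycles. This is a sketch under the assumption that a permutation is stored as the tuple of its bottom row; the function name is ours.

```python
def signature(sigma):
    """Signature of sigma (bottom row, 1-based values):
    (-1) raised to the sum of (cycle length - 1) over the disjoint cycles."""
    seen = set()
    exponent = 0
    for start in range(1, len(sigma) + 1):
        if start in seen:
            continue
        length, x = 0, start
        while x not in seen:            # walk once around the cycle through start
            seen.add(x)
            x = sigma[x - 1]
            length += 1
        exponent += length - 1          # a cycle of length l is l - 1 transpositions
    return -1 if exponent % 2 else 1

print(signature((3, 2, 5, 8, 1, 7, 6, 9, 4)))   # (135)(489)(67): 5 transpositions, odd
print(signature((2, 4, 5, 9, 3, 8, 6, 7, 1)))   # (1249)(35)(687): 6 transpositions, even
```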

    5. Binary operations and Groups

    5.1. Definitions. Let $X$ be a set. A binary operation on $X$ is a function $* : X \times X \to X$. Instead of $*(x, y)$ we write $x * y$.

    Example. (1) $X = \mathbb{Z}$, $x * y = x + y$, the usual addition of integers.

    (2) $X = P(\mathbb{R})$, the power set of $\mathbb{R}$: $A * B = A \cup B$.

    (3) $X = \mathbb{R}$: $x * y = xy$, the usual product. Alternatively, we can take $X = \mathbb{R} \setminus \{0\}$.

    (4) $X = \mathbb{R}$: $x * y = y$.

    (5) $X = S_n$: $\sigma * \tau = \sigma\tau$.

    A binary operation is said to be associative if it satisfies the condition $(x * y) * z = x * (y * z)$ for all $x, y, z \in X$.

    Example. (1) $X = \mathbb{Z}$: $(x + y) + z = x + (y + z)$ for all integers $x, y, z$.

    (2) Same for examples 2, 3, 4, 5 above. For example 4:

    $(x * y) * z = z, \quad x * (y * z) = x * z = z$,

    so it is also associative.

    (3) $X = \mathbb{R}$, $x * y = (x + y)/2$. Then

    $(x * y) * z = \frac{\frac{x + y}{2} + z}{2} = \frac{x + y + 2z}{4}$,

    and on the other hand,

    $x * (y * z) = \frac{x + \frac{y + z}{2}}{2} = \frac{2x + y + z}{4}$,

    so this operation is not associative.
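    Associativity can be tested by brute force on a finite sample of values. Such a test can refute associativity (as for the averaging operation) but can only support, not prove, it for an infinite set. The sample values and function names below are illustrative choices.

```python
from itertools import product

def is_associative(op, sample):
    """Check (x*y)*z == x*(y*z) for all triples drawn from the sample."""
    return all(op(op(x, y), z) == op(x, op(y, z))
               for x, y, z in product(sample, repeat=3))

sample = [-2, 0, 1, 3]                      # arbitrary test values

def add(x, y):
    return x + y                            # example (1): usual addition

def second(x, y):
    return y                                # example (4): x * y = y

def mean(x, y):
    return (x + y) / 2                      # example (3): the average

print(is_associative(add, sample))      # no counterexample in the sample
print(is_associative(second, sample))   # no counterexample in the sample
print(is_associative(mean, sample))     # a counterexample exists
```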

    An element $x \in X$ is called an identity element for the operation $*$ if it satisfies $x * y = y * x = y$ for all $y \in X$.

    Example. (1) $X = \mathbb{Z}$: for the operation $x * y = x + y$ the number 0 is the identity element.

    (2) $X = \mathbb{R}$ with $x * y = xy$: the identity is 1.

    (3) $X = P(\mathbb{R})$, $A * B = A \cup B$. The identity is $\emptyset$.

    (4) $X = S_n$, $\sigma * \tau = \sigma\tau$. The identity is $\mathrm{id}$.

    There is at most one identity element. Indeed, suppose there are two: $e_1, e_2$. Then $e_1 = e_1 * e_2 = e_2$. There may be no identity element at all. For example, let $X = \mathbb{R}$ and $x * y = y$. Suppose $e$ is the identity. Then for every $x$ we have $x = x * e = e$ (while $e * x = x$ holds automatically). Taking $x \ne e$, we have a contradiction.

    From now on we denote the identity by $1_X$.

    Suppose that $*$ is associative, and has an identity. An element $x \in X$ is called an inverse of an element $y \in X$ if $x * y = y * x = 1_X$. An element may have at most one inverse. Indeed, assume that $y$ has two inverses: $x, z$, so that

    $x = x * 1_X = x * (y * z) = (x * y) * z = 1_X * z = z$.

    It might happen that an element has no inverse at all. For example, for $X = P(\mathbb{R})$ and $A * B = A \cup B$ the identity is $\emptyset$, but unless $A = \emptyset$, one can't find a $B$ such that $A \cup B = \emptyset$.

    Definition 1.21. A group is a pair $(G, *)$ where $G$ is a non-empty set and $*$ is a binary operation on $G$ satisfying:

    (1) $*$ is associative;
    (2) There is an identity for $*$;
    (3) Every element $g \in G$ has an inverse.

    If the group has finitely many elements, it is called finite, and the number of elements is called the order of the group, denoted $\#G$.

    Example. (1) $(\mathbb{Z}, +)$ is a group.
    (2) $(\mathbb{R} \setminus \{0\}, \cdot)$ is a group.
    (3) $(S_n, \circ)$ is a group.
    (4) $(\mathbb{R}, *)$, $x * y = y$, is not a group.
    (5) $(P(\mathbb{R}), \cup)$ is not a group.
    (6) $(P(X), \triangle)$ is a group for any set $X$.

    The last example is formulated as a Theorem:

    Theorem 1.22. The pair $(P(X), \triangle)$ is a group.

    Proof. Note first that $\triangle$ is a binary operation on $P(X)$. To prove that it is associative, we can use the truth tables.

    A  B  C   A△B   B△C   (A△B)△C   A△(B△C)
    1  1  1    0     0       1         1
    1  1  0    0     1       0         0
    1  0  1    1     1       0         0
    1  0  0    1     0       1         1
    0  1  1    1     0       0         0
    0  1  0    1     1       1         1
    0  0  1    0     1       1         1
    0  0  0    0     0       0         0

    It shows that $\triangle$ is indeed associative, i.e. $(A \triangle B) \triangle C = A \triangle (B \triangle C)$. Clearly, $\emptyset$ is the identity: $A \triangle \emptyset = \emptyset \triangle A = A$. Each element is its own inverse: $A \triangle A = \emptyset$.

    5.2. Dihedral group $D_8$. This is the group of symmetries of a square. It consists of eight operations which preserve the square: four rotations and four reflections: $\{\mathrm{id}, R_1, R_2, R_3, F_1, F_2, F_3, F_4\}$. Any two symmetries can be composed to give another symmetry, e.g. $R_1 F_2 = F_3$. The inverses of the elements are

    $\mathrm{id}^{-1} = \mathrm{id}, \quad R_1^{-1} = R_3, \quad R_2^{-1} = R_2, \quad R_3^{-1} = R_1$,

    $F_1^{-1} = F_1, \quad F_2^{-1} = F_2, \quad F_3^{-1} = F_3, \quad F_4^{-1} = F_4$.

    If one attaches labels to the vertices of the square in the counterclockwise direction, one can calculate compositions of symmetries by representing them as permutations:

    $R_1 = (1234), \quad R_2 = (13)(24), \quad R_3 = (1432)$,

    $F_1 = (12)(43), \quad F_2 = (13), \quad F_3 = (14)(23), \quad F_4 = (24)$.

    Now it is easy to find the compositions, e.g.

    $F_1 R_2 F_3 R_3 F_4 = (12)(43)\,(13)(24)\,(14)(23)\,(1432)\,(24) = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 4 & 3 & 2 & 1 \end{pmatrix} = (14)(23) = F_3$.

    Now one can draw the multiplication table.

    More generally, a regular $n$-gon has $2n$ symmetries, which form a group called the dihedral group $D_{2n}$ of order $2n$.
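    The composition of the symmetries of the square can be verified mechanically. In the sketch below each symmetry is stored as the bottom row of the corresponding permutation of the vertex labels 1, 2, 3, 4, and products are evaluated right to left, which is the convention that reproduces $R_1 F_2 = F_3$. The helper names are ours.

```python
def perm_from_cycles(cycle_list, n=4):
    """Bottom row of the permutation given as a list of disjoint cycles."""
    sigma = list(range(1, n + 1))
    for cycle in cycle_list:
        for i, a in enumerate(cycle):
            sigma[a - 1] = cycle[(i + 1) % len(cycle)]   # a goes to the next entry
    return tuple(sigma)

def compose(*perms):
    """Product of permutations; the right-most factor is applied first."""
    result = tuple(range(1, len(perms[0]) + 1))
    for p in reversed(perms):
        result = tuple(p[i - 1] for i in result)         # p ∘ result
    return result

R1 = perm_from_cycles([(1, 2, 3, 4)])
R2 = perm_from_cycles([(1, 3), (2, 4)])
R3 = perm_from_cycles([(1, 4, 3, 2)])
F1 = perm_from_cycles([(1, 2), (4, 3)])
F2 = perm_from_cycles([(1, 3)])
F3 = perm_from_cycles([(1, 4), (2, 3)])
F4 = perm_from_cycles([(2, 4)])

print(compose(R1, F2) == F3)                    # R1 F2 = F3
print(compose(F1, R2, F3, R3, F4) == F3)        # the longer product from the text
```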

    5.3. Subgroups.

    Definition 1.23. Let $(G, *)$ be a group. A subset $H \subset G$ is said to be a subgroup if $(H, *)$ is a group. In other words, $H$ is a group with respect to the operation on $G$. For a non-empty $H$ this is equivalent to saying that

    (1) For any $g, h \in H$ we have $g * h \in H$,
    (2) For every $g \in H$ also $g^{-1} \in H$.

    Example. (1) $\{1_G\}$ is a subgroup of any group $G$.

    (2) $G$ is its own subgroup.

    (3) Let $g \in G$. Denote by $\langle g \rangle$ the subgroup $\{g^k, k \in \mathbb{Z}\}$, where $g^0 = 1_G$. If the group $G$ is finite, then

    $\langle g \rangle = \{1_G, g, g^2, \dots, g^{n-1}\}$,

    where $n$ is the smallest number such that $g^n = 1_G$. This subgroup is called cyclic. The element $g$ is called the generating element. The number $n$ is called the order of the subgroup and it is written as $o(g)$.

    (4) The dihedral group $D_{2n}$ is a subgroup of $S_n$.

    6. Equivalence relations and Lagrange's Theorem

    6.1. Definitions.

    Definition 1.24. Let $X$ be a set. A relation on $X$ is a subset $R \subset X \times X$. If $(x, y) \in R$, then we write $xRy$ and say that $x$ is related to $y$.

    Example. (1) $X$ any set and $R = \emptyset$. This relation does not relate any pairs.

    (2) $X$ any set and $R = \{(x, x) : x \in X\}$. Then each $x \in X$ is related to itself.

    (3) $X = P(\mathbb{N})$ and $R = \{(A, B) : A \subset B \subset \mathbb{N}\}$. Then $\emptyset$ is related to any element, and any element is related to $\mathbb{N}$. $\{1\} R \{1, 2\}$, but $\{1\}$ is not related to $\{3\}$.

    (4) $X = \mathbb{R}$, $R = \{(x, y) : x \le y\}$.

    (5) $X = \mathbb{R}$, $R = \{(x, y) : x \ne y\}$.

    We focus on relations satisfying some extra conditions.

    Definition 1.25. A relation $R$ is called reflexive if $xRx$ for any $x \in X$.

    A relation $R$ is called symmetric if $xRy$ implies $yRx$.

    A relation $R$ is called transitive if $xRy$, $yRz$ imply $xRz$.

    A relation $R$ is said to be an equivalence relation if it is reflexive, symmetric and transitive.

    Example. Example (2) above defines an equivalence relation.

    Example (3) is reflexive, transitive, but not symmetric.

    Example (4) is transitive, reflexive, but not symmetric.

    Example (5) is not even reflexive! It is symmetric, and not transitive.
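    The three properties can be tested directly when the underlying set is finite. In the sketch below, examples (2), (4) and (5) are restricted to a small set of integers; this restriction and the function names are assumptions made for illustration.

```python
def is_reflexive(X, R):
    return all((x, x) in R for x in X)

def is_symmetric(R):
    return all((y, x) in R for (x, y) in R)

def is_transitive(R):
    return all((x, w) in R
               for (x, y) in R
               for (z, w) in R
               if y == z)

X = range(-3, 4)    # finite stand-in for the real line
diagonal = {(x, x) for x in X}                       # example (2)
leq = {(x, y) for x in X for y in X if x <= y}       # example (4)
neq = {(x, y) for x in X for y in X if x != y}       # example (5)

for name, R in [("diagonal", diagonal), ("<=", leq), ("!=", neq)]:
    print(name, is_reflexive(X, R), is_symmetric(R), is_transitive(R))
```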

    Definition 1.26. Let $X = \mathbb{Z}$, and let $n \in \mathbb{N}$ be a number. Then we say that two integer numbers $m, p$ are congruent modulo $n$ iff $m - p$ is divisible by $n$. Notation: $m \equiv p \ (n)$ or $m \equiv p \mod n$.

    Proposition 1.27. Congruence modulo $n$ is an equivalence relation.

    Proof. Since $m - m = 0$ is divisible by $n$, we have $m \equiv m \ (n)$. This proves reflexivity.

    Suppose $m \equiv p \ (n)$, i.e. $m = p + nk$ with some $k \in \mathbb{Z}$, so $p - m = -nk$, and hence $p \equiv m \ (n)$. This means symmetry.

    Suppose $m \equiv p \ (n)$ and $p \equiv q \ (n)$, that is $m = p + nk$ and $p = q + nl$ with some integers $k, l$. Therefore $m = q + n(l + k)$, i.e. $m \equiv q \ (n)$. This proves transitivity.

    Examples: $8 \equiv 5 \ (3)$, $31 \equiv 1 \ (10)$ etc.

    6.2. Partitions and equivalence relations.

    Definition 1.28. Let $X$ be a set. A partition of $X$ is a set of subsets $X_j \subset X$ such that every element belongs to exactly one of the $X_j$'s. In other words, $X_j \cap X_k = \emptyset$ if $j \ne k$ and $\bigcup_j X_j = X$.

    Example. (1) If $X = \{1, 4, 5, 8, 6\}$, then the sets $X_1 = \{1, 4, 5\}$ and $X_2 = \{8, 6\}$ form a partition.

    (2) If $X = \mathbb{R}$, then the sets

    $X_1 = \{x \in \mathbb{R} : x < 0\}, \quad X_2 = \{x \in \mathbb{R} : 0 \le x \le 1\}, \quad X_3 = \{x \in \mathbb{R} : x > 1\}$

    form a partition.

    A remarkable fact is that any equivalence relation induces a partition!

    Definition 1.29. Let $R$ be an equivalence relation on the set $X$. For any $x \in X$ the equivalence class $[x]$ is defined to be the set $\{y \in X : xRy\}$.

    Note that if $xRy$, then $[x] = [y]$.

    Theorem 1.30. The equivalence classes form a partition of $X$.

    Proof. We need to show that (a) every element $x \in X$ belongs to an equivalence class and (b) no element is in two distinct classes.

    (a) This is clear: $x \in [x]$, since $R$ is reflexive.

    (b) Suppose that $x \in [w]$ and $x \in [z]$. Let us show that $[w] = [z]$. We have $wRx$ and $zRx$, and by symmetry $xRz$. Thus by transitivity $wRz$, so that $[w] = [z]$.
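    Theorem 1.30 can be illustrated with congruence modulo $n$ on a finite range of integers. The sketch below groups elements into classes by comparing each one with a representative of every class found so far; the modulus 3, the range and the function name are illustrative choices.

```python
def equivalence_classes(X, related):
    """Split X into classes of the equivalence relation `related`.
    Each new element is compared with one representative per class,
    which is enough because the relation is symmetric and transitive."""
    classes = []
    for x in X:
        for cls in classes:
            if related(x, cls[0]):
                cls.append(x)
                break
        else:
            classes.append([x])     # x starts a new class
    return classes

n = 3
X = list(range(-6, 7))
classes = equivalence_classes(X, lambda m, p: (m - p) % n == 0)
for cls in classes:
    print(cls)
```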

    6.3. Lagrange's Theorem.

    Theorem 1.31. Let $H$ be a subgroup of a finite group $G$. Then $\#H$ divides $\#G$.

    Proof. Let us define an equivalence relation on the group and prove that each equivalence class contains $\#H$ elements. This will mean that $n \cdot \#H = \#G$, where $n$ is the number of equivalence classes.

    For any $g, g' \in G$ we say that $g \sim g'$ iff $g g'^{-1} \in H$. It is reflexive, since for any $g$ we have $g g^{-1} = I \in H$. It is symmetric, since under the condition $g g'^{-1} = h \in H$ we also have $g' g^{-1} = h^{-1} \in H$. For transitivity assume that $g_1 \sim g_2$, $g_2 \sim g_3$, so that $g_1 = h_1 g_2$ and $g_2 = h_2 g_3$ with $h_1, h_2 \in H$, so that $g_1 = h_1 h_2 g_3$, whence $g_1 g_3^{-1} \in H$. Thus $\sim$ is an equivalence relation.

    Let us show that each equivalence class has $\#H$ elements. For $g \in G$ define $f : H \to [g]$ as follows: $f(h) = hg$. For each $g' \sim g$ we have $g' = hg$ with some $h \in H$, so this map is a surjection. Assume that $f(h) = f(h')$, that is $hg = h'g$. Multiplying by $g^{-1}$ on the right we get $h = h'$. Therefore this map is an injection. Thus we have a bijection, and hence the number of elements in $[g]$ is exactly $\#H$.

    Corollary 1.32. For any $g \in G$ the order $o(g)$ divides $\#G$. If $\#G = n$, then $g^n = I$.

    Proof. The set $H = \langle g \rangle$ is a cyclic subgroup with $o(g) = \#H := m$. By Lagrange's Theorem $o(g)$ divides $\#G$, i.e. $n = mk$ with some natural $k$. Furthermore, $g^n = (g^m)^k = I^k = I$.

    Corollary 1.33. If $\#G$ is a prime number, then $G$ is a cyclic group.

    6.4. Applications of Lagrange's Theorem.

    (1) Subgroups of $S_3$. The group has 6 elements. The orders of possible subgroups are 1, 2, 3, 6.

        (a) The subgroup of order 1 is $\{\mathrm{id}\}$.

        (b) Order 2. The generating element has order 2, and hence it is a transposition. Thus the possible subgroups are $\{\mathrm{id}, (12)\}$, $\{\mathrm{id}, (23)\}$, $\{\mathrm{id}, (13)\}$.

        (c) Order 6: the group itself: $S_3$.

        (d) Order 3. Subgroups are cyclic, since 3 is a prime. Thus the generating element has order 3. The only elements of order 3 are cycles of length 3, and there are only two such cycles in $S_3$: $(123)$ and $(132)$. Moreover, one of them is the square of the other: $(123)^2 = (132)$. Therefore the subgroup is $\{\mathrm{id}, (123), (132)\}$.

    (2) Subgroups of $D_{10}$ (group of symmetries of a regular pentagon). Since $10 = 2 \cdot 5$, the subgroups other than $\{\mathrm{id}\}$ and $D_{10}$ itself have order 2 or 5. Both types are cyclic, since 2 and 5 are both prime.

    (3) Subgroups of $D_8$. Divisors are 1, 2, 4, 8. Recall that the group $D_8$ consists of 8 elements:

    $\mathrm{id}, \quad R_1 = (1234), \quad R_2 = (13)(24), \quad R_3 = (1432)$,

    $F_1 = (12)(43), \quad F_2 = (13), \quad F_3 = (14)(23), \quad F_4 = (24)$.

        (a) Order 1: $\{\mathrm{id}\}$.

        (b) Order 8: $D_8$.

        (c) Order 2: this is a cyclic subgroup generated by an element of order 2. There are five elements of order 2: the rotation $R_2 = (13)(24)$ and the reflections $F_1 = (12)(34)$, $F_2 = (13)$, $F_3 = (14)(23)$, $F_4 = (24)$. Thus we have subgroups $\{\mathrm{id}, R_2\}$, $\{\mathrm{id}, F_1\}$, $\{\mathrm{id}, F_2\}$, $\{\mathrm{id}, F_3\}$, $\{\mathrm{id}, F_4\}$.

        (d) Order 4: this is not a prime, so the subgroup is not necessarily cyclic. We need to consider two cases:

            (i) Suppose that $H$ is cyclic. Thus it is generated by an element of order 4. The group $D_8$ contains only two elements of order 4: $R_1$ and $R_3 = R_1^3$. Both these elements generate the same subgroup $\{\mathrm{id}, R_1, R_1^2, R_1^3\}$ of order 4.

            (ii) Suppose that $H$ is not cyclic, so that no element of $H$ has order 4, and hence they may have only orders 1 or 2. The elements of order 2 are

            $R_2, F_1, F_2, F_3, F_4$.

            However, we cannot put them arbitrarily in groups of four. The result may not be a group. For instance, if we take $\{\mathrm{id}, F_2, F_3, F_4\}$, then $F_2 F_4 = R_2$, and hence the group operation does not preserve this set! We have to be more careful and need to find some extra features of the subgroup which would allow us to pick the right elements. Write the sought subgroup in the form $\{\mathrm{id}, A, B, C\}$, where $A, B, C$ are of order 2, so that $A^2 = B^2 = C^2 = \mathrm{id}$. Since $A, B, C$ are distinct and none of them is the identity, the product $AB$ cannot be $A$, $B$ or $\mathrm{id}$, so $AB = C$, and in the same way $BA = C$. Thus $AB = BA$, i.e. the elements commute with each other, and the subgroup is commutative. Hence we need to look for pairs of commuting elements. Among the reflections, the only commuting pairs are $F_2, F_4$ and $F_1, F_3$, since $F_1 F_3 = F_3 F_1 = F_2 F_4 = F_4 F_2 = R_2$. Also, $R_2$ commutes with every reflection. Thus the possible subgroups have the form $\{\mathrm{id}, R_2, F_1, F_3\}$, $\{\mathrm{id}, R_2, F_2, F_4\}$.
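    The list of subgroups of $D_8$ can be cross-checked by brute force: a non-empty finite subset of a group that is closed under the operation is a subgroup, so it is enough to test all 255 non-empty subsets for closure. The sketch below stores each symmetry as the bottom row of its permutation; the dictionary and function names are ours. Note that the search also finds the subgroup $\{\mathrm{id}, R_2\}$ of order 2, generated by the rotation through 180 degrees.

```python
from itertools import combinations

# The eight symmetries of the square as bottom rows of permutations of 1..4.
D8 = {
    "id": (1, 2, 3, 4),
    "R1": (2, 3, 4, 1),   # (1234)
    "R2": (3, 4, 1, 2),   # (13)(24)
    "R3": (4, 1, 2, 3),   # (1432)
    "F1": (2, 1, 4, 3),   # (12)(43)
    "F2": (3, 2, 1, 4),   # (13)
    "F3": (4, 3, 2, 1),   # (14)(23)
    "F4": (1, 4, 3, 2),   # (24)
}
names = {perm: name for name, perm in D8.items()}

def compose(p, q):
    """p ∘ q: apply q first, then p."""
    return tuple(p[i - 1] for i in q)

def is_subgroup(H):
    """A non-empty finite subset closed under composition is a subgroup."""
    return all(compose(a, b) in H for a in H for b in H)

subgroups = [
    frozenset(H)
    for size in range(1, len(D8) + 1)
    for H in combinations(D8.values(), size)
    if is_subgroup(frozenset(H))
]

orders = sorted(len(H) for H in subgroups)
print(orders)                                   # every order divides 8
for H in subgroups:
    if len(H) == 4:
        print(sorted(names[p] for p in H))      # the three subgroups of order 4
```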

    6.5. More examples of subgroups.

    (1) Let $G = \mathbb{Z}$ with usual addition. Then the set $\mathbb{N}$ is not a subgroup. Indeed, $0 \notin \mathbb{N}$.

    (2) $G = \mathbb{R}$ with addition. The set $\mathbb{Z}$ is a subgroup.

    (3) $G = \mathbb{R} \setminus \{0\}$ with multiplication. Let $H = \{x \in G : x^2 \in \mathbb{Q}\}$. Let us check that the group operation preserves the set: for any $x, y \in H$ we have $(xy)^2 = x^2 y^2 \in \mathbb{Q}$, since $x^2, y^2 \in \mathbb{Q}$. Now, $1 \in H$, since $1^2 \in \mathbb{Q}$. For any $x \in H$ we also have $x^{-2} \in \mathbb{Q}$, so that the inverse $x^{-1}$ is in $H$. This proves that $H$ is a subgroup.

    (4) Again $G = \mathbb{Z}$ with addition. This is a group generated by one element: $n_0 = 1$. Indeed: $n_0 * n_0 = 2$, $n_0 * n_0 * n_0 = 3$ and $n_0^l = l$. Similarly, $n_0^{-1} = -1$, $n_0^{-1} * n_0^{-1} = -2$ and $n_0^{-k} = -k$.

    (5) $G = \mathbb{R} \setminus \{0\}$ with multiplication. The set $H = \{1, -1\}$ is a cyclic subgroup generated by one element: $-1$.

    (6) Let $A_n = \{\sigma \in S_n : \varepsilon(\sigma) = 1\}$, i.e. the set of all even permutations of degree $n$. This is called the alternating group. Check that it is a group.

  • CHAPTER 2

    Elementary Number Theory

    1. Basic definitions and concepts

    We are working with the set of integers now: Z.

    1.1. Divisibility and primes.

    Definition 2.1. If $m, n$ are integers and there is an integer $k$ such that $m = kn$, then we say that $n$ divides $m$ (or, more precisely, $n$ is a factor, or divisor, of $m$) and write $n \mid m$. If $n$ is not a factor of $m$, then we say that $n$ does not divide $m$ and write $n \nmid m$.

    An integer $m > 1$ whose only positive divisors are 1 and $m$ is called a prime number, or simply a prime.

    An integer $n > 1$ which is not a prime number is said to be composite.

    Example: $2 \mid 6$, $13 \mid (-39)$, but $3 \nmid 10$. Note that $0 \mid 0$.

    Note some immediate consequences of the above definition:

    If $b \mid a$ and $c \mid b$, then $c \mid a$.
    If $b \mid a$, then $bc \mid ac$ for any $c$.
    If $c \mid a$ and $c \mid b$, then $c \mid ma + nb$ for any $m, n$.

    A basic fact is the following division algorithm: for any pair of numbers $a \in \mathbb{Z}$, $b \geq 1$, there exists a uniquely defined pair $q, r$ such that

    (2.1) $a = bq + r, \quad 0 \leq r < b.$

    The number $r$ is called the remainder of division of $a$ by $b$.
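The division algorithm (2.1) can be sketched in Python as follows. The function name `division_algorithm` is our own label, not part of the notes. For $b \geq 1$ the built-in `divmod` already returns a remainder with $0 \leq r < b$, also when $a$ is negative.

```python
def division_algorithm(a: int, b: int) -> tuple:
    """Return the pair (q, r) with a == b*q + r and 0 <= r < b, as in (2.1)."""
    if b < 1:
        raise ValueError("b must be at least 1")
    # Floor division gives a non-negative remainder whenever b is positive.
    q, r = divmod(a, b)
    return q, r


print(division_algorithm(356, 24))  # (14, 20)
print(division_algorithm(-7, 3))    # (-3, 2)
```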

    Theorem 2.2. Every positive integer, except 1, either is prime or can be represented as a product of primes. Moreover, this representation is unique in the following sense: if

    $n = p_1^{a_1} p_2^{a_2} \cdots p_k^{a_k}, \quad a_j > 0, \ j = 1, 2, \dots, k, \quad p_1 < p_2 < \cdots < p_k,$

    with prime factors $p_1, p_2, \dots, p_k$, and at the same time,

    $n = q_1^{b_1} q_2^{b_2} \cdots q_s^{b_s}, \quad b_j > 0, \ j = 1, 2, \dots, s, \quad q_1 < q_2 < \cdots < q_s,$

    with prime factors $q_1, q_2, \dots, q_s$, then

    $s = k; \quad q_j = p_j, \quad b_j = a_j, \quad j = 1, 2, \dots, k.$

    Proof. Let $n > 1$. If $n$ is prime, then there is nothing to prove. Suppose that $n$ has divisors strictly between 1 and $n$. Let $m$ be the least of these divisors. We claim that $m$ is prime. Indeed, if this were not the case, then there would be a number $l$, $1 < l < m$, such that $l \mid m$, which would imply that $l \mid n$. This contradicts the definition of $m$.

    Hence $n$ is divisible by a prime number, say $p_1$:

    $n = p_1 n_1, \quad 1 < n_1 < n.$

    Now, either $n_1$ is prime, in which case the proof is complete, or it is divisible by a prime smaller than $n_1$, which we denote $p_2$:

    $n = p_1 p_2 n_2, \quad 1 < n_2 < n_1 < n.$

    Repeating this procedure, we obtain a decreasing sequence of numbers $n_1, n_2, \dots, n_k$, all greater than 1. At each step we have the alternative of $n_k$ being prime or not. A decreasing sequence of positive integers is finite, and hence at some stage $n_k$ will be prime, so that

    $n = p_1 p_2 \cdots p_k n_k$

    is a product of primes, as required.

    The proof of uniqueness is omitted.

    Theorem 2.3. There are infinitely many primes.

    Proof. Suppose that there are only $k$ primes $p_1, p_2, \dots, p_k$, i.e. all numbers starting with $p_k + 1$ are composite. Let

    $n = p_1 p_2 \cdots p_k + 1,$

    and let $p$ be a prime divisor of $n$. Then $p$ is one of the numbers $p_1, p_2, \dots, p_k$, so that $p \mid p_1 p_2 \cdots p_k$. Since $p \mid n$, we have

    $p \mid (n - p_1 p_2 \cdots p_k).$

    In other words, $p \mid 1$, which is absurd, as $p > 1$. This proves the claim.
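The existence part of Theorem 2.2 is constructive: keep splitting off the least divisor greater than 1, which is prime by the argument in the proof. A minimal sketch under that reading is given below; the names `least_divisor` and `factorise` are ours, and trial division is only practical for small numbers. The last lines illustrate the construction in the proof of Theorem 2.3: the number $2 \cdot 3 \cdot 5 \cdot 7 \cdot 11 \cdot 13 + 1$ is not itself prime, but its prime factors are not in the list we started from.

```python
def least_divisor(n: int) -> int:
    """Smallest divisor of n that is greater than 1 (n >= 2); it is always prime."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n  # no divisor up to sqrt(n), so n itself is prime


def factorise(n: int) -> dict:
    """Return {prime: exponent} for n >= 2, following the proof of Theorem 2.2."""
    factors = {}
    while n > 1:
        p = least_divisor(n)
        factors[p] = factors.get(p, 0) + 1
        n //= p
    return factors


print(factorise(360))                          # {2: 3, 3: 2, 5: 1}
print(factorise(2 * 3 * 5 * 7 * 11 * 13 + 1))  # {59: 1, 509: 1}
```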

    1.2. The greatest common divisor.

    Definition 2.4. If $a, b$ are integers, then the largest number $m$ such that $m \mid a$ and $m \mid b$ is called the greatest common divisor of $a, b$. Notation: $d = (a, b)$.

    If $(a, b) = 1$, then the numbers $a, b$ are called relatively prime, or coprime.

    In order to write out some useful formulae relating $a$, $b$ and $(a, b)$ we introduce the following set:

    $S = S(a, b) = \{xa + yb : x \in \mathbb{Z}, \ y \in \mathbb{Z}\}.$

    Note that $(S, +)$ is a group. In particular, the sum of any two numbers from $S$ is again in $S$. Note that $a, b \in S$.

    Lemma 2.5. Let $c$ be the smallest positive number in $S$. Then

    $S = \{nc : n \in \mathbb{Z}\},$

    or, in other words, any number $m \in S$ has the form $nc$ with some $n \in \mathbb{Z}$.

    Proof. It is clear that all numbers $nc$ are in the set $S$. Now we need to show that $S$ contains no other numbers. Suppose that this is not the case, that is, there exists a number $m \in S$ which is not of the form $nc$. Then using (2.1) we can write

    $m = nc + r$

    with some $r$, $0 < r < c$. Since $nc \in S$ and $S$ is a group, the number $r = m - nc$ is also in $S$. Since $0 < r < c$, this contradicts the definition of $c$. Thus $r = 0$, as claimed.

    Theorem 2.6. Let $S = S(a, b)$ and $c \in S$ be as defined above. Then $c = d := (a, b)$.

    Proof. Since $a, b \in S$, by Lemma 2.5 there are numbers $n, m$ such that

    $a = nc, \quad b = mc,$

    so that $c \mid a$ and $c \mid b$. Thus $c \leq d$. On the other hand, $d \mid a$ and $d \mid b$, and hence $d \mid xa + yb$ for any $x, y$, so $d$ divides any element of $S$, and in particular $c$, which means that $d \leq c$. Therefore $d = c$.

    Corollary 2.7. For any two integers $a, b$ the equation

    $ax + by = n$

    is solvable in integers $x, y$ iff $d \mid n$. In particular,

    $ax + by = d$

    is solvable.

    Using the above facts we prove the following theorem:

    Theorem 2.8 (Euclid's first theorem). If $p$ is a prime and $p \mid ab$, then either $p \mid a$ or $p \mid b$.

    Proof. Suppose that $p \nmid a$, so that $(a, p) = 1$. By Theorem 2.6, one can find numbers $x, y$ such that

    $xa + yp = 1.$

    Multiply by $b$:

    $xab + ypb = b.$

    The left-hand side is divisible by $p$, and hence $p \mid b$, as claimed.

    1.3. The Euclidean algorithm. Now we need to develop an algorithm which would allow us to find, for any integers $a, b$,

    their greatest common divisor $d$,
    numbers $x, y$ such that $xa + yb = d$.

    Examples: $(6, 8) = 2$, $(8, 25) = 1$, $(7, 63) = 7$.

    We use the division algorithm (2.1). Note that $m \mid a$ and $m \mid b$ iff $m \mid b$ and $m \mid r$. Therefore $(a, b) = (b, r)$. Thus we can apply the division algorithm repeatedly to find $(a, b)$.

    Example (The Euclidean Algorithm). Find $(24, 356)$. Write:

    $356 = 14 \cdot 24 + 20,$

    so that $(24, 356) = (24, 20)$. Continue:

    $24 = 1 \cdot 20 + 4,$

    so that $(24, 356) = (20, 4) = 4$. Working backwards we see that

    $4 = 24 - 1 \cdot 20 = 24 - 1 \cdot (356 - 14 \cdot 24) = 15 \cdot 24 - 1 \cdot 356.$

    Find $(53, 77)$. Write:

    $77 = 1 \cdot 53 + 24,$
    $53 = 2 \cdot 24 + 5,$
    $24 = 4 \cdot 5 + 4,$
    $5 = 1 \cdot 4 + 1,$

    and hence $(77, 53) = (4, 1) = 1$. Therefore the numbers 53 and 77 are relatively prime. Again, working backwards, we find that

    $1 = 5 - 1 \cdot 4 = 5 - 1 \cdot (24 - 4 \cdot 5)$
    $= 5 \cdot 5 - 1 \cdot 24 = 5 \cdot (53 - 2 \cdot 24) - 1 \cdot 24$
    $= 5 \cdot 53 - 11 \cdot 24 = 5 \cdot 53 - 11 \cdot (77 - 1 \cdot 53)$
    $= 16 \cdot 53 - 11 \cdot 77.$
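The two steps above (repeated division, then working backwards) can be merged into a single loop that carries the coefficients along. The sketch below is one standard way of doing this, assuming non-negative inputs; the name `extended_gcd` is ours.

```python
def extended_gcd(a: int, b: int) -> tuple:
    """Return (d, x, y) with d = (a, b) and x*a + y*b == d, for a, b >= 0."""
    # Invariants kept throughout: r0 == x0*a + y0*b and r1 == x1*a + y1*b.
    r0, x0, y0 = a, 1, 0
    r1, x1, y1 = b, 0, 1
    while r1 != 0:
        q = r0 // r1                 # one step of the division algorithm (2.1)
        r0, r1 = r1, r0 - q * r1
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    return r0, x0, y0


print(extended_gcd(24, 356))  # (4, 15, -1):  15*24 - 1*356 = 4
print(extended_gcd(53, 77))   # (1, 16, -11): 16*53 - 11*77 = 1
```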


    2. Congruences

    2.1. Congruency classes. Recall Definition 1.26. We have established earlier that congruence mod $n$ is an equivalence relation. Let us establish some elementary properties of congruent numbers.

    Lemma 2.9. If $a \equiv a' \pmod{n}$ and $b \equiv b' \pmod{n}$, then $a + b \equiv a' + b' \pmod{n}$ and $ab \equiv a'b' \pmod{n}$.

    Proof. By definition, $a = a' + kn$ and $b = b' + mn$ with some integers $k, m$. Thus

    $(a + b) - (a' + b') = n(k + m),$

    and

    $ab - a'b' = a(b - b') + b'(a - a') = amn + b'kn = n(am + b'k).$

    The required relations follow.

    To understand these matters better, we need a bit of terminology.

    Definition 2.10. If $a \equiv b \pmod{n}$, then we say that $b$ is a residue of $a$ modulo $n$. If $0 \leq b < n$ we say that $b$ is the least residue (or the least non-negative residue) of $a$ modulo $n$.

    For any $a \in \mathbb{Z}$ we call $[a]$ the class of residues congruent to $a$ modulo $n$.

    For example, 21 is a residue of 5 modulo 8, and at the same time 5 is the least (non-negative) residue of 21 modulo 8.

    As a rule, when talking about congruences, we use the least residues. So now we call the congruency classes the classes of residues. Let us show that every number $a \in \mathbb{Z}$ is congruent modulo $n$ to one of the numbers

    (2.2) $0, 1, 2, \dots, n - 1.$

    To see this we use the division algorithm, i.e. the fact that for any $a \in \mathbb{Z}$ there exists a unique pair $q, r$ such that

    $a = qn + r, \quad 0 \leq r < n.$

    This identity shows that $r \equiv a \pmod{n}$ and that $r$ is defined uniquely. Now, since the list (2.2) does not contain congruent pairs, we conclude that there are exactly $n$ distinct classes of residues modulo $n$.

    Definition 2.11. The set of $n$ numbers $a_1, a_2, \dots, a_n$ is said to be a complete system of residues modulo $n$ if every number from (2.2) is congruent to one of the numbers $a_l$, $l = 1, 2, \dots, n$.


    In particular, (2.2) is a complete system of residues modulo $n$. Another example:

    $1, n + 2, 3, 4, \dots, n - 1, n,$

    where $n + 2 \equiv 2$ and $n \equiv 0$ modulo $n$. In general, any collection of $n$ pairwise incongruent numbers is a complete system of residues.

    Let us introduce a notation for the set of all classes of residues modulo $n$:

    $\mathbb{Z}_n = \mathbb{Z}/n = \{[0], [1], \dots, [n - 1]\}.$

    There are two operations on $\mathbb{Z}_n$, addition and multiplication, defined as follows:

    $[k] + [m] = [k + m], \quad [k][m] = [km],$

    for any two numbers $k, m$. In words, in order to find the sum of two classes, you should take one representative from each class, add them up and take the equivalence class of the obtained number. By Lemma 2.9 the result does not depend on the choice of representatives.

    For example, let $[k]$ be the class of residues of the number $k$ modulo 5. If we want to find $[3] + [4]$ we add 3 and 4, which gives 7. Thus the answer is $[7]$. If we want to use the least non-negative residues, we can write $[2]$ instead of $[7]$. Thus $[3] + [4] = [2]$. In other words, in the language of residues,

    $3 + 4 \equiv 2 \pmod{5}.$

    Similarly, $[3] \cdot [4] = [12] = [2]$, i.e. $3 \cdot 4 \equiv 2 \pmod{5}$.

    2.2. Group properties. The pair $(\mathbb{Z}_n, +)$ is a group: the identity element is $[0]$, the addition is associative, and every element $[a]$ has an inverse: $-[a] = [n - a]$. On the contrary, the pair $(\mathbb{Z}_n, \cdot)$ is not a group, since at least one element does not have a multiplicative inverse: $[0]$. Moreover, there may be other elements without inverses, e.g. in $\mathbb{Z}_4$ the element $[2]$ does not have an inverse:

    $[2] \cdot [0] = [0], \quad [2] \cdot [1] = [2], \quad [2] \cdot [2] = [0], \quad [2] \cdot [3] = [2].$

    So define

    $\mathbb{Z}_n^* = (\mathbb{Z}_n, \cdot)^* = (\mathbb{Z}/n)^*$

    as the set of residue classes modulo $n$ which have multiplicative inverses. For instance,

    $\mathbb{Z}_4^* = \{[1], [3]\}, \quad [1]^{-1} = [1], \quad [3]^{-1} = [3].$

    Theorem 2.12. The set $(\mathbb{Z}_n^*, \cdot)$ is a group.

    Proof. Let us show that multiplication is a binary operation on this set. Let $[a], [b] \in G = (\mathbb{Z}_n^*, \cdot)$, that is, $[a]$ and $[b]$ have multiplicative inverses, which we call $[a^{-1}]$ and $[b^{-1}]$, so that $aa^{-1} \equiv 1 \pmod{n}$ and $bb^{-1} \equiv 1 \pmod{n}$. Thus $(ab)(a^{-1}b^{-1}) \equiv 1 \pmod{n}$, so $[ab] \in G$ as well.

    Clearly, the multiplication is associative, and $[1]$ is the identity element. By definition of $G$ every element has an inverse. Thus $G$ is a group.

    Let us find out when a number is invertible modulo $n$.

    Proposition 2.13. A number $a$ is invertible modulo $n$ iff $a$ and $n$ are relatively prime.

    Proof. Let $d = (a, n)$. The number $a$ is invertible modulo $n$ iff $ak \equiv 1 \pmod{n}$ for some $k$, that is, iff $ak + nb = 1$ with some $k$ and $b$. By Corollary 2.7 this equation is solvable in $k$ and $b$ iff $d \mid 1$, that is, $d = 1$.

    Definition 2.14. The Euler totient function $\varphi : \mathbb{N} \to \mathbb{N}$ is defined by $\varphi(n) = \#\mathbb{Z}_n^*$, i.e. $\varphi(n)$ is the number of positive integers less than $n$ which are coprime to $n$.

    Example. $\varphi(2) = 1$, $\varphi(3) = 2$, $\varphi(4) = 2$, $\varphi(5) = 4$, $\varphi(6) = 2$.

    The easiest case is when $p$ is a prime, so that every positive number less than $p$ is relatively prime to $p$, and hence

    $\mathbb{Z}_p^* = \{[1], [2], \dots, [p - 1]\}, \quad \varphi(p) = p - 1.$

    Theorem 2.15 (Euler's generalisation of Fermat's Little Theorem). Let $a$ be coprime to $n$. Then $a^{\varphi(n)} \equiv 1 \pmod{n}$. In particular, if $p$ is a prime number and $p \nmid a$, then $a^{p-1} \equiv 1 \pmod{p}$ (Fermat's Little Theorem).

    Proof. Since $\mathbb{Z}_n^*$ is a group, and $\varphi(n)$ is its order, by Corollary 1.32 we have $[a]^{\varphi(n)} = [1]$, i.e. $a^{\varphi(n)} \equiv 1 \pmod{n}$.
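For small $n$ the statements above can be checked by brute force. The sketch below lists the least residues coprime to $n$ (representatives of the invertible classes, by Proposition 2.13), counts them to get the totient as in Definition 2.14, and tests Theorem 2.15 for one modulus. The names `units` and `phi_by_counting` are ours, and we assume $n \geq 2$.

```python
from math import gcd


def units(n: int) -> list:
    """Least residues 1 <= a < n coprime to n: representatives of the invertible classes."""
    return [a for a in range(1, n) if gcd(a, n) == 1]


def phi_by_counting(n: int) -> int:
    """Euler totient function computed directly from Definition 2.14 (n >= 2)."""
    return len(units(n))


print(units(4))                                    # [1, 3]
print([phi_by_counting(n) for n in range(2, 7)])   # [1, 2, 2, 4, 2]

# Euler's theorem for n = 20: every unit raised to the power phi(20) = 8 gives 1.
n = 20
print(all(pow(a, phi_by_counting(n), n) == 1 for a in units(n)))  # True
```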

    2.3. Linear congruences. Using the above Proposition we can solve linear congruences, i.e. equations of the type

    $mx \equiv k \pmod{n}.$

    To this end we need to find the inverse of $m$ modulo $n$. This is only possible if $(n, m) = 1$. The inverse is found from the formula $1 = mp - nq$ with some $p, q$ (then $p$ is the inverse), so we need to express $(n, m)$ as a linear combination of $n$ and $m$.

    Example 2.16. The congruence $5x \equiv 17 \pmod{87}$ is resolved as follows:

    $87 = 5 \cdot 17 + 2, \quad 5 = 2 \cdot 2 + 1,$

    so

    $1 = 5 - 2 \cdot 2 = 5 - 2(87 - 5 \cdot 17) = 35 \cdot 5 - 2 \cdot 87 \equiv 35 \cdot 5 \pmod{87}.$

    This shows that 35 is the inverse of 5 mod 87, and hence

    $x \equiv 35 \cdot 17 \equiv 595 \equiv 73 \pmod{87}.$
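As a quick check of this recipe, here is a small sketch that solves $mx \equiv k \pmod{n}$ when $(m, n) = 1$. It relies on Python's three-argument `pow` with exponent `-1` (available from Python 3.8), which computes the modular inverse and raises `ValueError` when no inverse exists. The function name is ours.

```python
def solve_linear_congruence(m: int, k: int, n: int) -> int:
    """Least non-negative solution x of m*x ≡ k (mod n), assuming (m, n) = 1."""
    m_inv = pow(m, -1, n)   # inverse of m modulo n; ValueError if (m, n) != 1
    return (m_inv * k) % n


print(pow(5, -1, 87))                      # 35
print(solve_linear_congruence(5, 17, 87))  # 73
```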

    2.4. Calculating the Euler function. If $p$ is prime, then we know that $\varphi(p) = p - 1$. More generally, if $n = p^a$ with some prime $p$, then a natural number $r$ is not coprime to $n$ iff $p \mid r$. Therefore

    $\mathbb{Z}_{p^a}^* = \mathbb{Z}_{p^a} \setminus \{[0], [p], [2p], \dots, [p^a - p]\}.$

    This means that

    $\varphi(p^a) = p^a - p^{a-1} = (p - 1)p^{a-1}.$

    To calculate $\varphi(n)$ when $n$ is not a prime power, we need the following notion:

    Definition 2.17. A function $\psi : \mathbb{N} \to \mathbb{N}$ is called multiplicative if for any two coprime numbers $m, n$ we have $\psi(mn) = \psi(m)\psi(n)$.

    We intend to show that the Euler function is multiplicative. For this we need the following intermediate result.

    Lemma 2.18. Suppose that $(m, n) = 1$. Let $a_1, a_2, \dots, a_n$ be a complete system of residues modulo $n$, and let $b_1, b_2, \dots, b_m$ be a complete system of residues modulo $m$. Then the set

    (2.3) $a_k m + b_r n, \quad k = 1, 2, \dots, n; \ r = 1, 2, \dots, m,$

    forms a complete system of residues modulo $nm$.

    Proof. There are exactly $mn$ numbers of the form (2.3). It remains to check that this set of numbers does not contain pairs congruent modulo $mn$. Assume that

    $am + bn \equiv a'm + b'n \pmod{nm}.$

    Reducing this modulo $n$ and modulo $m$ we get

    $am \equiv a'm \pmod{n}, \quad bn \equiv b'n \pmod{m}.$

    Since $(m, n) = 1$, by Proposition 2.13 the number $m$ is invertible modulo $n$ and $n$ is invertible modulo $m$, so that

    $a \equiv a' \pmod{n}, \quad b \equiv b' \pmod{m}.$

    As $a, a'$ belong to a complete system of residues modulo $n$, this forces $a = a'$, and similarly $b = b'$. Therefore the numbers (2.3) are all incongruent, as required.

    Theorem 2.19. The Euler function is multiplicative.

    Proof. Let $(m, n) = 1$. We have already proved that the numbers (2.3) form a complete system of residues modulo $mn$. Let us count among them the numbers coprime to $nm$. This means that we count the numbers $am + bn$ such that

    $(am + bn, nm) = 1,$

    i.e., by Corollary 2.7,

    $(am + bn)p + nmq = 1,$

    with some $p, q$. Rewriting it as

    $a(mp) + n(bp + mq) = 1,$

    we conclude that $(a, n) = 1$. Similarly, $(b, m) = 1$. Conversely, if $(a, n) = 1$ and $(b, m) = 1$, then $am + bn$ is coprime to $n$ (because $am$ is) and to $m$ (because $bn$ is), and hence to $nm$. There are exactly $\varphi(n)\varphi(m)$ pairs $a$ and $b$ satisfying these relations, so that $\varphi(nm) = \varphi(n)\varphi(m)$, as required.

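    Now we can calculate $\varphi(n)$ by factorising $n$ as a product of primes:

    $n = p_1^{a_1} p_2^{a_2} \cdots p_s^{a_s}.$

    The factors are pairwise coprime, so

    $\varphi(n) = \varphi(p_1^{a_1}) \varphi(p_2^{a_2}) \cdots \varphi(p_s^{a_s}) = (p_1 - 1)p_1^{a_1 - 1} (p_2 - 1)p_2^{a_2 - 1} \cdots (p_s - 1)p_s^{a_s - 1}.$

    Example.
    (1) $\varphi(15) = \varphi(3 \cdot 5) = 2 \cdot 4 = 8$.
    (2) $\varphi(16) = \varphi(2^4) = 1 \cdot 2^3 = 8$.
    (3) $\varphi(17) = 16$.
    (4) $\varphi(18) = \varphi(3^2 \cdot 2) = 2 \cdot 3 \cdot 1 = 6$.
    (5) $\varphi(20) = \varphi(2^2 \cdot 5) = 1 \cdot 2 \cdot 4 = 8$.
    (6) $\varphi(200) = \varphi(2^3 \cdot 5^2) = 1 \cdot 2^2 \cdot 4 \cdot 5 = 80$.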

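    This recipe allows one to calculate large powers modulo $n$, relying on Theorem 2.15. More precisely, we are able to find $a^k \bmod n$ assuming that $(a, n) = 1$. For example, to find $7^{1234} \bmod 15$, find $\varphi(15) = 8$. Now, $1234 \equiv 2 \pmod{8}$, and hence, by Theorem 2.15,

    $7^{1234} \equiv 7^2 \equiv 49 \equiv 4 \pmod{15}.$

    Example. (1) Find $4^{42} \bmod 25$. First note: $(4, 25) = 1$ and $\varphi(25) = 20$, so $4^{42} \equiv 4^2 \equiv 16 \pmod{25}$.

    (2) Find $7^{64} \bmod 120$. Note: $(7, 120) = 1$ and $\varphi(120) = \varphi(2^3 \cdot 3 \cdot 5) = 1 \cdot 2^2 \cdot 2 \cdot 4 = 32$, so $7^{64} \equiv 7^0 \equiv 1 \pmod{120}$.

    Find $7^{62} \bmod 120$. Write $7^{62} \equiv 7^{64} \cdot 7^{-2} \equiv 7^{-2} \equiv 49^{-1} \pmod{120}$. In order to find $49^{-1} \bmod 120$ we run the Euclidean algorithm:

    $120 = 2 \cdot 49 + 22,$
    $49 = 2 \cdot 22 + 5,$
    $22 = 4 \cdot 5 + 2,$
    $5 = 2 \cdot 2 + 1.$

    Thus

    $1 = 5 - 2 \cdot 2 = 5 - 2 \cdot (22 - 4 \cdot 5) = 9 \cdot 5 - 2 \cdot 22$
    $= 9 \cdot (49 - 2 \cdot 22) - 2 \cdot 22 = 9 \cdot 49 - 20 \cdot 22$
    $= 9 \cdot 49 - 20 \cdot (120 - 2 \cdot 49) = 49 \cdot 49 - 20 \cdot 120 \equiv 49 \cdot 49 \pmod{120}.$

    Thus $49^{-1} \equiv 49 \pmod{120}$, and therefore $7^{62} \equiv 49 \pmod{120}$.

    (3) Find $16^{287} \bmod 765$. Euler's function:

    $\varphi(765) = \varphi(5 \cdot 3^2 \cdot 17) = 4 \cdot 2 \cdot 3 \cdot 16 = 384.$

    Note also that $(16, 765) = 1$ and

    $16^{287} = 2^{4 \cdot 287} = 2^{1148} = 2^{3 \cdot 384 - 4} \equiv 2^{-4} \pmod{765}.$

    Now we need to find the inverse of $2^4 = 16$ modulo 765. Write

    $765 = 47 \cdot 16 + 13, \quad 16 = 1 \cdot 13 + 3, \quad 13 = 4 \cdot 3 + 1.$

    Thus

    $1 = 13 - 4 \cdot 3 = 13 - 4 \cdot (16 - 13) = 5 \cdot 13 - 4 \cdot 16$
    $= 5 \cdot (765 - 47 \cdot 16) - 4 \cdot 16 = 5 \cdot 765 - 239 \cdot 16 \equiv -239 \cdot 16 \pmod{765},$

    so that $16^{-1} \equiv -239 \equiv 526 \pmod{765}$. Hence $16^{287} \equiv 526 \pmod{765}$.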
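The calculations of this subsection can be reproduced with a short sketch: `euler_phi` applies the product formula for $\varphi(n)$ via trial division, and `power_mod` reduces the exponent modulo $\varphi(n)$ as Theorem 2.15 permits when $(a, n) = 1$. Both names are ours. In practice Python's built-in `pow(a, k, n)` does the job directly; it is used here only as a cross-check.

```python
def euler_phi(n: int) -> int:
    """Euler totient via n = p1^a1 ... ps^as and phi(p^a) = (p - 1) * p^(a - 1)."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            a = 0
            while n % p == 0:      # strip the full power of p from n
                n //= p
                a += 1
            result *= (p - 1) * p ** (a - 1)
        p += 1
    if n > 1:                      # one prime factor larger than sqrt(n) remains
        result *= n - 1
    return result


def power_mod(a: int, k: int, n: int) -> int:
    """a^k mod n for (a, n) = 1 and k >= 0, reducing the exponent modulo phi(n)."""
    return pow(a, k % euler_phi(n), n)


print([euler_phi(n) for n in (15, 16, 17, 18, 20, 200)])  # [8, 8, 16, 6, 8, 80]
print(power_mod(7, 1234, 15))    # 4
print(power_mod(7, 62, 120))     # 49
print(power_mod(16, 287, 765))   # 526
```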

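    2.5. Higher congruences. Let us find out how to find solutions to congruences of the type

    $x^a \equiv b \pmod{n}.$

    Theorem 2.20. Suppose that $(n, b) = 1$ and $(a, \varphi(n)) = 1$. Then the above congruence has a unique solution $x \equiv b^{a^{-1}} \pmod{n}$, where $a^{-1}$ is the inverse of $a$ modulo $\varphi(n)$ (!).

    Proof. Write $aa^{-1} = 1 + k\varphi(n)$ with some integer $k$. Let us show first that $x \equiv b^{a^{-1}} \pmod{n}$ is a solution:

    $(b^{a^{-1}})^a \equiv b^{aa^{-1}} \equiv b^{1 + k\varphi(n)} \equiv b \cdot b^{k\varphi(n)} \equiv b \pmod{n}.$

    Conversely, since $(b, n) = 1$, any solution $x$ will be coprime to $n$ as well. Thus we can calculate

    $x^{aa^{-1}} = x^{1 + k\varphi(n)} = x \cdot x^{k\varphi(n)} \equiv x \pmod{n}.$

    On the other hand,

    $x^{aa^{-1}} \equiv (x^a)^{a^{-1}} \equiv b^{a^{-1}} \pmod{n}.$

    Thus any solution satisfies $x \equiv b^{a^{-1}} \pmod{n}$.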
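Theorem 2.20 translates into a two-line computation: invert $a$ modulo $\varphi(n)$, then raise $b$ to that power modulo $n$. The sketch below is an illustration under the hypotheses of the theorem; the function names are ours, the totient is computed by brute-force counting (fine for small $n$), and the modular inverse uses `pow(a, -1, f)`, which needs Python 3.8 or later.

```python
from math import gcd


def phi(n: int) -> int:
    """Euler totient by direct counting; adequate for small n."""
    return sum(1 for r in range(1, n + 1) if gcd(r, n) == 1)


def solve_power_congruence(a: int, b: int, n: int) -> int:
    """Least non-negative solution of x^a ≡ b (mod n), assuming (b, n) = 1 and (a, phi(n)) = 1."""
    f = phi(n)
    if gcd(b, n) != 1 or gcd(a, f) != 1:
        raise ValueError("hypotheses of Theorem 2.20 are not satisfied")
    a_inv = pow(a, -1, f)      # inverse of a modulo phi(n), not modulo n
    return pow(b, a_inv, n)


print(solve_power_congruence(3, 3, 20))     # 7,   since 7^3 = 343 ≡ 3 (mod 20)
print(solve_power_congruence(53, 3, 200))   # 163
```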

    Example. (1) Solve $x^3 \equiv 3 \pmod{20}$. Write: $20 = 2^2 \cdot 5$, so $\varphi(20) = 2 \cdot 4 = 8$. The number 3 is coprime to 8 and to 20. Thus the above theorem is applicable. Let us first find $3^{-1} \bmod 8$:

    $8 = 2 \cdot 3 + 2, \quad 3 = 2 + 1,$

    so that

    $1 = 3 - 2 = 3 - (8 - 2 \cdot 3) = 3 \cdot 3 - 8 \equiv 3 \cdot 3 \pmod{8},$

    and hence $3^{-1} \equiv 3 \pmod{8}$. Thus

    $x \equiv 3^3 \equiv 27 \equiv 7 \pmod{20}.$

    (2) Solve $x^{53} \equiv 3 \pmod{200}$. To find $\varphi(200)$ write $200 = 5^2 \cdot 2^3$, so that

    $\varphi(200) = 4 \cdot 5 \cdot 1 \cdot 2^2 = 80.$

    Clearly, $(53, 80) = 1$, so that 53 is invertible modulo 80. Let us find the inverse:

    $80 = 1 \cdot 53 + 27, \quad 53 = 1 \cdot 27 + 26, \quad 27 = 26 + 1.$

    From here:

    $1 = 27 - 26 = 27 - (53 - 27) = 2 \cdot 27 - 53 = 2 \cdot (80 - 53) - 53 = 2 \cdot 80 - 3 \cdot 53 \equiv -3 \cdot 53 \pmod{80},$

    so $53^{-1} \equiv -3 \pmod{80}$. By the above theorem

    $x \equiv 3^{-3} \equiv 27^{-1} \pmod{200}.$

    Again, we need to find the inverse:

    $200 = 7 \cdot 27 + 11, \quad 27 = 2 \cdot 11 + 5, \quad 11 = 2 \cdot 5 + 1.$

    Thus

    $1 = 11 - 2 \cdot 5 = 11 - 2 \cdot (27 - 2 \cdot 11) = 5 \cdot 11 - 2 \cdot 27$
    $= 5 \cdot (200 - 7 \cdot 27) - 2 \cdot 27 = 5 \cdot 200 - 37 \cdot 27 \equiv -37 \cdot 27 \pmod{200}.$

    Consequently, $x \equiv -37 \equiv 163 \pmod{200}$.

    2.6. Public key cryptography. Here is a way of encoding and decoding messages. Choose two primes $p, q$. Choose a number $a$ coprime to $\varphi(pq)$ and compute the inverse $b$ of $a$ modulo $\varphi(pq)$. Then the pair $\{pq, a\}$ is the public key and $\{pq, b\}$ is the private key. The messages are encoded using the public key in the following way. The message is a number $M$ modulo $pq$. The encoded message is $N \equiv M^a \pmod{pq}$, which is easily found from the public key using the algorithms developed in the lectures. The message is recovered by calculating $M \equiv N^b \pmod{pq}$, if one knows $b$. However, finding $b$ is a serious problem if only the public key is given, since one needs to determine $\varphi(pq)$, which is done by finding $p$ and $q$ first. As no efficient method is known for finding the factors $p, q$ of a large product $pq$, recovering $b$ from the public key alone is practically infeasible.

    Example. Let $p = 3$, $q = 11$. Then $\varphi(pq) = (p - 1)(q - 1) = 20$. Take $\{33, 7\}$ as the public key. The inverse of 7 modulo 20 is 3, so the private key is $\{33, 3\}$. Suppose that we are sending the message which is the number 2. Then the coded message is $2^7 = 128 \equiv -4 \pmod{33}$. To decode the message calculate $(-4)^3 = -64 \equiv 2 \pmod{33}$.
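The toy example can be replayed in a few lines. This is only an illustration of the scheme described above with the same tiny numbers; the function names are ours, and nothing here is suitable for real cryptographic use. Note that Python reports the least non-negative residue, so the coded message appears as 29, which is $-4$ modulo 33.

```python
def make_keys(p: int, q: int, a: int) -> tuple:
    """Return (public, private) key pairs {pq, a} and {pq, b}, with b = a^{-1} mod phi(pq)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    b = pow(a, -1, phi)        # requires (a, phi) = 1; Python 3.8+
    return (n, a), (n, b)


def encode(message: int, public_key: tuple) -> int:
    n, a = public_key
    return pow(message, a, n)


def decode(coded: int, private_key: tuple) -> int:
    n, b = private_key
    return pow(coded, b, n)


public, private = make_keys(3, 11, 7)
coded = encode(2, public)
print(public, private)          # (33, 7) (33, 3)
print(coded)                    # 29, i.e. -4 mod 33
print(decode(coded, private))   # 2
```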

  • CHAPTER 3

    Linear Algebra

    1. Basic definitions and concepts

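    1.1. Addition and multiplication. A real matrix $A$ of size $n \times m$ is an $n \times m$ array of real numbers:

    $A = \begin{pmatrix} a_{11} & a_{12} & \dots & a_{1m} \\ a_{21} & a_{22} & \dots & a_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \dots & a_{nm} \end{pmatrix}.$

    Examples:

    $A = \begin{bmatrix} 1 & 7 & 4 \\ 2 & 11 & 3 \end{bmatrix}, \quad B = \begin{bmatrix} 0 & 6 \\ 2 & 3 \end{bmatrix}.$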

    We denote the set of all matrices of size $n \times m$ by $M(n, m)$. Let us define addition and multiplication on the set of matrices. Addition is defined for matrices of the same size:

    Definition 3.1 (Addition). Let $A, B$ be two $n \times m$ matrices with entries $a_{jk}$ and $b_{jk}$ respectively. Then the matrix $C = A + B$ is defined as having the entries $c_{jk} = a_{jk} + b_{jk}$:

    $\begin{pmatrix} a_{11} & \dots & a_{1m} \\ \vdots & \ddots & \vdots \\ a_{n1} & \dots & a_{nm} \end{pmatrix} + \begin{pmatrix} b_{11} & \dots & b_{1m} \\ \vdots & \ddots & \vdots \\ b_{n1} & \dots & b_{nm} \end{pmatrix} = \begin{pmatrix} a_{11} + b_{11} & \dots & a_{1m} + b_{1m} \\ \vdots & \ddots & \vdots \\ a_{n1} + b_{n1} & \dots & a_{nm} + b_{nm} \end{pmatrix}.$

    Example:

    $\begin{bmatrix} 1 & 7 & 4 \\ 2 & 11 & 10 \end{bmatrix} + \begin{bmatrix} 0 & -2 & 5 \\ 9 & 3 & -3 \end{bmatrix} = \begin{bmatrix} 1 & 5 & 9 \\ 11 & 14 & 7 \end{bmatrix}.$

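    Definition 3.2 (Multiplication). Let $A$ be an $n \times m$ matrix and let $B$ be an $m \times k$ matrix. Then the product $AB$ is defined as the $n \times k$ matrix with entries

    $c_{jl} = \sum_{s=1}^{m} a_{js} b_{sl}.$

    Note that the number of columns of the matrix $A$ coincides with the number of rows of the matrix $B$.

    Examples:

    $\begin{bmatrix} 1 & 3 & 0 \\ 2 & 0 & 6 \end{bmatrix} \begin{bmatrix} 0 & 5 \\ 1 & 2 \\ 2 & 0 \end{bmatrix} = \begin{bmatrix} 3 & 11 \\ 12 & 10 \end{bmatrix},$

    $\begin{bmatrix} 1 & 3 \\ -2 & 4 \end{bmatrix} \begin{bmatrix} 3 & -2 \\ 5 & 6 \end{bmatrix} = \begin{bmatrix} 18 & 16 \\ 14 & 28 \end{bmatrix},$

    $\begin{bmatrix} 2 & 0 & -3 \\ 4 & 1 & 5 \end{bmatrix} \begin{bmatrix} 7 & -1 & 4 & 7 \\ 2 & 5 & 0 & -4 \\ -3 & 1 & 2 & 3 \end{bmatrix} = \begin{bmatrix} 23 & -5 & 2 & 5 \\ 15 & 6 & 26 & 39 \end{bmatrix}.$

    If we multiply two $n \times n$ matrices, we get another $n \times n$ matrix. If the matrices $A$ and $B$ have the right sizes to form the product $AB$, it is not always possible to define the product $BA$. Moreover, even when it is possible, it is not always true that $AB = BA$. Counterexample:

    $\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix} = \begin{bmatrix} 2 & 1 \\ 1 & 1 \end{bmatrix}, \quad \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}.$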
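The defining sum $c_{jl} = \sum_s a_{js} b_{sl}$ can be written out directly. The sketch below represents a matrix as a list of rows; the name `mat_mul` is ours, and the signs in the sample matrices are as we read them from the examples above.

```python
def mat_mul(a: list, b: list) -> list:
    """Product of an n x m matrix a and an m x k matrix b, both given as lists of rows."""
    m = len(b)
    if any(len(row) != m for row in a):
        raise ValueError("number of columns of a must equal number of rows of b")
    k = len(b[0])
    # Entry (j, l) is the sum over s of a[j][s] * b[s][l].
    return [[sum(row[s] * b[s][l] for s in range(m)) for l in range(k)] for row in a]


print(mat_mul([[1, 3, 0], [2, 0, 6]], [[0, 5], [1, 2], [2, 0]]))  # [[3, 11], [12, 10]]

# Multiplication is not commutative:
u = [[1, 1], [0, 1]]
v = [[1, 0], [1, 1]]
print(mat_mul(u, v))  # [[2, 1], [1, 1]]
print(mat_mul(v, u))  # [[1, 1], [1, 2]]
```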

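    Define the $n \times n$ identity matrix:

    $I_n = \begin{pmatrix} 1 & 0 & \dots & 0 \\ 0 & 1 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & 1 \end{pmatrix},$

    or, in other words, $a_{jl} = \delta_{jl}$, where $\delta_{jl}$ is the so-called Kronecker symbol:

    $\delta_{jl} = \begin{cases} 1, & j = l, \\ 0, & j \neq l. \end{cases}$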

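    Define also multiplication by a real number (scalar):

    $\lambda A = \begin{pmatrix} \lambda a_{11} & \lambda a_{12} & \dots & \lambda a_{1n} \\ \lambda a_{21} & \lambda a_{22} & \dots & \lambda a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \lambda a_{n1} & \lambda a_{n2} & \dots & \lambda a_{nn} \end{pmatrix}, \quad \lambda \in \mathbb{R}.$

    In other words,

    $\lambda A = \begin{pmatrix} \lambda & 0 & \dots & 0 \\ 0 & \lambda & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & \lambda \end{pmatrix} \begin{pmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \dots & a_{nn} \end{pmatrix}, \quad \lambda \in \mathbb{R}.$

    This follows from the following lemma:

    Lemma 3.3. Let $A \in M(n, n)$. Then $AI_n = I_nA = A$.

    Proof. Let $B = AI_n$. By definition of the identity matrix, we have

    $b_{jl} = \sum_{s=1}^{n} a_{js} \delta_{sl}.$

    By definition of the Kronecker symbol, only one term in this sum is distinct from zero, namely the one with $s = l$, where $\delta_{ll} = 1$, so that $b_{jl} = a_{jl}$. Similarly for $I_nA$.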

    Let us establish some useful properties of addition and multiplication:

    Proposition 3.4. Let $A \in M(n, m)$, $B, C \in M(m, p)$, $D \in M(p, q)$. Then

    (1) $(AB)D = A(BD)$, i.e. multiplication is associative,
    (2) $A(B + C) = AB + AC$, $(B + C)D = BD + CD$, i.e. multiplication is distributive.

    Proof. (1) By definition, the entry $t_{jl}$ of the matrix $T = (AB)D$ is

    $t_{jl} = \sum_{k=1}^{p} \Big( \sum_{s=1}^{m} a_{js} b_{sk} \Big) d_{kl} = \sum_{s=1}^{m} a_{js} \Big( \sum_{k=1}^{p} b_{sk} d_{kl} \Big),$

    which coincides with the corresponding entry of the matrix $A(BD)$.

    (2) Let $S = A(B + C)$ and $E = AB + AC$, so that

    $s_{jl} = \sum_{s=1}^{m} a_{js}(b_{sl} + c_{sl}) = \sum_{s=1}^{m} a_{js} b_{sl} + \sum_{s=1}^{m} a_{js} c_{sl} = e_{jl}.$

    The second identity is proved in the same way.

    1.2. Structure of the set of matrices. Now we can investigate the structure of the set $M(n, n)$. Is it a group for the addition (or multiplication) operation? The pair $(M(n, n), +)$ is certainly a group. Indeed, define the zero matrix $O$ by $o_{jl} = 0$. Then it will be the identity element for the group. Every element $A \in M(n, n)$ has an inverse: the matrix $-A$ with entries $-a_{jl}$. Notice that $OA = O$ for any $A$.

    Consider now the set of matrices $M = M(n, n)$ with multiplication. For a matrix $A$ define the multiplicative inverse as a matrix $B \in M$ such that $AB = BA = I_n$. Notation: $B = A^{-1}$. This definition shows that $B$ is also invertible and $B^{-1} = A$.

    A natural question is whether all matrices $A \in M$ have inverses (or are invertible). In other words, is the pair $(M, \cdot)$ a group? It is immediately clear that it is not, since some matrices do not have inverses, for instance the zero matrix. Examples of invertible matrices: $I_n$, and the diagonal matrices

    $A = \begin{pmatrix} a_{11} & 0 & \dots & 0 \\ 0 & a_{22} & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & a_{nn} \end{pmatrix}$

    with non-zero diagonal entries. Indeed, the matrix

    $B = \begin{pmatrix} a_{11}^{-1} & 0 & \dots & 0 \\ 0 & a_{22}^{-1} & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & a_{nn}^{-1} \end{pmatrix}$

    is the inverse of $A$.

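    Example. (1)

    $A = \begin{pmatrix} 5 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 7 \end{pmatrix}, \quad A^{-1} = \begin{pmatrix} 1/5 & 0 & 0 \\ 0 & 1/3 & 0 \\ 0 & 0 & 1/7 \end{pmatrix}.$

    (2)

    $A = \begin{bmatrix} 2 & 4 \\ -1 & 3 \end{bmatrix}, \quad A^{-1} = \begin{bmatrix} 3/10 & -4/10 \\ 1/10 & 2/10 \end{bmatrix}.$

    (3) An arbitrary matrix $T \in M(2, 2)$ is invertible if $t_{11}t_{22} - t_{12}t_{21} \neq 0$, and then

    $T^{-1} = \dfrac{1}{t_{11}t_{22} - t_{12}t_{21}} \begin{bmatrix} t_{22} & -t_{12} \\ -t_{21} & t_{11} \end{bmatrix}.$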

    Let $G = GL(n, \mathbb{R}) \subset M(n, n)$ be the subset of all invertible matrices. Claim: $G$ is a group. Let us check first that multiplication defines a binary operation on $G$, that is, for any two matrices $A, B \in G$ we also have $AB \in G$:

    Proposition 3.5. If $A, B \in GL(n, \mathbb{R})$, then $AB \in GL(n, \mathbb{R})$ and $(AB)^{-1} = B^{-1}A^{-1}$.

    Proof. Denote $T = AB$ and $S = B^{-1}A^{-1}$. We want to show that $S = T^{-1}$. According to the definition, for this we need to check that $ST = TS = I_n$. Indeed:

    $ST = B^{-1}A^{-1}AB = B^{-1}I_nB = B^{-1}B = I_n,$

    $TS = ABB^{-1}A^{-1} = AI_nA^{-1} = AA^{-1} = I_n.$

    We also know the following:

    (1) Multiplication is associative;
    (2) The element $I_n$ is the identity;
    (3) Every element has an inverse.

    Thus $GL(n, \mathbb{R})$ is a group. This group is called the general linear group of $n \times n$ matrices.

    2. Matrices and linear maps

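    2.1. Vectors, Linear Maps. Denote by $\mathbb{R}^n$ the product $\underbrace{\mathbb{R} \times \mathbb{R} \times \cdots \times \mathbb{R}}_{n}$.

    Instead of writing the elements of $\mathbb{R}^n$ as $(x_1, x_2, \dots, x_n)$ we write them as columns

    $\mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}.$

    The elements of $\mathbb{R}^n$ are called vectors. One can view them as elements of $M(n, 1)$, that is, matrices with one column and $n$ rows. Therefore they have natural operations defined: addition and multiplication by $m \times n$ matrices:

    $\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} + \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} = \begin{pmatrix} x_1 + y_1 \\ x_2 + y_2 \\ \vdots \\ x_n + y_n \end{pmatrix},$

    $\begin{pmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} a_{11}x_1 + a_{12}x_2 + \dots + a_{1n}x_n \\ a_{21}x_1 + a_{22}x_2 + \dots + a_{2n}x_n \\ \vdots \\ a_{m1}x_1 + a_{m2}x_2 + \dots + a_{mn}x_n \end{pmatrix}.$

    In particular, we can multiply vectors by scalars:

    $\lambda \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} \lambda x_1 \\ \lambda x_2 \\ \vdots \\ \lambda x_n \end{pmatrix}, \quad \lambda \in \mathbb{R}.$

    Define the vectors

    $\mathbf{e}_1 = \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \quad \mathbf{e}_2 = \begin{pmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{pmatrix}, \quad \dots, \quad \mathbf{e}_n = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{pmatrix}.$

    Every vector $\mathbf{x}$ can be written as a linear combination

    $\mathbf{x} = x_1\mathbf{e}_1 + x_2\mathbf{e}_2 + \dots + x_n\mathbf{e}_n.$

    Among the functions (maps) defined on the set $\mathbb{R}^n$ we single out the set of linear maps:

    Definition 3.6. A function $T : \mathbb{R}^n \to \mathbb{R}^m$ is called a linear map if it satisfies the properties

    $T(\mathbf{x} + \mathbf{y}) = T(\mathbf{x}) + T(\mathbf{y}),$

    $T(\lambda\mathbf{x}) = \lambda T(\mathbf{x}),$

    for all $\mathbf{x}, \mathbf{y} \in \mathbb{R}^n$ and $\lambda \in \mathbb{R}$.

    A remarkable fact is that every linear map can be represented by a matrix:

    Theorem 3.7. Let $T : \mathbb{R}^n \to \mathbb{R}^m$ be a linear map. Then $T(\mathbf{x}) = A\mathbf{x}$ for all $\mathbf{x} \in \mathbb{R}^n$, where $A$ is the matrix from $M(m, n)$ whose columns are the vectors $T(\mathbf{e}_1), T(\mathbf{e}_2), \dots, T(\mathbf{e}_n)$:

    $A = (T(\mathbf{e}_1), T(\mathbf{e}_2), \dots, T(\mathbf{e}_n)).$

    Proof. Write

    $T(\mathbf{x}) = T\Big(\sum_{j=1}^{n} x_j\mathbf{e}_j\Big) = \sum_{j=1}^{n} x_j T(\mathbf{e}_j) = (T(\mathbf{e}_1), T(\mathbf{e}_2), \dots, T(\mathbf{e}_n)) \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}.$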

    3. Linear systems

    The concept of matrices is useful when studying the following system of linear equations:

    $a_{11}x_1 + \dots + a_{1m}x_m = b_1,$
    $a_{21}x_1 + \dots + a_{2m}x_m = b_2,$
    $\dots$
    $a_{n1}x_1 + \dots + a_{nm}x_m = b_n,$

    where $b_1, b_2, \dots, b_n$ are fixed real numbers, and $x_1, x_2, \dots, x_m$ are unknown. Such equations are sometimes called simultaneous equations. It is convenient to rewrite this system using the matrix notation:

    $A = \begin{pmatrix} a_{11} & a_{12} & \dots & a_{1m} \\ a_{21} & a_{22} & \dots & a_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \dots & a_{nm} \end{pmatrix}, \quad \mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{pmatrix}, \quad \mathbf{b} = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{pmatrix},$

    so that

    $A\mathbf{x} = \mathbf{b}.$

    To solve the system define the augmented matrix:

    $\tilde{A} = \left(\begin{array}{cccc|c} a_{11} & a_{12} & \dots & a_{1m} & b_1 \\ a_{21} & a_{22} & \dots & a_{2m} & b_2 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ a_{n1} & a_{n2} & \dots & a_{nm} & b_n \end{array}\right).$

    Now we'll perform a number of transformations which preserve the solutions. We will

    Swap two equations,
    Multiply an equation by a non-zero number,
    Add one equation to another.

    These operations correspond to the following ones on the augmented matrix:

    Swap two rows,
    Multiply a row by a non-zero number,
    Add one row to another.

    These are called elementary row operations.

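    Example. (1) Consider the system

    $x + y = 3,$
    $2x + 3y = 7,$

    with the augmented matrix

    $\tilde{A} = \left(\begin{array}{cc|c} 1 & 1 & 3 \\ 2 & 3 & 7 \end{array}\right).$

    Thus

    $\left(\begin{array}{cc|c} 1 & 1 & 3 \\ 2 & 3 & 7 \end{array}\right) \xrightarrow{R_2 - 2R_1} \left(\begin{array}{cc|c} 1 & 1 & 3 \\ 0 & 1 & 1 \end{array}\right) \xrightarrow{R_1 - R_2} \left(\begin{array}{cc|c} 1 & 0 & 2 \\ 0 & 1 & 1 \end{array}\right).$

    Consequently, $x = 2$, $y = 1$.

    (2) Consider the system

    $2x_1 + 4x_2 + 6x_3 = 18,$
    $4x_1 + 5x_2 + 6x_3 = 24,$
    $3x_1 + x_2 - 2x_3 = 4,$

    with the augmented matrix

    $\tilde{A} = \left(\begin{array}{ccc|c} 2 & 4 & 6 & 18 \\ 4 & 5 & 6 & 24 \\ 3 & 1 & -2 & 4 \end{array}\right).$

    Thus

    $\left(\begin{array}{ccc|c} 2 & 4 & 6 & 18 \\ 4 & 5 & 6 & 24 \\ 3 & 1 & -2 & 4 \end{array}\right) \xrightarrow{R_1/2} \left(\begin{array}{ccc|c} 1 & 2 & 3 & 9 \\ 4 & 5 & 6 & 24 \\ 3 & 1 & -2 & 4 \end{array}\right) \xrightarrow{R_2 - 4R_1,\ R_3 - 3R_1} \left(\begin{array}{ccc|c} 1 & 2 & 3 & 9 \\ 0 & -3 & -6 & -12 \\ 0 & -5 & -11 & -23 \end{array}\right)$

    $\xrightarrow{-R_2/3} \left(\begin{array}{ccc|c} 1 & 2 & 3 & 9 \\ 0 & 1 & 2 & 4 \\ 0 & -5 & -11 & -23 \end{array}\right) \xrightarrow{R_1 - 2R_2,\ R_3 + 5R_2} \left(\begin{array}{ccc|c} 1 & 0 & -1 & 1 \\ 0 & 1 & 2 & 4 \\ 0 & 0 & -1 & -3 \end{array}\right)$

    $\xrightarrow{-R_3} \left(\begin{array}{ccc|c} 1 & 0 & -1 & 1 \\ 0 & 1 & 2 & 4 \\ 0 & 0 & 1 & 3 \end{array}\right) \xrightarrow{R_1 + R_3,\ R_2 - 2R_3} \left(\begin{array}{ccc|c} 1 & 0 & 0 & 4 \\ 0 & 1 & 0 & -2 \\ 0 & 0 & 1 & 3 \end{array}\right).$

    Consequently, $x_1 = 4$, $x_2 = -2$, $x_3 = 3$.

    (3) An example with infinitely many solutions. Consider the system

    $2x_1 + 4x_2 + 6x_3 = 18,$
    $4x_1 + 5x_2 + 6x_3 = 24,$
    $2x_1 + 7x_2 + 12x_3 = 30,$

    with the augmented matrix

    $\tilde{A} = \left(\begin{array}{ccc|c} 2 & 4 & 6 & 18 \\ 4 & 5 & 6 & 24 \\ 2 & 7 & 12 & 30 \end{array}\right).$

    After dividing the first row by 2 we get

    $\left(\begin{array}{ccc|c} 1 & 2 & 3 & 9 \\ 4 & 5 & 6 & 24 \\ 2 & 7 & 12 & 30 \end{array}\right) \xrightarrow{R_2 - 4R_1,\ R_3 - 2R_1} \left(\begin{array}{ccc|c} 1 & 2 & 3 & 9 \\ 0 & -3 & -6 & -12 \\ 0 & 3 & 6 & 12 \end{array}\right)$

    $\xrightarrow{-R_2/3,\ R_3/3} \left(\begin{array}{ccc|c} 1 & 2 & 3 & 9 \\ 0 & 1 & 2 & 4 \\ 0 & 1 & 2 & 4 \end{array}\right) \xrightarrow{R_1 - 2R_2,\ R_3 - R_2} \left(\begin{array}{ccc|c} 1 & 0 & -1 & 1 \\ 0 & 1 & 2 & 4 \\ 0 & 0 & 0 & 0 \end{array}\right),$

    so $x_1 - x_3 = 1$, $x_2 + 2x_3 = 4$, and hence the solution is $(1 + x_3, 4 - 2x_3, x_3)$. This vector is a solution for any $x_3 \in \mathbb{R}$. Thus it is called the general solution.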
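The row operations used in these examples can be automated. The sketch below reduces an augmented matrix column by column, clearing each pivot column both below and above the pivot, as was done by hand above. It uses exact fractions to avoid rounding; the name `reduce_rows` is ours, and subtracting a multiple of a row is treated as a combination of the elementary operations listed earlier.

```python
from fractions import Fraction


def reduce_rows(aug: list) -> list:
    """Reduce an augmented matrix (list of rows, last column = right-hand side)."""
    m = [[Fraction(v) for v in row] for row in aug]
    rows, cols = len(m), len(m[0])
    pivot_row = 0
    for col in range(cols - 1):
        # Find a row at or below pivot_row with a non-zero entry in this column.
        pr = next((r for r in range(pivot_row, rows) if m[r][col] != 0), None)
        if pr is None:
            continue                                      # no pivot in this column
        m[pivot_row], m[pr] = m[pr], m[pivot_row]         # swap two rows
        pivot = m[pivot_row][col]
        m[pivot_row] = [v / pivot for v in m[pivot_row]]  # make the leading entry 1
        for r in range(rows):
            if r != pivot_row and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[pivot_row])]
        pivot_row += 1
        if pivot_row == rows:
            break
    return m


unique = reduce_rows([[2, 4, 6, 18], [4, 5, 6, 24], [3, 1, -2, 4]])
print([[int(v) for v in row] for row in unique])    # [[1, 0, 0, 4], [0, 1, 0, -2], [0, 0, 1, 3]]

family = reduce_rows([[2, 4, 6, 18], [4, 5, 6, 24], [2, 7, 12, 30]])
print([[int(v) for v in row] for row in family])    # [[1, 0, -1, 1], [0, 1, 2, 4], [0, 0, 0, 0]]
```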

    To make it formal:

    Definition 3.8. An entry $a_{kl}$ is said to be a leading entry if it is the first non-zero entry in the row $k$, i.e. $a_{kj} = 0$ for all $j < l$ and $a_{kl} \neq 0$.

    We say that a matrix $A$ is in row echelon form if for every leading entry $a_{kl}$ we have:

    (1) $a_{kl} = 1$,
    (2) $a_{kj} = 0$, $j < l$,
    (3) $a_{il} = 0$, $i > k$.

    In other words, every leading entry equals one and has zeros on the left and underneath.

    Theorem 3.9. Every matrix can be reduced to the row echelon form by elementary row operations.

    Without proof.

    The process of reduction to this form is called Gaussian elimination.

    This method is also useful if one needs to know what kind of map a matrix defines. More precisely, how does one determine whether the linear map defined by a matrix $A$ is injective?

    Lemma 3.10. The linear map $A : \mathbb{R}^m \to \mathbb{R}^n$ is injective if the equation $A\mathbf{x} = \mathbf{0}$ has only one solution: $\mathbf{x} = \mathbf{0}$.

    Proof. Suppose that $A\mathbf{x}_1 = \mathbf{b}$, $A\mathbf{x}_2 = \mathbf{b}$ with some $\mathbf{b}, \mathbf{x}_1, \mathbf{x}_2$. Then due to the linearity

    $A(\mathbf{x}_1 - \mathbf{x}_2) = \mathbf{0}.$

    If the equation $A\mathbf{x} = \mathbf{0}$ has the unique solution $\mathbf{x} = \mathbf{0}$, then $\mathbf{x}_1 = \mathbf{x}_2$, so the map is injective.

    Example. Is the map defined by the matrix

    $A = \begin{pmatrix} 1 & 1 \\ 2 & 3 \end{pmatrix}$

    (the matrix of the system in Example (1) above) injective? The answer is YES, as the equation $A\mathbf{x} = \mathbf{0}$ has only one solution $\mathbf{x} = \mathbf{0}$.

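    Example. Find a connection between the parameters $a$, $b$ and $c$ which ensures that the system

    $x + 2y - 3z = a,$
    $3x - y + 2z = b,$
    $x - 5y + 8z = c$

    has:

    (1) a unique solution,
    (2) infinitely many solutions,
    (3) no solution.

    Let us reduce the augmented matrix:

    $\left(\begin{array}{ccc|c} 1 & 2 & -3 & a \\ 3 & -1 & 2 & b \\ 1 & -5 & 8 & c \end{array}\right) \xrightarrow{R_2 := 3R_1 - R_2,\ R_3 := R_1 - R_3} \left(\begin{array}{ccc|c} 1 & 2 & -3 & a \\ 0 & 7 & -11 & 3a - b \\ 0 & 7 & -11 & a - c \end{array}\right)$

    $\xrightarrow{R_3 := R_3 - R_2} \left(\begin{array}{ccc|c} 1 & 2 & -3 & a \\ 0 & 7 & -11 & 3a - b \\ 0 & 0 & 0 & -2a + b - c \end{array}\right).$

    If $-2a + b - c = 0$, then the system has infinitely many solutions. If $-2a + b - c \neq 0$, there are no solutions. The system never has a unique solution.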

    4. Inverting square matrices

    4.1. Inverting matrices. Analysing again what we did with the linear systems, we realise that all we did was to invert the matrix $A$. Indeed, to find $\mathbf{x}$ from the equation $A\mathbf{x} = \mathbf{b}$ we can simply write $\mathbf{x} = A^{-1}\mathbf{b}$. On the other hand, after reducing the augmented matrix to the row echelon form we quickly arrive at the augmented matrix of the form $(I_n \mid \mathbf{c})$, so that $\mathbf{c} = A^{-1}\mathbf{b}$. This suggests that we can find the matrix inverse by applying the elementary row operations to the augmented $n \times 2n$ matrix of the form $(A \mid I_n)$ until we get a matrix of the form $(I_n \mid B)$. Then $B = A^{-1}$.

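    Example. (1) Let

    $A = \begin{pmatrix} 2 & 3 \\ 4 & 5 \end{pmatrix}.$

    Write

    $\left(\begin{array}{cc|cc} 2 & 3 & 1 & 0 \\ 4 & 5 & 0 & 1 \end{array}\right) \xrightarrow{-R_2 + 2R_1} \left(\begin{array}{cc|cc} 2 & 3 & 1 & 0 \\ 0 & 1 & 2 & -1 \end{array}\right) \xrightarrow{R_1 - 3R_2} \left(\begin{array}{cc|cc} 2 & 0 & -5 & 3 \\ 0 & 1 & 2 & -1 \end{array}\right) \xrightarrow{R_1/2} \left(\begin{array}{cc|cc} 1 & 0 & -5/2 & 3/2 \\ 0 & 1 & 2 & -1 \end{array}\right),$

    so that

    $A^{-1} = \begin{pmatrix} -5/2 & 3/2 \\ 2 & -1 \end{pmatrix}.$

    (2) Here is a matrix which is not invertible:

    $A = \begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix}.$

    Indeed,

    $\left(\begin{array}{cc|cc} 1 & 2 & 1 & 0 \\ 2 & 4 & 0 & 1 \end{array}\right) \xrightarrow{-R_2 + 2R_1} \left(\begin{array}{cc|cc} 1 & 2 & 1 & 0 \\ 0 & 0 & 2 & -1 \end{array}\right).$

    The left block now has a zero row, so it cannot be reduced to $I_2$.

    (3) Let

    $A = \begin{pmatrix} 1 & 1 & 1 \\ 0 & 2 & 3 \\ 5 & 5 & 1 \end{pmatrix}.$

    Write

    $\left(\begin{array}{ccc|ccc} 1 & 1 & 1 & 1 & 0 & 0 \\ 0 & 2 & 3 & 0 & 1 & 0 \\ 5 & 5 & 1 & 0 & 0 & 1 \end{array}\right) \xrightarrow{R_3 - 5R_1} \left(\begin{array}{ccc|ccc} 1 & 1 & 1 & 1 & 0 & 0 \\ 0 & 2 & 3 & 0 & 1 & 0 \\ 0 & 0 & -4 & -5 & 0 & 1 \end{array}\right)$

    $\xrightarrow{4R_2 + 3R_3} \left(\begin{array}{ccc|ccc} 1 & 1 & 1 & 1 & 0 & 0 \\ 0 & 8 & 0 & -15 & 4 & 3 \\ 0 & 0 & -4 & -5 & 0 & 1 \end{array}\right) \xrightarrow{4R_1 + R_3} \left(\begin{array}{ccc|ccc} 4 & 4 & 0 & -1 & 0 & 1 \\ 0 & 8 & 0 & -15 & 4 & 3 \\ 0 & 0 & -4 & -5 & 0 & 1 \end{array}\right)$

    $\xrightarrow{R_1 - R_2/2} \left(\begin{array}{ccc|ccc} 4 & 0 & 0 & 13/2 & -2 & -1/2 \\ 0 & 8 & 0 & -15 & 4 & 3 \\ 0 & 0 & -4 & -5 & 0 & 1 \end{array}\right) \xrightarrow{R_1/4,\ R_2/8,\ -R_3/4} \left(\begin{array}{ccc|ccc} 1 & 0 & 0 & 13/8 & -1/2 & -1/8 \\ 0 & 1 & 0 & -15/8 & 1/2 & 3/8 \\ 0 & 0 & 1 & 5/4 & 0 & -1/4 \end{array}\right).$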
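The same procedure, reducing $(A \mid I_n)$ to $(I_n \mid B)$, can be sketched in code. The function below works with exact fractions and returns `None` when the left block cannot be reduced to the identity. The name `invert` and the convention of returning `None` are our choices, not part of the notes.

```python
from fractions import Fraction


def invert(a: list):
    """Inverse of a square matrix (list of rows) via row reduction of (A | I_n), or None."""
    n = len(a)
    # Build the augmented n x 2n matrix (A | I_n).
    m = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(a)]
    for col in range(n):
        pr = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pr is None:
            return None                              # no pivot: A is not invertible
        m[col], m[pr] = m[pr], m[col]                # swap two rows
        pivot = m[col][col]
        m[col] = [v / pivot for v in m[col]]         # scale the pivot row
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [row[n:] for row in m]                    # right block is the inverse


print(invert([[2, 3], [4, 5]]) == [[Fraction(-5, 2), Fraction(3, 2)], [2, -1]])  # True
print(invert([[1, 2], [2, 4]]))                                                  # None
```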

    5. Determinants

    5.1. Definition.

    Definition 3.11. The determinant of an $n \times n$ matrix $A$ is defined to be

    $\det A = \sum_{\sigma \in S_n} \operatorname{sgn}(\sigma) \, a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{n\sigma(n)},$

    where the summation is taken over all permutations of degree $n$, and $\operatorname{sgn}(\sigma)$ is the signature (sign) of the permutation $\sigma$.

    Equivalent definition:

    $\det A = \sum_{\sigma \in S_n} \operatorname{sgn}(\sigma) \, a_{\sigma(1)1} a_{\sigma(2)2} \cdots a_{\sigma(n)n}.$

    Let us note a few simple properties of this number.

    (1) If one swaps two rows (columns), $\det A$ gets multiplied by $-1$;
    (2) If a row (column) is multiplied by $\lambda \in \mathbb{R}$, then $\det A$ is multiplied by $\lambda$;
    (3) The determinant does not change if a multiple of one row (column) is added to another one.

    Definition 3.12. The transpose of the matrix $A \in M(n, n)$ is another matrix $A^T$, with the entries $a^T_{jk} = a_{kj}$.

    Another property of $\det$ is that

    $\det A = \det A^T.$
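Definition 3.11 can be evaluated literally for small matrices. The sketch below sums over all permutations, computing the sign from the parity of the number of inversions; this is one standard way to obtain the signature and is an assumption about how the sign was defined earlier in the notes. The cost grows like $n!$, so this is for illustration only. The sample matrix carries the signs as we read them from the later examples.

```python
from itertools import permutations


def sign(perm: tuple) -> int:
    """Signature of a permutation of 0..n-1: +1 if the number of inversions is even."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1


def det_by_definition(a: list) -> int:
    """Sum over all permutations of sign * a[0][p(0)] * a[1][p(1)] * ... (Definition 3.11)."""
    n = len(a)
    total = 0
    for perm in permutations(range(n)):
        term = sign(perm)
        for row in range(n):
            term *= a[row][perm[row]]
        total += term
    return total


A = [[2, -1, 4], [0, 1, 5], [6, 3, -4]]
A_T = [list(col) for col in zip(*A)]    # transpose
print(det_by_definition(A))             # -92
print(det_by_definition(A_T))           # -92
```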

    Computation of the determinant is not easy if the matrix is large. There are a few tricks which can simplify the task.

    Observe first of all that for any $2 \times 2$ matrix we have

    $\det A = a_{11}a_{22} - a_{12}a_{21}.$

    If the matrix has a special form, we can sometimes compute the determinant easily.

    Definition 3.13. We call a matrix $T$ upper (lower) triangular if $t_{jk} = 0$ for all $j > k$ ($j < k$).

    If $A$ is lower or upper triangular, then the determinant is easily found. For instance, assume that

    $A = \begin{pmatrix} \lambda_1 & * & \dots & * \\ 0 & \lambda_2 & \dots & * \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & \lambda_n \end{pmatrix}$

    with some $\lambda_1, \lambda_2, \dots, \lambda_n$ on the diagonal. Then

    $\det A = \lambda_1 \lambda_2 \cdots \lambda_n.$

    In particular, this applies to diagonal matrices.

    Using the above facts we conclude that one way of finding the determinant is to reduce the matrix to the triangular form using the elementary row (or column) operations.

    Example. (1) Let

    A =

    2 1 40 1 56 3 4

    .Then 2 1 40 1 5

    6 3 4

    R3 3R1 2 1 40 1 5

    0 6 16

    R3 6R2 2 1 40 1 5

    0 0 46

    .According to our previous observations, the determinant doesnot change under these operations, so that detA = 92.

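    (2) Let

    \[ A = \begin{pmatrix} 1 & 3 & 2 \\ 4 & 5 & 1 \\ 2 & 4 & 3 \end{pmatrix}. \]

    Then

    \[ \begin{aligned} \det \begin{pmatrix} 1 & 3 & 2 \\ 4 & 5 & 1 \\ 2 & 4 & 3 \end{pmatrix} &= \det \begin{pmatrix} 1 & 3 & 2 \\ 0 & -7 & -7 \\ 0 & -2 & -1 \end{pmatrix} = -7 \det \begin{pmatrix} 1 & 3 & 2 \\ 0 & 1 & 1 \\ 0 & -2 & -1 \end{pmatrix} \\ &= -7 \det \begin{pmatrix} 1 & 3 & 2 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix} = -7. \end{aligned} \]

    Comparing this procedure with the reduction to the row echelon form, we deduce the following result: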

    Theorem 3.14. The matrix $A$ is invertible iff $\det A \neq 0$.

    How to find the determinant of the inverse matrix? One uses the following important property:

    Proposition 3.15. For any matrices $A, B \in M(n, n)$ we have

    \[ \det(AB) = \det A \, \det B. \]

    Without proof.

    If $A$ is invertible, then

    \[ \det A^{-1} = \frac{1}{\det A}. \]

    This immediately follows from the above Proposition in view of the identities $AA^{-1} = I_n$ and $\det I_n = 1$.
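    Spelled out as a single chain (our own restatement, using only Proposition 3.15 and the two identities just quoted):

    \[ \det A \cdot \det A^{-1} = \det(AA^{-1}) = \det I_n = 1, \qquad \text{so} \qquad \det A^{-1} = \frac{1}{\det A}. \]

    The same chain shows that $\det A$ cannot be zero when $A$ is invertible, in agreement with Theorem 3.14.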

    5.2. Minors and co-factors. Let $A \in M(n, n)$ and let $M_{jk} \in M(n-1, n-1)$ be the matrix obtained from $A$ by deleting the $j$th row and the $k$th column.

    Definition 3.16. The matrix $M_{jk}$ is called the $jk$th minor of the matrix $A$. The number

    \[ A_{jk} = (-1)^{j+k} \det M_{jk} \]

    is called the $jk$th co-factor of $A$.

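    Example. (1) Let

    \[ A = \begin{pmatrix} 2 & -1 & 4 \\ 0 & 1 & 5 \\ 6 & 3 & -4 \end{pmatrix}. \]

    Then

    \[ M_{13} = \begin{pmatrix} 0 & 1 \\ 6 & 3 \end{pmatrix}, \quad M_{11} = \begin{pmatrix} 1 & 5 \\ 3 & -4 \end{pmatrix}, \quad M_{32} = \begin{pmatrix} 2 & 4 \\ 0 & 5 \end{pmatrix}. \]

    The co-factors are

    \[ A_{13} = -6, \quad A_{11} = -19, \quad A_{32} = -10. \]

    (2) For a $2 \times 2$ matrix

    \[ A_{11} = a_{22}, \quad A_{12} = -a_{21}, \quad A_{21} = -a_{12}, \quad A_{22} = a_{11}. \]

    These notions are very useful for calculating the determinant of $A$: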

    Proposition 3.17. Let $A \in M(n, n)$. Then for any $j, l = 1, 2, \dots, n$:

    \[ \det A = \sum_{k=1}^{n} a_{jk}A_{jk} = \sum_{k=1}^{n} a_{kl}A_{kl}. \]

    These formulae are called the expansion of the determinant in the $j$th row and the $l$th column respectively.

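    Example. Let

    \[ A = \begin{pmatrix} 2 & -1 & 4 \\ 0 & 1 & 5 \\ 6 & 3 & -4 \end{pmatrix}. \]

    Let us expand the determinant in the first row:

    \[ \det A = 2A_{11} - 1 \cdot A_{12} + 4A_{13}, \]

    with

    \[ A_{11} = \det \begin{pmatrix} 1 & 5 \\ 3 & -4 \end{pmatrix} = -19, \quad A_{12} = -\det \begin{pmatrix} 0 & 5 \\ 6 & -4 \end{pmatrix} = 30, \quad A_{13} = \det \begin{pmatrix} 0 & 1 \\ 6 & 3 \end{pmatrix} = -6, \]

    so that

    \[ \det A = 2 \cdot (-19) - 1 \cdot 30 + 4 \cdot (-6) = -92. \]

    Alternatively, we may notice that the expansion in the second row will have only two co-factors:

    \[ \det A = A_{22} + 5A_{23}, \]

    with

    \[ A_{22} = \det \begin{pmatrix} 2 & 4 \\ 6 & -4 \end{pmatrix} = -32, \quad A_{23} = -\det \begin{pmatrix} 2 & -1 \\ 6 & 3 \end{pmatrix} = -12, \]

    so that

    \[ \det A = -32 + 5 \cdot (-12) = -92. \]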
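    The row expansion translates almost word for word into a recursive function. What follows is a minimal sketch, assuming matrices are given as lists of lists and indexed from 0; the names `minor`, `cofactor` and `det_cofactor` are ours. Zero entries are skipped, which mirrors the remark that a row with zeros needs fewer co-factors.

```python
def minor(a, j, k):
    """Matrix obtained from a by deleting row j and column k (0-based)."""
    return [row[:k] + row[k + 1:] for i, row in enumerate(a) if i != j]


def cofactor(a, j, k):
    """Co-factor: signed determinant of the corresponding minor."""
    # (-1)**(j + k) has the same value for 0-based and 1-based indices.
    return (-1) ** (j + k) * det_cofactor(minor(a, j, k))


def det_cofactor(a, row=0):
    """Determinant by expansion along the given row."""
    if len(a) == 1:
        return a[0][0]
    return sum(
        a[row][k] * cofactor(a, row, k)
        for k in range(len(a))
        if a[row][k] != 0  # a zero entry contributes nothing to the sum
    )
```

    Calling `det_cofactor(a, row=1)` expands along the second row; by Proposition 3.17 every choice of row gives the same number.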


    A general observation: if a matrix has a column or a row of zeros, then the determinant equals zero.

    5.3. The adjoint matrix and another way of finding the inverse.

    Definition 3.18. The adjoint matrix for a given $A \in M(n, n)$ is the matrix $\operatorname{ad}(A) \in M(n, n)$ of the form

    \[ \operatorname{ad}(A) = \begin{pmatrix} A_{11} & A_{21} & \cdots & A_{n1} \\ A_{12} & A_{22} & \cdots & A_{n2} \\ \vdots & \vdots & \ddots & \vdots \\ A_{1n} & A_{2n} & \cdots & A_{nn} \end{pmatrix}. \]

    In other words, the $jk$th entry of $\operatorname{ad}(A)$ is $A_{kj}$, i.e. it is the transpose of the matrix built of the co-factors $A_{jk}$.

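    Example. $n = 2$:

    \[ A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}, \quad \operatorname{ad}(A) = \begin{pmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{pmatrix}. \]

    The inverse of an invertible matrix $A$ is found as

    \[ A^{-1} = \frac{1}{\det A} \operatorname{ad}(A). \]

    In the two-dimensional case it gives

    \[ A^{-1} = \frac{1}{a_{11}a_{22} - a_{12}a_{21}} \begin{pmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{pmatrix}. \]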
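    The formula for the inverse can be sketched in code as follows. This is an illustration only: the helper names (`minor`, `det`, `adjoint`, `inverse`) are ours, the entries are assumed to be integers so that `Fraction` gives exact results, and the determinant is computed by first-row expansion, which is far too slow for large matrices.

```python
from fractions import Fraction


def minor(a, j, k):
    """Matrix obtained from a by deleting row j and column k (0-based)."""
    return [row[:k] + row[k + 1:] for i, row in enumerate(a) if i != j]


def det(a):
    """Determinant by expansion in the first row."""
    if not a:
        return 1  # convention for the empty matrix, needed for 1 x 1 minors
    return sum((-1) ** k * a[0][k] * det(minor(a, 0, k)) for k in range(len(a)))


def adjoint(a):
    """Adjoint matrix: entry (j, k) is the co-factor with indices (k, j)."""
    n = len(a)
    return [
        [(-1) ** (j + k) * det(minor(a, k, j)) for k in range(n)]
        for j in range(n)
    ]


def inverse(a):
    """Inverse as ad(A) divided by det A; assumes integer entries."""
    d = det(a)
    if d == 0:
        raise ValueError("matrix is not invertible (determinant is zero)")
    return [[Fraction(x, d) for x in row] for row in adjoint(a)]
```

    For a $2 \times 2$ matrix `adjoint` reproduces the pattern of the example above: the diagonal entries are exchanged and the off-diagonal entries change sign.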

    6. Eigenvalues and eigenvectors

    Our aim now is to introduce the method called diagonalisation. It consists in finding, for a matrix $A$, an invertible matrix $M$ such that $A = MDM^{-1}$ with a diagonal matrix $D$, that is, $d_{jk} = 0$ if $j \neq k$. Note that not all matrices can be represented in this form. This representation is convenient, since it allows one to calculate easily the powers of matrices. For instance, if

    \[ A = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix}, \]

    then for any $l = 1, 2, \dots$ one has

    \[ A^l = \begin{pmatrix} \lambda_1^l & 0 & \cdots & 0 \\ 0 & \lambda_2^l & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n^l \end{pmatrix}. \]

    For example,

    \[ \begin{pmatrix} 2 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & -4 \end{pmatrix}^l = \begin{pmatrix} 2^l & 0 & 0 \\ 0 & 3^l & 0 \\ 0 & 0 & (-4)^l \end{pmatrix}. \]

    In general, for diagonal matrices the elementary matrix operations become simpler.

    If $A = MDM^{-1}$, then

    \[ A^l = MDM^{-1} \cdot MDM^{-1} \cdots MDM^{-1} = MD^lM^{-1}, \]

    because every inner product $M^{-1}M$ equals $I_n$, so taking the power of $A$ boils down to taking the power of a diagonal matrix!
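    The identity $A^l = MD^lM^{-1}$ is easy to test numerically. The sketch below is ours and makes some assumptions: matrices are lists of lists, the diagonal matrix is passed as the list of its diagonal entries, and the sample data `M`, `D`, `M_INV`, `A` are chosen purely for illustration (one can check by hand that `A` equals the product of the other three).

```python
from fractions import Fraction


def matmul(x, y):
    """Product of two matrices given as lists of rows."""
    return [
        [sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
        for i in range(len(x))
    ]


def power_via_diagonalisation(m, diag, m_inv, l):
    """Compute M D^l M^{-1}, where diag lists the diagonal entries of D."""
    n = len(diag)
    # Raising a diagonal matrix to a power only raises its diagonal entries.
    d_l = [[diag[i] ** l if i == j else 0 for j in range(n)] for i in range(n)]
    return matmul(matmul(m, d_l), m_inv)


def power_naive(a, l):
    """Compute A^l for l >= 1 by repeated multiplication, for comparison."""
    result = a
    for _ in range(l - 1):
        result = matmul(result, a)
    return result


# Illustrative data (an assumption of this sketch): A = M D M^{-1}.
M = [[1, 2], [-1, 3]]
D = [-1, 4]
M_INV = [[Fraction(3, 5), Fraction(-2, 5)], [Fraction(1, 5), Fraction(1, 5)]]
A = [[1, 2], [3, 2]]
```

    With these values, `power_via_diagonalisation(M, D, M_INV, l)` and `power_naive(A, l)` should agree for every $l \geq 1$, while the first needs only two matrix products regardless of $l$.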

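    Definition 3.19. A real number $\lambda$ is called an eigenvalue of $A$ if there exists a non-zero vector $v$ such that $Av = \lambda v$. Any such vector $v$ is called an eigenvector of $A$ associated with $\lambda$.

    A matrix may have more than one eigenvalue.

    Observe straight away that if $v$ is an eigenvector, then $tv$ is also an eigenvector for any non-zero $t \in \mathbb{R}$.

    Example. (1) $A = I_n$. For any vector $v \neq 0$ we have $Av = v$, so $\lambda = 1$ is the only eigenvalue of $I_n$ and any non-zero vector is an eigenvector.

    (2) Let

    \[ A = \begin{pmatrix} 10 & -18 \\ 6 & -11 \end{pmatrix}, \quad v = \begin{pmatrix} 2 \\ 1 \end{pmatrix}, \]

    so that

    \[ Av = \begin{pmatrix} 10 & -18 \\ 6 & -11 \end{pmatrix} \begin{pmatrix} 2 \\ 1 \end{pmatrix} = \begin{pmatrix} 2 \\ 1 \end{pmatrix}. \]

    Thus $\lambda = 1$ is an eigenvalue and $v$ is an eigenvector.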

    The procedure for finding eigenvalues and eigenvectors is simple: suppose that $Av = \lambda v$, so that $(A - \lambda I_n)v = 0$. Since we are looking for a non-zero solution of this system, the matrix $A - \lambda I_n$ is not invertible, i.e. $\det(A - \lambda I_n) = 0$. Conversely, if the determinant equals zero, this means that the system has either no or infinitely many solutions. Since $v = 0$ is a solution, this means that one has infinitely many of them, and hence there exists a vector $v \neq 0$ such that $Av = \lambda v$. This means that $\lambda$ is an eigenvalue and we have proved the following result:

    Proposition 3.20. Eigenvalues are precisely the solutions of the equation $\det(A - \lambda I_n) = 0$.
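    For a $2 \times 2$ matrix the equation of Proposition 3.20 is a quadratic, $\lambda^2 - (a_{11} + a_{22})\lambda + \det A = 0$, and can be solved with the usual formula. The sketch below is ours (the name `real_eigenvalues_2x2` is hypothetical); it works in floating point and returns only real roots, in line with Definition 3.19.

```python
import math


def real_eigenvalues_2x2(a):
    """Real roots of det(A - lambda I) = 0 for a 2 x 2 matrix, sorted."""
    (a11, a12), (a21, a22) = a
    trace = a11 + a22
    det = a11 * a22 - a12 * a21
    # The characteristic equation is lambda^2 - trace * lambda + det = 0.
    disc = trace * trace - 4 * det
    if disc < 0:
        return []  # no real roots, hence no real eigenvalues
    root = math.sqrt(disc)
    # A set removes the duplicate when the two roots coincide.
    return sorted({(trace - root) / 2, (trace + root) / 2})
```

    A negative discriminant means that the matrix has no real eigenvalues at all, as happens for a rotation of the plane by a right angle.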

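    The above equation is an algebraic equation for the roots of an $n$th order polynomial. Thus there are exactly $n$ roots, counted with multiplicity, but in general they are complex numbers and only some of them may be real. Suppose all of them are real: $\lambda_1, \lambda_2, \dots, \lambda_n$. To find the eigenvectors $v_j$ associated with them, we need to solve the systems $Ax = \lambda_j x$, $j = 1, 2, \dots, n$. Then we form the matrix

    \[ M = \begin{pmatrix} v_1 & v_2 & \cdots & v_n \end{pmatrix}, \]

    whose columns are the eigenvectors. If $M$ is invertible, then we have

    \[ A = MDM^{-1}, \quad D = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix}. \]

    Example. (1) Let

    \[ A = \begin{pmatrix} 1 & 2 \\ 3 & 2 \end{pmatrix}. \]

    Write the equation:

    \[ \begin{aligned} \det(A - \lambda I_2) &= \det \begin{pmatrix} 1 - \lambda & 2 \\ 3 & 2 - \lambda \end{pmatrix} = (1 - \lambda)(2 - \lambda) - 6 \\ &= \lambda^2 - 3\lambda - 4 = (\lambda - 4)(\lambda + 1). \end{aligned} \]

    Thus we have two eigenvalues: $\lambda_1 = -1$, $\lambda_2 = 4$.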

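    Find the eigenvector for $\lambda_1 = -1$:

    \[ (A + I_2)x = \begin{pmatrix} 2 & 2 \\ 3 & 3 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}. \]

    Thus $x_1 + x_2 = 0$, whence

    \[ v_1 = \begin{pmatrix} x_1 \\ -x_1 \end{pmatrix} = x_1 \begin{pmatrix} 1 \\ -1 \end{pmatrix}. \]

    Find the eigenvector for $\lambda_2 = 4$:

    \[ (A - 4I_2)x = \begin{pmatrix} -3 & 2 \\ 3 & -2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}. \]

    Thus $3x_1 - 2x_2 = 0$, whence, writing $x_1 = 2t$,

    \[ v_2 = \begin{pmatrix} 2t \\ 3t \end{pmatrix} = t \begin{pmatrix} 2 \\ 3 \end{pmatrix}. \]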

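    Define the matrix $M$ as follows:

    \[ M = \begin{pmatrix} 1 & 2 \\ -1 & 3 \end{pmatrix}. \]

    Note:

    \[ M^{-1} = \frac{1}{5} \begin{pmatrix} 3 & -2 \\ 1 & 1 \end{pmatrix}. \]

    Then the matrix $A$ can now be represented as $MDM^{-1}$. Let us check:

    \[ \begin{aligned} MDM^{-1} &= \frac{1}{5} \begin{pmatrix} 1 & 2 \\ -1 & 3 \end{pmatrix} \begin{pmatrix} -1 & 0 \\ 0 & 4 \end{pmatrix} \begin{pmatrix} 3 & -2 \\ 1 & 1 \end{pmatrix} \\ &= \frac{1}{5} \begin{pmatrix} 1 & 2 \\ -1 & 3 \end{pmatrix} \begin{pmatrix} -3 & 2 \\ 4 & 4 \end{pmatrix} = \begin{pmatrix} 1 & 2 \\ 3 & 2 \end{pmatrix}. \end{aligned} \]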

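    As we have observed previously,

    \[ \begin{aligned} A^l = MD^lM^{-1} &= \frac{1}{5} \begin{pmatrix} 1 & 2 \\ -1 & 3 \end{pmatrix} \begin{pmatrix} -1 & 0 \\ 0 & 4 \end{pmatrix}^l \begin{pmatrix} 3 & -2 \\ 1 & 1 \end{pmatrix} \\ &= \frac{1}{5} \begin{pmatrix} 1 & 2 \\ -1 & 3 \end{pmatrix} \begin{pmatrix} (-1)^l & 0 \\ 0 & 4^l \end{pmatrix} \begin{pmatrix} 3 & -2 \\ 1 & 1 \end{pmatrix} \\ &= \frac{(-1)^l}{5} \begin{pmatrix} 1 & 2 \\ -1 & 3 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 3 & -2 \\ 1 & 1 \end{pmatrix} + \frac{4^l}{5} \begin{pmatrix} 1 & 2 \\ -1 & 3 \end{pmatrix} \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 3 & -2 \\ 1 & 1 \end{pmatrix} \\ &= \frac{(-1)^l}{5} \begin{pmatrix} 1 & 0 \\ -1 & 0 \end{pmatrix} \begin{pmatrix} 3 & -2 \\ 1 & 1 \end{pmatrix} + \frac{4^l}{5} \begin{pmatrix} 0 & 2 \\ 0 & 3 \end{pmatrix} \begin{pmatrix} 3 & -2 \\ 1 & 1 \end{pmatrix} \\ &= \frac{(-1)^l}{5} \begin{pmatrix} 3 & -2 \\ -3 & 2 \end{pmatrix} + \frac{4^l}{5} \begin{pmatrix} 2 & 2 \\ 3 & 3 \end{pmatrix}. \end{aligned} \]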


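    (2) Here is a matrix which cannot be diagonalised:

    \[ B = \begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix}. \]

    Indeed, the equation $\det(B - \lambda I_2) = 0$ looks as follows: $(2 - \lambda)^2 = 0$, so that $\lambda = 2$ is the only eigenvalue. Does it have two eigenvectors associated with it which are not multiples of each other? Let us find out:

    \[ (B - 2I_2)x = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = 0, \]

    so $x_2 = 0$, and $x_1$ takes an arbitrary value. Therefore the eigenvectors look like

    \[ \begin{pmatrix} x_1 \\ 0 \end{pmatrix} = x_1 \begin{pmatrix} 1 \\ 0 \end{pmatrix}. \]

    Thus if we take the matrix

    \[ M = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix} \]

    to attempt diagonalisation, it will not be invertible!

    (3) Let us find eigenvalues of a $3 \times 3$ matrix: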

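    \[ A = \begin{pmatrix} 1 & -1 & 4 \\ 3 & 2 & -1 \\ 2 & 1 & -1 \end{pmatrix}. \]

    Write the equation:

    \[ \begin{aligned} \det(A - \lambda I_3) &= \det \begin{pmatrix} 1 - \lambda & -1 & 4 \\ 3 & 2 - \lambda & -1 \\ 2 & 1 & -1 - \lambda \end{pmatrix} \\ &= (1 - \lambda) \det \begin{pmatrix} 2 - \lambda & -1 \\ 1 & -1 - \lambda \end{pmatrix} + \det \begin{pmatrix} 3 & -1 \\ 2 & -1 - \lambda \end{pmatrix} + 4 \det \begin{pmatrix} 3 & 2 - \lambda \\ 2 & 1 \end{pmatrix}. \end{aligned} \]

    Thus

    \[ \begin{aligned} \det(A - \lambda I_3) &= (1 - \lambda)[-(2 - \lambda)(1 + \lambda) + 1] - 3(1 + \lambda) + 2 + 4[3 - 2(2 - \lambda)] \\ &= (1 - \lambda)(\lambda^2 - \lambda - 1) - 3\lambda - 5 + 8\lambda \\ &= (1 - \lambda)(\lambda^2 - \lambda - 1) + 5(\lambda - 1) \\ &= -(\lambda - 1)(\lambda^2 - \lambda - 6) = -(\lambda - 1)(\lambda + 2)(\lambda - 3). \end{aligned} \]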


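    Thus the eigenvalues are $\lambda_1 = 1$, $\lambda_2 = -2$, $\lambda_3 = 3$.

    Eigenvector for $\lambda_1 = 1$:

    \[ \left(\begin{array}{ccc|c} 0 & -1 & 4 & 0 \\ 3 & 1 & -1 & 0 \\ 2 & 1 & -2 & 0 \end{array}\right) \xrightarrow{\substack{R_2 + R_1 \\ R_3 + R_1}} \left(\begin{array}{ccc|c} 0 & -1 & 4 & 0 \\ 3 & 0 & 3 & 0 \\ 2 & 0 & 2 & 0 \end{array}\right) \xrightarrow{\substack{R_3/2 - R_2/3 \\ R_2/3}} \left(\begin{array}{ccc|c} 0 & -1 & 4 & 0 \\ 1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right). \]

    Thus $x_1 = -x_3$ and $x_2 = 4x_3$, so

    \[ v_1 = \begin{pmatrix} -1 \\ 4 \\ 1 \end{pmatrix}. \]

    Eigenvector for $\lambda_2 = -2$:

    \[ \left(\begin{array}{ccc|c} 3 & -1 & 4 & 0 \\ 3 & 4 & -1 & 0 \\ 2 & 1 & 1 & 0 \end{array}\right) \xrightarrow{\substack{R_2 - R_1 \\ 3R_3 - 2R_1}} \left(\begin{array}{ccc|c} 3 & -1 & 4 & 0 \\ 0 & 5 & -5 & 0 \\ 0 & 5 & -5 & 0 \end{array}\right) \xrightarrow{\substack{R_3 - R_2 \\ R_2/5}} \left(\begin{array}{ccc|c} 3 & -1 & 4 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right) \xrightarrow{R_1 + R_2} \left(\begin{array}{ccc|c} 3 & 0 & 3 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right). \]

    Therefore $x_1 = -x_3$, $x_2 = x_3$, and hence

    \[ v_2 = \begin{pmatrix} -1 \\ 1 \\ 1 \end{pmatrix}. \]

    Eigenvector for $\lambda_3 = 3$:

    \[ \left(\begin{array}{ccc|c} -2 & -1 & 4 & 0 \\ 3 & -1 & -1 & 0 \\ 2 & 1 & -4 & 0 \end{array}\right) \xrightarrow{\substack{2R_2 + 3R_1 \\ R_3 + R_1}} \left(\begin{array}{ccc|c} -2 & -1 & 4 & 0 \\ 0 & -5 & 10 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right) \xrightarrow{\substack{-R_1 \\ -R_2/5}} \left(\begin{array}{ccc|c} 2 & 1 & -4 & 0 \\ 0 & 1 & -2 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right). \]

    Therefore $x_2 = 2x_3$, $2x_1 = 4x_3 - x_2 = 2x_3$, and hence

    \[ v_3 = \begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix}. \]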
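    Computations of this kind are easy to double-check by substituting back into $Av = \lambda v$. The sketch below is ours: the helper names `matvec` and `is_eigenpair` are hypothetical, and the signed entries of `A` and of the vectors in `PAIRS` are taken as we read them from the example above, which should be treated as an assumption.

```python
def matvec(a, v):
    """Product of a matrix (list of rows) with a vector (list)."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]


def is_eigenpair(a, lam, v):
    """True if v is non-zero and A v equals lam * v exactly."""
    return any(x != 0 for x in v) and matvec(a, v) == [lam * x for x in v]


# Data as read from the 3 x 3 example (signs are an assumption of this sketch).
A = [[1, -1, 4], [3, 2, -1], [2, 1, -1]]
PAIRS = [(1, [-1, 4, 1]), (-2, [-1, 1, 1]), (3, [1, 2, 1])]

# Each pair should satisfy A v = lam * v.
CHECKS = [is_eigenpair(A, lam, v) for lam, v in PAIRS]
```

    Integer arithmetic keeps the comparison exact; with floating-point entries one would compare up to a small tolerance instead.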