
  • Regression Analysis
    Chapter 3: Simple Linear Regression (Matrix Version)

    Dr. Bisher Mamoun Iqelan ([email protected])
    Department of Mathematics, The Islamic University of Gaza
    2010-2011, Semester 2

  • Overview

    The yield $Y_i$ of a process in which amount $X_i$ of a material is used is
    recorded on $n$ different occasions. It is assumed that the mean yield
    depends linearly on the amount of material, so that
    $$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i, \qquad i = 1, 2, \ldots, n, \qquad (1)$$
    where $\beta_0$ and $\beta_1$ are unknown quantities, $E(\varepsilon_i) = 0$ and
    $\operatorname{Var}(\varepsilon_i) = \sigma^2$ for all $i$, and the $Y_i$'s are uncorrelated.

    The system of equations (1) can be written in matrix form as
    $$Y = X\beta + \mathcal{E},$$
    where
    $$Y = \begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{pmatrix}, \quad
    X = \begin{pmatrix} 1 & X_1 \\ 1 & X_2 \\ \vdots & \vdots \\ 1 & X_n \end{pmatrix}, \quad
    \beta = \begin{pmatrix} \beta_0 \\ \beta_1 \end{pmatrix} \quad \text{and} \quad
    \mathcal{E} = \begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{pmatrix}$$

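  • Numerical aside: building the design matrix

    As a quick illustration of the matrix form above, the sketch below builds the
    response vector and design matrix in NumPy for a small made-up data set (the
    numbers are hypothetical, not from the lecture):

        import numpy as np

        # Hypothetical amounts of material and observed yields (n = 5)
        x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
        Y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

        # Design matrix X: a column of ones (intercept) next to the X-values
        X = np.column_stack([np.ones_like(x), x])
        print(X.shape)   # (5, 2): n rows, one column per coefficient
        print(X)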

  • Review of Matrices

    - A matrix: a rectangular array of elements arranged in rows and columns
    - An example with 3 rows and 2 columns:

                 Column 1   Column 2
        Row 1          10         20
        Row 2         100        200
        Row 3        1000       2000


  • A matrix with r rows and c columns

    $$A = \begin{pmatrix}
    a_{11} & a_{12} & \ldots & a_{1j} & \ldots & a_{1c} \\
    a_{21} & a_{22} & \ldots & a_{2j} & \ldots & a_{2c} \\
    \vdots & \vdots &        & \vdots &        & \vdots \\
    a_{i1} & a_{i2} & \ldots & a_{ij} & \ldots & a_{ic} \\
    \vdots & \vdots &        & \vdots &        & \vdots \\
    a_{r1} & a_{r2} & \ldots & a_{rj} & \ldots & a_{rc}
    \end{pmatrix}$$

    - In short notation we denote it as $A = [a_{ij}]$, $i = 1, \ldots, r$; $j = 1, \ldots, c$
    - $r$ and $c$ are called the dimension of the matrix


  • Square matrix and Vector

    - Square matrix: equal number of rows and columns, e.g.
      $$\begin{pmatrix} 1 & 7 \\ 5 & -2 \end{pmatrix}, \qquad
      \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}$$

    - Vector: a matrix with only one row or one column, e.g.
      $$A = \begin{pmatrix} 3 & -2 & 5 \end{pmatrix}, \qquad
      B = \begin{pmatrix} 2 \\ 6 \\ 10 \end{pmatrix}$$


  • Transpose of a matrix and equality of matrices

    - The transpose of a matrix $A$ is another matrix, denoted by $A^T$, obtained
      by interchanging the rows and columns of $A$:
      $$A = \begin{pmatrix} 2 & 5 \\ -4 & 0 \\ 6 & 1 \end{pmatrix}, \qquad
      A^T = \begin{pmatrix} 2 & -4 & 6 \\ 5 & 0 & 1 \end{pmatrix}$$

    - Two matrices are equal if they have the same dimension and all the
      corresponding elements are equal. Suppose for example
      $$A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{pmatrix}, \qquad
      B = \begin{pmatrix} 12 & 50 \\ -3 & 10 \\ 16 & 21 \end{pmatrix}$$
      If $A = B$, then $a_{11} = 12$, $a_{12} = 50$, and so on.


  • Matrix addition and subtraction

    - Adding or subtracting two matrices requires that they have the same dimension.
      $$A = \begin{pmatrix} 1 & 3 & 5 \\ 2 & 4 & 6 \end{pmatrix}, \qquad
      B = \begin{pmatrix} 2 & 5 & 1 \\ 3 & 6 & 7 \end{pmatrix}$$

      $$A + B = \begin{pmatrix} 1+2 & 3+5 & 5+1 \\ 2+3 & 4+6 & 6+7 \end{pmatrix}
      = \begin{pmatrix} 3 & 8 & 6 \\ 5 & 10 & 13 \end{pmatrix}$$

      $$A - B = \begin{pmatrix} 1-2 & 3-5 & 5-1 \\ 2-3 & 4-6 & 6-7 \end{pmatrix}
      = \begin{pmatrix} -1 & -2 & 4 \\ -1 & -2 & -1 \end{pmatrix}$$


  • Matrix multiplication

    - Multiplication of a matrix by a scalar:
      $$A = \begin{pmatrix} 5 & 2 & 5 \\ 3 & 4 & 0 \\ 1 & 6 & 7 \end{pmatrix}, \qquad
      4A = A \cdot 4 = 4 \begin{pmatrix} 5 & 2 & 5 \\ 3 & 4 & 0 \\ 1 & 6 & 7 \end{pmatrix}
      = \begin{pmatrix} 20 & 8 & 20 \\ 12 & 16 & 0 \\ 4 & 24 & 28 \end{pmatrix}$$

    - Multiplication of a matrix by a matrix: if $A$ has dimension $r \times c$
      and $B$ has dimension $c \times s$, the product $AB$ is a matrix of dimension
      $r \times s$ whose element in the $i$th row and $j$th column is
      $$\sum_{k=1}^{c} a_{ik} b_{kj}$$


  • Matrix multiplication: Examples

    - Example 1
      $$\begin{pmatrix} 2 & 4 & 0 \\ 3 & 1 & 5 \end{pmatrix}
      \begin{pmatrix} 1 & 2 \\ 1 & 0 \\ 0 & 3 \end{pmatrix}
      = \begin{pmatrix} 2\cdot1 + 4\cdot1 + 0\cdot0 & 2\cdot2 + 4\cdot0 + 0\cdot3 \\
      3\cdot1 + 1\cdot1 + 5\cdot0 & 3\cdot2 + 1\cdot0 + 5\cdot3 \end{pmatrix}
      = \begin{pmatrix} 6 & 4 \\ 4 & 21 \end{pmatrix}$$

    - Example 2
      $$\begin{pmatrix} 5 & 3 \\ 2 & 6 \end{pmatrix}
      \begin{pmatrix} a_1 \\ a_2 \end{pmatrix}
      = \begin{pmatrix} 5a_1 + 3a_2 \\ 2a_1 + 6a_2 \end{pmatrix}$$

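  • Numerical aside: checking the products

    These products are easy to check numerically; a minimal NumPy sketch
    verifying Example 1 and the scalar multiplication above:

        import numpy as np

        A = np.array([[2, 4, 0],
                      [3, 1, 5]])          # 2x3
        B = np.array([[1, 2],
                      [1, 0],
                      [0, 3]])             # 3x2

        print(A @ B)    # [[ 6  4] [ 4 21]], a 2x2 matrix as expected
        print(4 * A)    # scalar multiplication acts elementwise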

  • Regression Examples

    - One can easily check that
      $$\begin{pmatrix} 1 & X_1 \\ 1 & X_2 \\ \vdots & \vdots \\ 1 & X_n \end{pmatrix}
      \begin{pmatrix} \beta_0 \\ \beta_1 \end{pmatrix}
      = \begin{pmatrix} \beta_0 + \beta_1 X_1 \\ \beta_0 + \beta_1 X_2 \\ \vdots \\ \beta_0 + \beta_1 X_n \end{pmatrix}$$

    - Now let
      $$Y = \begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{pmatrix}, \quad
      X = \begin{pmatrix} 1 & X_1 \\ 1 & X_2 \\ \vdots & \vdots \\ 1 & X_n \end{pmatrix}, \quad
      \beta = \begin{pmatrix} \beta_0 \\ \beta_1 \end{pmatrix} \quad \text{and} \quad
      \mathcal{E} = \begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{pmatrix}$$


  • Regression Models

    The regression model
    $$\begin{aligned}
    Y_1 &= \beta_0 + \beta_1 X_1 + \varepsilon_1 \\
    Y_2 &= \beta_0 + \beta_1 X_2 + \varepsilon_2 \\
    &\;\;\vdots \\
    Y_n &= \beta_0 + \beta_1 X_n + \varepsilon_n
    \end{aligned}$$
    can be written as
    $$Y = X\beta + \mathcal{E}$$


  • Regression Models: Important Calculations

    - Other calculations:
      $$X^T X = \begin{pmatrix} 1 & 1 & \ldots & 1 \\ X_1 & X_2 & \ldots & X_n \end{pmatrix}
      \begin{pmatrix} 1 & X_1 \\ 1 & X_2 \\ \vdots & \vdots \\ 1 & X_n \end{pmatrix}
      = \begin{pmatrix} n & \sum_{i=1}^n X_i \\[1ex] \sum_{i=1}^n X_i & \sum_{i=1}^n X_i^2 \end{pmatrix}$$


  • Regression Models: Important Calculations (Cont.)

    $$X^T Y = \begin{pmatrix} 1 & 1 & \ldots & 1 \\ X_1 & X_2 & \ldots & X_n \end{pmatrix}
    \begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{pmatrix}
    = \begin{pmatrix} \sum_{i=1}^n Y_i \\[1ex] \sum_{i=1}^n X_i Y_i \end{pmatrix}$$

    and

    $$Y^T Y = \begin{pmatrix} Y_1 & Y_2 & \ldots & Y_n \end{pmatrix}
    \begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{pmatrix}
    = \sum_{i=1}^n Y_i^2$$

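  • Numerical aside: computing these quantities

    A short NumPy check of $X^T X$, $X^T Y$ and $Y^T Y$, reusing the hypothetical
    design matrix from the earlier sketch:

        import numpy as np

        x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])    # hypothetical X-values
        Y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])    # hypothetical yields
        X = np.column_stack([np.ones_like(x), x])

        XtX = X.T @ X    # [[n, sum(X)], [sum(X), sum(X^2)]]
        XtY = X.T @ Y    # [sum(Y), sum(X*Y)]
        YtY = Y @ Y      # sum(Y^2), a scalar

        print(XtX, XtY, YtY, sep="\n")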

  • Special types of matrices

    - Symmetric matrix: $A = A^T$, e.g.
      $$A = \begin{pmatrix} 1 & 3 & 5 \\ 3 & 2 & 4 \\ 5 & 4 & 9 \end{pmatrix}$$

    - Diagonal matrix: a square matrix whose off-diagonal elements are all zero.
      $$B = \begin{pmatrix} b_{11} & 0 & 0 \\ 0 & b_{22} & 0 \\ 0 & 0 & b_{33} \end{pmatrix}$$

    - Identity matrix
      $$I = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$
      Facts: for any matrix of appropriate dimension, $AI = A$ and $IB = B$.


  • Special types of matrices

    - Zero vector and unit vector:
      $$\mathbf{0} = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \qquad
      \mathbf{1} = \begin{pmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{pmatrix}$$

    - Inverse of a square matrix: the inverse of a square matrix $A$ is another
      square matrix, denoted by $A^{-1}$, such that $AA^{-1} = A^{-1}A = I$. Since
      $$\begin{pmatrix} -0.1 & 0.4 \\ 0.3 & -0.2 \end{pmatrix}
      \begin{pmatrix} 2 & 4 \\ 3 & 1 \end{pmatrix} = I =
      \begin{pmatrix} 2 & 4 \\ 3 & 1 \end{pmatrix}
      \begin{pmatrix} -0.1 & 0.4 \\ 0.3 & -0.2 \end{pmatrix},$$
      for $A = \begin{pmatrix} 2 & 4 \\ 3 & 1 \end{pmatrix}$ we have
      $$A^{-1} = \begin{pmatrix} -0.1 & 0.4 \\ 0.3 & -0.2 \end{pmatrix}$$


  • Finding the Inverse of a matrix

    - For a $2 \times 2$ matrix we can easily find the inverse: if
      $$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$
      then
      $$A^{-1} = \begin{pmatrix} \dfrac{d}{D} & \dfrac{-b}{D} \\[1.5ex] \dfrac{-c}{D} & \dfrac{a}{D} \end{pmatrix},
      \qquad \text{where } D = ad - bc$$

    - For a high-dimensional matrix, the inverse is not easy to calculate by hand.

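  • Numerical aside: the 2x2 inverse

    The closed-form $2 \times 2$ inverse can be compared against NumPy's library
    routine; a minimal sketch using the example matrix from the previous slide:

        import numpy as np

        A = np.array([[2.0, 4.0],
                      [3.0, 1.0]])

        D = A[0, 0] * A[1, 1] - A[0, 1] * A[1, 0]     # D = ad - bc = -10
        A_inv = np.array([[ A[1, 1], -A[0, 1]],
                          [-A[1, 0],  A[0, 0]]]) / D  # closed-form 2x2 inverse

        print(A_inv)              # [[-0.1  0.4] [ 0.3 -0.2]]
        print(np.linalg.inv(A))   # same result from the library routine
        print(A @ A_inv)          # identity (up to rounding)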

  • Regression Example (continued)

    - To invert the matrix
      $$X^T X = \begin{pmatrix} n & \sum_{i=1}^n X_i \\[1ex] \sum_{i=1}^n X_i & \sum_{i=1}^n X_i^2 \end{pmatrix}$$
      we need its determinant:
      $$D = n \sum_{i=1}^n X_i^2 - \Big(\sum_{i=1}^n X_i\Big)^2
      = n \left[ \sum_{i=1}^n X_i^2 - \frac{\big(\sum_{i=1}^n X_i\big)^2}{n} \right]
      = n \sum_{i=1}^n (X_i - \bar{X})^2$$


  • Regression Example (continued)

    So
    $$(X^T X)^{-1} = \begin{pmatrix}
    \dfrac{\sum_{i=1}^n X_i^2}{n \sum_{i=1}^n (X_i - \bar{X})^2} &
    \dfrac{-\sum_{i=1}^n X_i}{n \sum_{i=1}^n (X_i - \bar{X})^2} \\[2.5ex]
    \dfrac{-\sum_{i=1}^n X_i}{n \sum_{i=1}^n (X_i - \bar{X})^2} &
    \dfrac{n}{n \sum_{i=1}^n (X_i - \bar{X})^2}
    \end{pmatrix}$$


  • Use of Inverse Matrix

    - Suppose we want to solve two equations:
      $$2Y_1 + 4Y_2 = 20$$
      $$3Y_1 + Y_2 = 10$$
      Rewrite the equations in matrix notation:
      $$\begin{pmatrix} 2 & 4 \\ 3 & 1 \end{pmatrix}
      \begin{pmatrix} Y_1 \\ Y_2 \end{pmatrix}
      = \begin{pmatrix} 20 \\ 10 \end{pmatrix}$$
      So the solution to the equations is
      $$\begin{pmatrix} Y_1 \\ Y_2 \end{pmatrix}
      = \begin{pmatrix} 2 & 4 \\ 3 & 1 \end{pmatrix}^{-1}
      \begin{pmatrix} 20 \\ 10 \end{pmatrix}
      = \begin{pmatrix} -0.1 & 0.4 \\ 0.3 & -0.2 \end{pmatrix}
      \begin{pmatrix} 20 \\ 10 \end{pmatrix}
      = \begin{pmatrix} 2 \\ 4 \end{pmatrix}$$

    - Estimating a regression model requires solving linear equations, so the
      inverse matrix is very useful.

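  • Numerical aside: solving the system

    The same system solved with NumPy; np.linalg.solve is the standard way to do
    this without forming the inverse explicitly:

        import numpy as np

        A = np.array([[2.0, 4.0],
                      [3.0, 1.0]])
        b = np.array([20.0, 10.0])

        y = np.linalg.solve(A, b)      # solves A y = b directly
        print(y)                       # [2. 4.]
        print(np.linalg.inv(A) @ b)    # same answer via the explicit inverse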

  • Other basic facts for matrices

    - $A + B = B + A$
    - $C(A + B) = CA + CB$
    - $(A^T)^T = A$
    - $(AB)^T = B^T A^T$
    - $(A^{-1})^{-1} = A$
    - $(AB)^{-1} = B^{-1} A^{-1}$
    - $(A^T)^{-1} = (A^{-1})^T$

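  • Numerical aside: sanity-checking the identities

    These identities are easy to sanity-check numerically on random matrices
    (which are invertible with probability one); a small sketch using
    np.allclose to allow for floating-point rounding:

        import numpy as np

        rng = np.random.default_rng(0)
        A = rng.standard_normal((3, 3))
        B = rng.standard_normal((3, 3))

        print(np.allclose((A @ B).T, B.T @ A.T))                    # (AB)^T = B^T A^T
        print(np.allclose(np.linalg.inv(A @ B),
                          np.linalg.inv(B) @ np.linalg.inv(A)))     # (AB)^-1 = B^-1 A^-1
        print(np.allclose(np.linalg.inv(A.T), np.linalg.inv(A).T))  # (A^T)^-1 = (A^-1)^T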

  • Random vector and matrices

    - Random vector
      $$Y = \begin{pmatrix} Y_1 \\ Y_2 \\ Y_3 \end{pmatrix}$$

    - Expectation of a random vector
      $$E(Y) = \begin{pmatrix} E(Y_1) \\ E(Y_2) \\ E(Y_3) \end{pmatrix}$$

    - For any random vectors
      $$Y = \begin{pmatrix} Y_1 \\ Y_2 \\ Y_3 \end{pmatrix}, \qquad
      Z = \begin{pmatrix} Z_1 \\ Z_2 \\ Z_3 \end{pmatrix}$$
      we have
      $$E(Y + Z) = E(Y) + E(Z)$$


  • Random vector and matrices

    - Variance-covariance matrix of a random vector:
      $$\operatorname{Var}(Y) = E\big[(Y - E(Y))(Y - E(Y))^T\big]
      = \begin{pmatrix}
      \operatorname{Var}(Y_1) & \operatorname{Cov}(Y_1, Y_2) & \operatorname{Cov}(Y_1, Y_3) \\
      \operatorname{Cov}(Y_2, Y_1) & \operatorname{Var}(Y_2) & \operatorname{Cov}(Y_2, Y_3) \\
      \operatorname{Cov}(Y_3, Y_1) & \operatorname{Cov}(Y_3, Y_2) & \operatorname{Var}(Y_3)
      \end{pmatrix}$$

    - In the simple linear regression model the errors are uncorrelated, so
      $\operatorname{Var}(\mathcal{E}) = \sigma^2 I$. For example, with $n = 3$:
      $$\operatorname{Var}(\mathcal{E}) = \begin{pmatrix} \sigma^2 & 0 & 0 \\ 0 & \sigma^2 & 0 \\ 0 & 0 & \sigma^2 \end{pmatrix}$$


  • Some basic facts

    - If a random vector $W$ equals a random vector $Y$ multiplied by a constant
      matrix $A$, i.e. $W = AY$, then
      $$E(W) = A\,E(Y)$$
      $$\operatorname{Var}(W) = \operatorname{Var}(AY) = A \operatorname{Var}(Y) A^T$$

    - If $c$ is a constant vector, then
      $$E(c + AY) = c + A\,E(Y)$$
      and
      $$\operatorname{Var}(c + AY) = \operatorname{Var}(AY) = A \operatorname{Var}(Y) A^T$$


  • An illustration: Example

    - Let $W = AY$ be such that
      $$\begin{pmatrix} W_1 \\ W_2 \end{pmatrix}
      = \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}
      \begin{pmatrix} Y_1 \\ Y_2 \end{pmatrix}$$
      Then
      $$E\left[\begin{pmatrix} W_1 \\ W_2 \end{pmatrix}\right]
      = \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}
      \begin{pmatrix} E(Y_1) \\ E(Y_2) \end{pmatrix}
      = \begin{pmatrix} E(Y_1) - E(Y_2) \\ E(Y_1) + E(Y_2) \end{pmatrix}$$

    - Also
      $$\operatorname{Var}\left[\begin{pmatrix} W_1 \\ W_2 \end{pmatrix}\right]
      = \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}
      \begin{pmatrix} \operatorname{Var}(Y_1) & \operatorname{Cov}(Y_1, Y_2) \\
      \operatorname{Cov}(Y_2, Y_1) & \operatorname{Var}(Y_2) \end{pmatrix}
      \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}$$

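  • Numerical aside: checking Var(AY) = A Var(Y) A^T by simulation

    The identity can be checked by Monte Carlo; a minimal sketch with an assumed
    covariance matrix (the numbers in Sigma are illustrative only):

        import numpy as np

        rng = np.random.default_rng(1)
        A = np.array([[1.0, -1.0],
                      [1.0,  1.0]])
        Sigma = np.array([[2.0, 0.5],     # assumed Var(Y): illustrative values
                          [0.5, 1.0]])

        # Draw many Y ~ N(0, Sigma) and transform each draw by A
        Y = rng.multivariate_normal(mean=[0.0, 0.0], cov=Sigma, size=200_000)
        W = Y @ A.T

        print(np.cov(W, rowvar=False))  # empirical Var(W), close to ...
        print(A @ Sigma @ A.T)          # ... the theoretical A Sigma A^T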

  • An illustration: Example 2

    - In the simple linear regression model $Y = X\beta + \mathcal{E}$, it follows
      from the above (since $X\beta$ is constant) that
      $$\operatorname{Var}(Y) = \operatorname{Var}(X\beta + \mathcal{E})
      = \operatorname{Var}(\mathcal{E})
      = \begin{pmatrix}
      \sigma^2 & 0 & \ldots & 0 \\
      0 & \sigma^2 & \ldots & 0 \\
      \vdots & \vdots & \ddots & \vdots \\
      0 & 0 & \ldots & \sigma^2
      \end{pmatrix} = \sigma^2 I$$


  • Simple linear regression model (matrix version)

    The model
    $$\begin{aligned}
    Y_1 &= \beta_0 + \beta_1 X_1 + \varepsilon_1 \\
    Y_2 &= \beta_0 + \beta_1 X_2 + \varepsilon_2 \\
    &\;\;\vdots \\
    Y_n &= \beta_0 + \beta_1 X_n + \varepsilon_n
    \end{aligned}$$
    with assumptions

    1. $E(\varepsilon_i) = 0$, $i = 1, 2, \ldots, n$
    2. $\operatorname{Var}(\varepsilon_i) = \sigma^2$, $\operatorname{Cov}(\varepsilon_i, \varepsilon_j) = 0$ for all $1 \le i \ne j \le n$
    3. $\varepsilon_i \sim N(0, \sigma^2)$, $i = 1, \ldots, n$, are independent


  • Simple linear regression model (matrix version)

    Recall that the model can be written as
    $$Y = X\beta + \mathcal{E}$$
    Note that
    $$E(\mathcal{E}) = \mathbf{0}, \qquad
    \operatorname{Var}(\mathcal{E}) = \begin{pmatrix}
    \sigma^2 & 0 & \ldots & 0 \\
    0 & \sigma^2 & \ldots & 0 \\
    \vdots & \vdots & \ddots & \vdots \\
    0 & 0 & \ldots & \sigma^2
    \end{pmatrix} = \sigma^2 I$$
    The assumptions can be rewritten as

    1. $E(\mathcal{E}) = \mathbf{0}$
    2. $\operatorname{Var}(\mathcal{E}) = \sigma^2 I$
    3. $\mathcal{E} \sim N(\mathbf{0}, \sigma^2 I)$


  • Simple linear regression model (matrix version)

    Thus the model
    $$Y = X\beta + \mathcal{E}$$
    is such that
    $$E(Y) = X\beta \quad \text{and} \quad \operatorname{Var}(Y) = \sigma^2 I$$
    The model (with assumptions 1, 2, and 3) can also be written as
    $$Y \sim N(X\beta, \sigma^2 I)$$
    or
    $$Y = X\beta + \mathcal{E}, \qquad \mathcal{E} \sim N(\mathbf{0}, \sigma^2 I)$$

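  • Numerical aside: simulating from the model

    The distributional form $Y \sim N(X\beta, \sigma^2 I)$ suggests a direct way
    to simulate data from the model; a sketch with assumed true parameters
    (beta0 = 1, beta1 = 2, sigma = 0.5 are illustrative choices, not from the
    lecture):

        import numpy as np

        rng = np.random.default_rng(2)

        n = 50
        x = rng.uniform(0, 10, size=n)
        X = np.column_stack([np.ones(n), x])
        beta = np.array([1.0, 2.0])           # assumed true (beta0, beta1)
        sigma = 0.5                           # assumed error standard deviation

        eps = rng.normal(0.0, sigma, size=n)  # E ~ N(0, sigma^2 I)
        Y = X @ beta + eps                    # Y = X beta + E

        print(Y[:5])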

  • Linear Dependence and Rank of Matrix

    Consider the following matrix:
    $$A = \begin{pmatrix} 1 & 2 & 5 & 1 \\ 2 & 2 & 10 & 6 \\ 3 & 4 & 15 & 1 \end{pmatrix}$$
    Note that the third column vector is a multiple of the first column vector:
    $$\begin{pmatrix} 5 \\ 10 \\ 15 \end{pmatrix} = 5 \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}$$
    We say that the columns of $A$ are linearly dependent. If no vector in the set
    can be so expressed, we define the set of vectors to be linearly independent.


  • Linear Dependence and Rank of Matrix (Cont.)

    Definition
    When $c$ scalars $k_1, \ldots, k_c$, not all zero, can be found such that
    $$k_1 C_1 + k_2 C_2 + \ldots + k_c C_c = \mathbf{0},$$
    where $\mathbf{0}$ denotes the zero column vector, the $c$ column vectors are
    linearly dependent. If the only set of scalars for which the equality holds is
    $k_1 = 0, \ldots, k_c = 0$, the set of $c$ column vectors is linearly independent.

    To illustrate for our example, $k_1 = 5$, $k_2 = 0$, $k_3 = -1$, $k_4 = 0$ leads to
    $$5 \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}
    + 0 \begin{pmatrix} 2 \\ 2 \\ 4 \end{pmatrix}
    - 1 \begin{pmatrix} 5 \\ 10 \\ 15 \end{pmatrix}
    + 0 \begin{pmatrix} 1 \\ 6 \\ 1 \end{pmatrix}
    = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$$
    Hence, the column vectors are linearly dependent. Note that some of the $k_j$
    equal zero here. For linear dependence, it is only required that not all
    $k_j$ be zero.


  • Rank of Matrix

    Definition (Rank of Matrix)
    The rank of a matrix is defined to be the maximum number of linearly
    independent columns in the matrix.

    We know that the rank of $A$ in our earlier example cannot be 4, since the
    four columns are linearly dependent. We can, however, find three columns
    (1, 2, and 4) which are linearly independent: there are no scalars
    $k_1, k_2, k_4$ such that $k_1 C_1 + k_2 C_2 + k_4 C_4 = \mathbf{0}$ other
    than $k_1 = k_2 = k_4 = 0$. Thus, the rank of $A$ in our example is 3.

    The rank of a matrix is unique and can equivalently be defined as the maximum
    number of linearly independent rows. It follows that the rank of an
    $r \times c$ matrix cannot exceed $\min(r, c)$, the minimum of the two values
    $r$ and $c$.

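  • Numerical aside: confirming the rank

    NumPy can confirm the rank of the example matrix; np.linalg.matrix_rank uses
    an SVD-based test internally:

        import numpy as np

        A = np.array([[1, 2,  5, 1],
                      [2, 2, 10, 6],
                      [3, 4, 15, 1]])

        print(np.linalg.matrix_rank(A))                # 3: columns are dependent
        print(np.linalg.matrix_rank(A[:, [0, 1, 3]]))  # 3: columns 1, 2, 4 are independent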

  • Least Squares Estimation of Regression Parameters

    As we have shown, the normal equations are
    $$\sum_{i=1}^n e_i = 0 \;\equiv\; n\hat{\beta}_0 + \hat{\beta}_1 \sum_{i=1}^n X_i = \sum_{i=1}^n Y_i$$
    $$\sum_{i=1}^n X_i e_i = 0 \;\equiv\; \hat{\beta}_0 \sum_{i=1}^n X_i + \hat{\beta}_1 \sum_{i=1}^n X_i^2 = \sum_{i=1}^n X_i Y_i$$
    In matrix notation, the normal equations are
    $$\underbrace{X^T X}_{2 \times 2} \; \underbrace{\hat{\beta}}_{2 \times 1} = \underbrace{X^T Y}_{2 \times 1} \qquad (2)$$
    where $\hat{\beta}$ is the vector of the least squares regression coefficients:
    $$\hat{\beta} = \begin{pmatrix} \hat{\beta}_0 \\ \hat{\beta}_1 \end{pmatrix}$$


  • Least Squares Estimation of Regression Parameters

    To see this, recall that we obtained
    $$X^T X = \begin{pmatrix} n & \sum_{i=1}^n X_i \\[1ex] \sum_{i=1}^n X_i & \sum_{i=1}^n X_i^2 \end{pmatrix}, \qquad
    X^T Y = \begin{pmatrix} \sum_{i=1}^n Y_i \\[1ex] \sum_{i=1}^n X_i Y_i \end{pmatrix}$$
    Equation (2) thus states
    $$\begin{pmatrix} n & \sum_{i=1}^n X_i \\[1ex] \sum_{i=1}^n X_i & \sum_{i=1}^n X_i^2 \end{pmatrix}
    \begin{pmatrix} \hat{\beta}_0 \\ \hat{\beta}_1 \end{pmatrix}
    = \begin{pmatrix} \sum_{i=1}^n Y_i \\[1ex] \sum_{i=1}^n X_i Y_i \end{pmatrix}$$
    or equivalently
    $$\begin{pmatrix} n\hat{\beta}_0 + \hat{\beta}_1 \sum_{i=1}^n X_i \\[1ex]
    \hat{\beta}_0 \sum_{i=1}^n X_i + \hat{\beta}_1 \sum_{i=1}^n X_i^2 \end{pmatrix}
    = \begin{pmatrix} \sum_{i=1}^n Y_i \\[1ex] \sum_{i=1}^n X_i Y_i \end{pmatrix}$$
    These are precisely the normal equations we derived before.


  • Further, let $\bar{X} = \frac{1}{n}\sum_{i=1}^n X_i$, $\bar{Y} = \frac{1}{n}\sum_{i=1}^n Y_i$, and

    $$s_{xx} = \sum_{i=1}^n (X_i - \bar{X})^2 = \sum_{i=1}^n X_i^2 - n\bar{X}^2
    = \sum_{i=1}^n X_i^2 - \frac{\big(\sum_{i=1}^n X_i\big)^2}{n}$$

    $$s_{xy} = \sum_{i=1}^n (X_i - \bar{X})(Y_i - \bar{Y}) = \sum_{i=1}^n X_i Y_i - n\bar{X}\bar{Y}
    = \sum_{i=1}^n X_i Y_i - \frac{\big(\sum_{i=1}^n X_i\big)\big(\sum_{i=1}^n Y_i\big)}{n}.$$

    - $s_{xx}$ is the corrected sum of squares of the $X$-values.
    - $s_{xy}$ is the corrected sum of products of the $X$- and $Y$-values.

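  • Numerical aside: the corrected sums

    A quick NumPy sketch of these corrected sums, continuing with the
    hypothetical data from the earlier asides, checking that the definitional and
    computational forms agree:

        import numpy as np

        x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
        Y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
        n = len(x)

        sxx = np.sum((x - x.mean()) ** 2)
        sxy = np.sum((x - x.mean()) * (Y - Y.mean()))

        # Computational (uncorrected-sums) forms:
        print(np.isclose(sxx, np.sum(x**2) - np.sum(x)**2 / n))          # True
        print(np.isclose(sxy, np.sum(x*Y) - np.sum(x) * np.sum(Y) / n))  # True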

  • With this notation we can write

    $$X^T X = \begin{pmatrix} n & \sum_{i=1}^n X_i \\[1ex] \sum_{i=1}^n X_i & \sum_{i=1}^n X_i^2 \end{pmatrix}
    = \begin{pmatrix} n & n\bar{X} \\ n\bar{X} & s_{xx} + n\bar{X}^2 \end{pmatrix}$$

    $$X^T Y = \begin{pmatrix} \sum_{i=1}^n Y_i \\[1ex] \sum_{i=1}^n X_i Y_i \end{pmatrix}
    = \begin{pmatrix} n\bar{Y} \\ s_{xy} + n\bar{X}\bar{Y} \end{pmatrix}$$

    Hence

    $$(X^T X)^{-1} = \frac{1}{n s_{xx}}
    \begin{pmatrix} s_{xx} + n\bar{X}^2 & -n\bar{X} \\ -n\bar{X} & n \end{pmatrix}
    = \begin{pmatrix} \dfrac{1}{n} + \dfrac{\bar{X}^2}{s_{xx}} & -\dfrac{\bar{X}}{s_{xx}} \\[2ex]
    -\dfrac{\bar{X}}{s_{xx}} & \dfrac{1}{s_{xx}} \end{pmatrix}$$

    Dr. Bisher M. Iqelan (Department of Math.)3: Simple Linear Regression (Matrix Version) 2010-2011, Semester 2 35 / 77


  • Then, assuming that rank(X) = 2 (i.e., that the Xi’s are not all equal), we get

        β̂ = ( β̂0 ) = (XᵀX)⁻¹XᵀY                                       (3)
             ( β̂1 )

           = ( 1/n + X̄²/sxx   −X̄/sxx ) ( nȲ        ) = ( Ȳ − (sxy/sxx)X̄ )
             ( −X̄/sxx         1/sxx  ) ( sxy + nX̄Ȳ )   ( sxy/sxx        )

    i.e.,

        β̂0 = Ȳ − β̂1X̄   and   β̂1 = sxy/sxx.
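    In code, the closed-form estimates are two more lines on top of the sxx/sxy sketch above:

        import numpy as np

        X = np.array([4.0, 1.0, 2.0, 3.0, 3.0, 4.0])
        Y = np.array([16.0, 5.0, 10.0, 15.0, 13.0, 22.0])

        sxx = np.sum((X - X.mean()) ** 2)
        sxy = np.sum((X - X.mean()) * (Y - Y.mean()))

        b1 = sxy / sxx                  # beta1-hat = sxy / sxx
        b0 = Y.mean() - b1 * X.mean()   # beta0-hat = Ybar - beta1-hat * Xbar
        print(round(b0, 3), round(b1, 3))  # 0.439 4.61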

  • Estimated Regression Coefficients: An example

    Example 1: In some study, consider the following information:

        Y = ( 16 )        X = ( 1  4 )
            (  5 )            ( 1  1 )
            ( 10 )            ( 1  2 )
            ( 15 )            ( 1  3 )
            ( 13 )            ( 1  3 )
            ( 22 )            ( 1  4 )

    Now, let us do the required calculations:

        XᵀX = ( 6   17 ) ;   XᵀY = (  81 )
              ( 17  55 )           ( 261 )

    Finally,

        β̂ = ( 6   17 )⁻¹ (  81 ) = ( 0.439 )
            ( 17  55 )   ( 261 )   ( 4.610 )

    Hence, β̂0 = 0.439 and β̂1 = 4.610.

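    The same numbers drop out of the matrix route. A minimal sketch that builds the design matrix explicitly; np.linalg.solve is used instead of forming the inverse, a standard numerical precaution:

        import numpy as np

        x = np.array([4.0, 1.0, 2.0, 3.0, 3.0, 4.0])
        Y = np.array([16.0, 5.0, 10.0, 15.0, 13.0, 22.0])
        X = np.column_stack([np.ones_like(x), x])  # column of 1's, then the x-column

        XtX = X.T @ X      # [[ 6, 17], [17, 55]]
        XtY = X.T @ Y      # [ 81, 261]
        print(np.linalg.solve(XtX, XtY).round(3))  # [0.439 4.61]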

  • Estimated Regression Coefficients: An example

    Example 2: For the ozone data used before,

        Y = ( 242 )        X = ( 1  0.02 )
            ( 237 )            ( 1  0.07 )
            ( 231 )            ( 1  0.11 )
            ( 201 )            ( 1  0.15 )

    give

        XᵀX = ( 4       0.3500 ) ,   XᵀY = ( 911   )
              ( 0.3500  0.0399 )           ( 76.99 )

    and, then,

        (XᵀX)⁻¹ = (  1.07547    −9.43396 )
                  ( −9.43396   107.81671 )

    Hence, the estimates of the regression coefficients are

        β̂ = (XᵀX)⁻¹XᵀY = (  253.434 )
                          ( −293.531 )

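    A corresponding sketch for the ozone data, this time forming (XᵀX)⁻¹ explicitly so its entries can be compared with the slide:

        import numpy as np

        x = np.array([0.02, 0.07, 0.11, 0.15])
        Y = np.array([242.0, 237.0, 231.0, 201.0])
        X = np.column_stack([np.ones_like(x), x])

        XtX_inv = np.linalg.inv(X.T @ X)
        print(XtX_inv.round(5))              # [[ 1.07547  -9.43396], [-9.43396 107.81671]]
        print((XtX_inv @ X.T @ Y).round(3))  # [ 253.434 -293.531]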

  • Properties of β̂

    β̂ has the following properties when XᵀX is invertible.

    I E(β̂) = β (i.e., β̂ is unbiased).

    I Var(β̂) = σ²(XᵀX)⁻¹

    For the first,

        E(β̂) = E((XᵀX)⁻¹XᵀY) = (XᵀX)⁻¹Xᵀ E(Y)
              = (XᵀX)⁻¹XᵀXβ = (XᵀX)⁻¹(XᵀX)β = β.

    For the second,

        Var(β̂) = Var((XᵀX)⁻¹XᵀY)
                = (XᵀX)⁻¹Xᵀ Var(Y) ((XᵀX)⁻¹Xᵀ)ᵀ
                = (XᵀX)⁻¹Xᵀ (σ²I) ((XᵀX)⁻¹Xᵀ)ᵀ
                = σ² (XᵀX)⁻¹Xᵀ ((XᵀX)⁻¹Xᵀ)ᵀ
                = σ² (XᵀX)⁻¹XᵀX(XᵀX)⁻¹ = σ²(XᵀX)⁻¹.

    Under normality, β̂ ∼ N₂(β, σ²(XᵀX)⁻¹).

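    Both properties can be observed empirically. In the simulation sketch below, the Example 1 design is held fixed and Y is regenerated many times; the true values β = (1, 2)ᵀ and σ = 0.5 are arbitrary illustrative choices:

        import numpy as np

        rng = np.random.default_rng(0)
        x = np.array([4.0, 1.0, 2.0, 3.0, 3.0, 4.0])  # fixed design (Example 1)
        X = np.column_stack([np.ones_like(x), x])
        beta, sigma = np.array([1.0, 2.0]), 0.5       # arbitrary "true" values

        est = []
        for _ in range(20000):
            Y = X @ beta + sigma * rng.standard_normal(len(x))  # Y = X beta + eps
            est.append(np.linalg.solve(X.T @ X, X.T @ Y))       # beta-hat
        est = np.asarray(est)

        print(est.mean(axis=0))                   # close to beta: unbiasedness
        print(np.cov(est.T))                      # close to the matrix below
        print(sigma**2 * np.linalg.inv(X.T @ X))  # sigma^2 (X^T X)^{-1}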

  • Quiz

    Use matrix notation for the simple linear model to find Var(β̂0), Var(β̂1) and Cov(β̂0, β̂1).

    Solution: We have shown that

        Var(β̂) = σ²(XᵀX)⁻¹ = σ² ( 1/n + X̄²/sxx   −X̄/sxx )
                                 ( −X̄/sxx         1/sxx  )

    Hence,

        Var(β̂0) = σ² (1/n + X̄²/sxx)

        Var(β̂1) = σ² · (1/sxx) = σ²/sxx

        Cov(β̂0, β̂1) = −(X̄/sxx) σ²

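    As a numerical check, these three expressions taken per unit σ² (i.e., with σ² = 1) should reproduce the entries of (XᵀX)⁻¹; a brief sketch with the Example 1 design:

        import numpy as np

        x = np.array([4.0, 1.0, 2.0, 3.0, 3.0, 4.0])
        X = np.column_stack([np.ones_like(x), x])
        n, sxx = len(x), np.sum((x - x.mean()) ** 2)

        # Var(b0), Var(b1), Cov(b0, b1), each per unit sigma^2...
        print(1/n + x.mean()**2 / sxx, 1/sxx, -x.mean()/sxx)
        # ...agree with the corresponding entries of (X^T X)^{-1}.
        print(np.linalg.inv(X.T @ X))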

  • The Ŷ and Residuals Vectors

    Let the vector of the fitted values Ŷi be denoted by Ŷ:

        Ŷ = ( Ŷ1, Ŷ2, …, Ŷn )ᵀ

    In matrix notation, we then have

        Ŷ   =   X   β̂                                                 (4)
        n×1    n×2  2×1

    because

        Ŷ = ( Ŷ1 )   ( 1  X1 )           ( β̂0 + β̂1X1 )
            ( Ŷ2 ) = ( 1  X2 ) ( β̂0 ) = ( β̂0 + β̂1X2 )
            (  ⋮ )   ( ⋮   ⋮ ) ( β̂1 )   (      ⋮     )
            ( Ŷn )   ( 1  Xn )           ( β̂0 + β̂1Xn )

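    In code, (4) is a single matrix-vector product; a minimal sketch using the ozone fit of Example 2:

        import numpy as np

        x = np.array([0.02, 0.07, 0.11, 0.15])
        Y = np.array([242.0, 237.0, 231.0, 201.0])
        X = np.column_stack([np.ones_like(x), x])

        beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
        Y_hat = X @ beta_hat   # fitted values, equation (4)
        resid = Y - Y_hat      # residual vector
        print(Y_hat.round(3), resid.round(3))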

  • Hat Matrix

    We can express the matrix result for Ŷ in (4) as follows by using the expression (3) for β̂:

        Ŷ = Xβ̂ = X(XᵀX)⁻¹XᵀY

    or, equivalently,

        Ŷ   =   H   Y                                                 (5)
        n×1    n×n  n×1

    where

        H = X(XᵀX)⁻¹Xᵀ

    i.e., H puts a hat on Y!

    The square n×n matrix H is called the hat matrix and plays an important role in the theory of linear models. It is clear that H involves only the observations on the predictor variable X. We see from (5) that the fitted values Ŷi can be expressed as linear combinations of the response variable observations Yi, with the coefficients being elements of the matrix H.


  • Hat Matrix: Example

    For the ozone data used in Example 2,

        H = X(XᵀX)⁻¹Xᵀ

          = ( 1  0.02 )   (  1.0755   −9.4340  )   ( 1     1     1     1    )
            ( 1  0.07 ) · ( −9.4340   107.8167 ) · ( 0.02  0.07  0.11  0.15 )
            ( 1  0.11 )
            ( 1  0.15 )

          = (  0.741240   0.377358   0.086253  −0.204852 )
            (  0.377358   0.283019   0.207547   0.132075 )
            (  0.086253   0.207547   0.304582   0.401617 )
            ( −0.204852   0.132075   0.401617   0.671159 )

    Thus, for example,

        Ŷ1 = 0.741Y1 + 0.377Y2 + 0.086Y3 − 0.205Y4.

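    The matrix H above is reproduced by a few lines of NumPy; the sketch also confirms the displayed linear combination for Ŷ1:

        import numpy as np

        x = np.array([0.02, 0.07, 0.11, 0.15])
        Y = np.array([242.0, 237.0, 231.0, 201.0])
        X = np.column_stack([np.ones_like(x), x])

        H = X @ np.linalg.inv(X.T @ X) @ X.T   # hat matrix: depends on X only
        print(H.round(6))                      # matches the 4x4 matrix above

        # First fitted value as a linear combination of the observations,
        # and the same value via Y-hat = X beta-hat:
        print(H[0] @ Y, (X @ np.linalg.solve(X.T @ X, X.T @ Y))[0])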

  • Properties of the Hat matrix

    I H is symmetric and idempotent (the latter means H2 = H).

    I I −H is symmetric and idempotent.

    I HX = X

    I (I −H)X = 0


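    All four properties are easy to confirm numerically, e.g. for the ozone hat matrix; a quick sketch:

        import numpy as np

        x = np.array([0.02, 0.07, 0.11, 0.15])
        X = np.column_stack([np.ones_like(x), x])
        H = X @ np.linalg.inv(X.T @ X) @ X.T
        I = np.eye(len(x))

        print(np.allclose(H, H.T), np.allclose(H @ H, H))    # H symmetric, idempotent
        M = I - H
        print(np.allclose(M, M.T), np.allclose(M @ M, M))    # I - H symmetric, idempotent
        print(np.allclose(H @ X, X), np.allclose(M @ X, 0))  # HX = X, (I - H)X = 0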

  • Properties of the fitted vector

    Vector of fitted values: Ŷ = Xβ̂.

    I E(Ŷ) = Xβ

    I Var(Ŷ) = σ²H

    For the expectation,

        E(Ŷ) = E(HY) = H E(Y) = HXβ = Xβ.

    The variance-covariance matrix of Ŷ can be derived using either the relationship Ŷ = Xβ̂ or Ŷ = HY. Applying the rules for variances of linear functions to the first relationship gives

        Var(Ŷ) = X [Var(β̂)] Xᵀ
                = X [σ²(XᵀX)⁻¹] Xᵀ
                = X(XᵀX)⁻¹Xᵀ σ²
                = Hσ²

    The derivation using the second relationship gives

        Var(Ŷ) = H [Var(Y)] Hᵀ = HHᵀσ² = Hσ².

    When E is normally distributed, Ŷ ∼ Nₙ(Xβ, Hσ²).

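    The result Var(Ŷ) = σ²H can likewise be seen by simulation; in the sketch below, the true β and σ are arbitrary illustrative values:

        import numpy as np

        rng = np.random.default_rng(1)
        x = np.array([0.02, 0.07, 0.11, 0.15])
        X = np.column_stack([np.ones_like(x), x])
        H = X @ np.linalg.inv(X.T @ X) @ X.T
        beta, sigma = np.array([250.0, -300.0]), 5.0  # arbitrary "true" values

        fits = []
        for _ in range(50000):
            Y = X @ beta + sigma * rng.standard_normal(len(x))
            fits.append(H @ Y)             # Y-hat = H Y
        fits = np.asarray(fits)

        print(np.cov(fits.T).round(2))     # close to sigma^2 * H
        print((sigma**2 * H).round(2))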