-
Regression Analysis
Chapter 3: Simple Linear Regression (Matrix Version)

Dr. Bisher M. Iqelan ([email protected])
Department of Mathematics, The Islamic University of Gaza
2010-2011, Semester 2
Dr. Bisher M. Iqelan (Department of Math.), 3: Simple Linear Regression (Matrix Version), 2010-2011, Semester 2, 1 / 77
-
Overview

The yield $Y_i$ of a process in which amount $X_i$ of a material is used is recorded on $n$ different occasions. It is assumed that the mean yield depends linearly on the amount of material, so that

\[
Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i, \quad i = 1, 2, \dots, n, \tag{1}
\]

where $\beta_0$ and $\beta_1$ are unknown quantities, $E(\varepsilon_i) = 0$ and $\mathrm{Var}(\varepsilon_i) = \sigma^2$ for all $i$, and the $Y_i$'s are uncorrelated.

The system of equations (1) can be written in matrix form as

\[
Y = X\beta + \mathcal{E},
\]

where

\[
Y = \begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{pmatrix}, \quad
X = \begin{pmatrix} 1 & X_1 \\ 1 & X_2 \\ \vdots & \vdots \\ 1 & X_n \end{pmatrix}, \quad
\beta = \begin{pmatrix} \beta_0 \\ \beta_1 \end{pmatrix}, \quad
\mathcal{E} = \begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{pmatrix}.
\]
-
Review of Matrices

- A matrix: a rectangular array of elements arranged in rows and columns.
- An example, with 3 rows and 2 columns:

            Column 1   Column 2
    Row 1        10         20
    Row 2       100        200
    Row 3      1000       2000
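As a quick aside (not in the notes), a matrix like the example above can be stored in Python as a list of rows; with zero-based indexing, element $a_{ij}$ is `A[i-1][j-1]`:

```python
# The 3x2 example matrix from the slide, stored as a list of rows
# (an illustrative representation, not something the notes prescribe).
A = [[10, 20],
     [100, 200],
     [1000, 2000]]

rows = len(A)      # number of rows r
cols = len(A[0])   # number of columns c
print(rows, cols)  # 3 2
print(A[1][0])     # element in row 2, column 1: 100
```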
-
A matrix with r rows and c columns

\[
A = \begin{pmatrix}
a_{11} & a_{12} & \dots & a_{1j} & \dots & a_{1c} \\
a_{21} & a_{22} & \dots & a_{2j} & \dots & a_{2c} \\
\vdots & \vdots &       & \vdots &       & \vdots \\
a_{i1} & a_{i2} & \dots & a_{ij} & \dots & a_{ic} \\
\vdots & \vdots &       & \vdots &       & \vdots \\
a_{r1} & a_{r2} & \dots & a_{rj} & \dots & a_{rc}
\end{pmatrix}
\]

- In short notation we write $A = [a_{ij}]$, $i = 1, \dots, r$; $j = 1, \dots, c$.
- $r$ and $c$ together ($r \times c$) are called the dimension of the matrix.
-
Square matrix and Vector

- Square matrix: equal number of rows and columns, for example

\[
\begin{pmatrix} 1 & 7 \\ 5 & -2 \end{pmatrix}, \quad
\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}
\]

- Vector: a matrix with only one row or one column, for example

\[
A = \begin{pmatrix} 3 & -2 & 5 \end{pmatrix}, \quad
B = \begin{pmatrix} 2 \\ 6 \\ 10 \end{pmatrix}
\]
-
Transpose of a matrix and equality of matrices

- The transpose of a matrix $A$ is another matrix, denoted by $A^T$, obtained by interchanging rows and columns:

\[
A = \begin{pmatrix} 2 & 5 \\ -4 & 0 \\ 6 & 1 \end{pmatrix}, \quad
A^T = \begin{pmatrix} 2 & -4 & 6 \\ 5 & 0 & 1 \end{pmatrix}
\]

- Two matrices are equal if they have the same dimension and all the corresponding elements are equal. Suppose, for example,

\[
A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{pmatrix}, \quad
B = \begin{pmatrix} 12 & 50 \\ -3 & 10 \\ 16 & 21 \end{pmatrix}
\]

If $A = B$, then $a_{11} = 12$, $a_{12} = 50$, and so on.
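The transpose operation is easy to sketch in Python (an illustrative helper, not part of the notes): rows become columns.

```python
def transpose(M):
    """Return M^T: element (i, j) of M becomes element (j, i)."""
    return [[M[i][j] for i in range(len(M))] for j in range(len(M[0]))]

# The 3x2 example from the slide:
A = [[2, 5],
     [-4, 0],
     [6, 1]]
print(transpose(A))  # [[2, -4, 6], [5, 0, 1]]
```

Applying `transpose` twice returns the original matrix, matching the fact $(A^T)^T = A$ stated later in the notes.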
-
Matrix addition and subtraction

- Adding or subtracting two matrices requires that they have the same dimension.

\[
A = \begin{pmatrix} 1 & 3 & 5 \\ 2 & 4 & 6 \end{pmatrix}, \quad
B = \begin{pmatrix} 2 & 5 & 1 \\ 3 & 6 & 7 \end{pmatrix}
\]

\[
A + B = \begin{pmatrix} 1+2 & 3+5 & 5+1 \\ 2+3 & 4+6 & 6+7 \end{pmatrix}
      = \begin{pmatrix} 3 & 8 & 6 \\ 5 & 10 & 13 \end{pmatrix}
\]

\[
A - B = \begin{pmatrix} 1-2 & 3-5 & 5-1 \\ 2-3 & 4-6 & 6-7 \end{pmatrix}
      = \begin{pmatrix} -1 & -2 & 4 \\ -1 & -2 & -1 \end{pmatrix}
\]
-
Matrix multiplication

- Multiplication of a matrix by a scalar:

\[
A = \begin{pmatrix} 5 & 2 & 5 \\ 3 & 4 & 0 \\ 1 & 6 & 7 \end{pmatrix}, \quad
4A = A \cdot 4 = \begin{pmatrix} 20 & 8 & 20 \\ 12 & 16 & 0 \\ 4 & 24 & 28 \end{pmatrix}
\]

- Multiplication of a matrix by a matrix: if $A$ has dimension $r \times c$ and $B$ has dimension $c \times s$, the product $AB$ is a matrix of dimension $r \times s$ whose element in the $i$th row and $j$th column is

\[
\sum_{k=1}^{c} a_{ik} b_{kj}.
\]
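The row-by-column formula above translates directly into Python (an illustrative sketch, not from the notes):

```python
def matmul(A, B):
    """Multiply an r x c matrix A by a c x s matrix B.

    Entry (i, j) of the product is sum_k A[i][k] * B[k][j],
    exactly the formula on the slide.
    """
    r, c, s = len(A), len(B), len(B[0])
    assert all(len(row) == c for row in A), "inner dimensions must agree"
    return [[sum(A[i][k] * B[k][j] for k in range(c)) for j in range(s)]
            for i in range(r)]

# The data of Example 1 on the examples slide:
A = [[2, 4, 0],
     [3, 1, 5]]
B = [[1, 2],
     [1, 0],
     [0, 3]]
print(matmul(A, B))  # [[6, 4], [4, 21]]
```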
-
Matrix multiplication: Examples

- Example 1:

\[
\begin{pmatrix} 2 & 4 & 0 \\ 3 & 1 & 5 \end{pmatrix}
\begin{pmatrix} 1 & 2 \\ 1 & 0 \\ 0 & 3 \end{pmatrix}
= \begin{pmatrix}
2 \cdot 1 + 4 \cdot 1 + 0 \cdot 0 & 2 \cdot 2 + 4 \cdot 0 + 0 \cdot 3 \\
3 \cdot 1 + 1 \cdot 1 + 5 \cdot 0 & 3 \cdot 2 + 1 \cdot 0 + 5 \cdot 3
\end{pmatrix}
= \begin{pmatrix} 6 & 4 \\ 4 & 21 \end{pmatrix}
\]

- Example 2:

\[
\begin{pmatrix} 5 & 3 \\ 2 & 6 \end{pmatrix}
\begin{pmatrix} a_1 \\ a_2 \end{pmatrix}
= \begin{pmatrix} 5a_1 + 3a_2 \\ 2a_1 + 6a_2 \end{pmatrix}
\]
-
Regression Examples

- One can easily check that

\[
\begin{pmatrix} 1 & X_1 \\ 1 & X_2 \\ \vdots & \vdots \\ 1 & X_n \end{pmatrix}
\begin{pmatrix} \beta_0 \\ \beta_1 \end{pmatrix}
= \begin{pmatrix} \beta_0 + \beta_1 X_1 \\ \beta_0 + \beta_1 X_2 \\ \vdots \\ \beta_0 + \beta_1 X_n \end{pmatrix}
\]

- Now let

\[
Y = \begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{pmatrix}, \quad
X = \begin{pmatrix} 1 & X_1 \\ 1 & X_2 \\ \vdots & \vdots \\ 1 & X_n \end{pmatrix}, \quad
\beta = \begin{pmatrix} \beta_0 \\ \beta_1 \end{pmatrix}, \quad
\mathcal{E} = \begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{pmatrix}
\]
-
Regression Models

The regression model

\[
\begin{aligned}
Y_1 &= \beta_0 + \beta_1 X_1 + \varepsilon_1 \\
Y_2 &= \beta_0 + \beta_1 X_2 + \varepsilon_2 \\
&\;\;\vdots \\
Y_n &= \beta_0 + \beta_1 X_n + \varepsilon_n
\end{aligned}
\]

can be written as

\[
Y = X\beta + \mathcal{E}
\]
-
Regression Models: Important Calculations

- Other calculations:

\[
X^T X =
\begin{pmatrix} 1 & 1 & \dots & 1 \\ X_1 & X_2 & \dots & X_n \end{pmatrix}
\begin{pmatrix} 1 & X_1 \\ 1 & X_2 \\ \vdots & \vdots \\ 1 & X_n \end{pmatrix}
= \begin{pmatrix}
n & \sum_{i=1}^{n} X_i \\
\sum_{i=1}^{n} X_i & \sum_{i=1}^{n} X_i^2
\end{pmatrix}
\]
-
Regression Models: Important Calculations (Cont.)

\[
X^T Y =
\begin{pmatrix} 1 & 1 & \dots & 1 \\ X_1 & X_2 & \dots & X_n \end{pmatrix}
\begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{pmatrix}
= \begin{pmatrix} \sum_{i=1}^{n} Y_i \\ \sum_{i=1}^{n} X_i Y_i \end{pmatrix}
\]

and

\[
Y^T Y =
\begin{pmatrix} Y_1 & Y_2 & \dots & Y_n \end{pmatrix}
\begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{pmatrix}
= \sum_{i=1}^{n} Y_i^2
\]
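Since every entry of $X^T X$, $X^T Y$, and $Y^T Y$ is just a sum, they are cheap to compute directly. The data values below are hypothetical, chosen only to illustrate the formulas:

```python
# Hypothetical data (not from the notes), used to evaluate the
# entries of X^T X, X^T Y and Y^T Y via the sums on the slides.
X_vals = [1.0, 2.0, 3.0, 4.0]
Y_vals = [2.1, 3.9, 6.2, 7.8]
n = len(X_vals)

XtX = [[n,           sum(X_vals)],
       [sum(X_vals), sum(x * x for x in X_vals)]]
XtY = [sum(Y_vals), sum(x * y for x, y in zip(X_vals, Y_vals))]
YtY = sum(y * y for y in Y_vals)

print(XtX)  # [[4, 10.0], [10.0, 30.0]]
```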
-
Special types of matrices

- Symmetric matrix: $A = A^T$, for example

\[
A = \begin{pmatrix} 1 & 3 & 5 \\ 3 & 2 & 4 \\ 5 & 4 & 9 \end{pmatrix}
\]

- Diagonal matrix: a square matrix whose off-diagonal elements are all zero.

\[
B = \begin{pmatrix} b_{11} & 0 & 0 \\ 0 & b_{22} & 0 \\ 0 & 0 & b_{33} \end{pmatrix}
\]

- Identity matrix:

\[
I = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\]

Facts: for any matrices of appropriate dimension, $AI = A$ and $IB = B$.
-
Special types of matrices

- Zero vector and unit vector:

\[
\mathbf{0} = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \quad
\mathbf{1} = \begin{pmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{pmatrix}
\]

- Inverse of a square matrix: the inverse of a square matrix $A$ is another square matrix, denoted by $A^{-1}$, such that $AA^{-1} = A^{-1}A = I$. Since

\[
\begin{pmatrix} -0.1 & 0.4 \\ 0.3 & -0.2 \end{pmatrix}
\begin{pmatrix} 2 & 4 \\ 3 & 1 \end{pmatrix}
= I =
\begin{pmatrix} 2 & 4 \\ 3 & 1 \end{pmatrix}
\begin{pmatrix} -0.1 & 0.4 \\ 0.3 & -0.2 \end{pmatrix}
\]

for $A = \begin{pmatrix} 2 & 4 \\ 3 & 1 \end{pmatrix}$ we have

\[
A^{-1} = \begin{pmatrix} -0.1 & 0.4 \\ 0.3 & -0.2 \end{pmatrix}
\]
-
Finding the Inverse of a matrix

- For a $2 \times 2$ matrix we can easily find the inverse: if

\[
A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}
\]

then

\[
A^{-1} = \begin{pmatrix} d/D & -b/D \\ -c/D & a/D \end{pmatrix},
\quad \text{where } D = ad - bc
\]

- For a matrix of higher dimension, the inverse is not easy to calculate by hand.
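The $2 \times 2$ inverse formula translates into a few lines of code. This is a minimal sketch; the function name and the singularity check are my additions:

```python
def inv2x2(A):
    """Inverse of a 2x2 matrix [[a, b], [c, d]] via D = ad - bc."""
    (a, b), (c, d) = A
    D = a * d - b * c
    if D == 0:
        raise ValueError("matrix is singular (D = 0)")
    return [[d / D, -b / D],
            [-c / D, a / D]]

# The matrix from the inverse example on the previous slide:
A = [[2, 4],
     [3, 1]]
print(inv2x2(A))  # [[-0.1, 0.4], [0.3, -0.2]]
```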
-
Regression Example (continued)

- To find the inverse of the matrix

\[
X^T X = \begin{pmatrix}
n & \sum_{i=1}^{n} X_i \\
\sum_{i=1}^{n} X_i & \sum_{i=1}^{n} X_i^2
\end{pmatrix}
\]

compute its determinant:

\[
D = n \sum_{i=1}^{n} X_i^2 - \Big(\sum_{i=1}^{n} X_i\Big)^2
  = n \left[ \sum_{i=1}^{n} X_i^2 - \frac{\big(\sum_{i=1}^{n} X_i\big)^2}{n} \right]
  = n \sum_{i=1}^{n} (X_i - \bar{X})^2
\]
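The determinant identity $n \sum X_i^2 - (\sum X_i)^2 = n \sum (X_i - \bar{X})^2$ can be spot-checked numerically; the data values below are hypothetical:

```python
# Numerical check of the determinant identity on illustrative data
# (the values are my own, not from the notes).
X_vals = [2.0, 5.0, 7.0, 11.0]
n = len(X_vals)
Xbar = sum(X_vals) / n

D_left = n * sum(x * x for x in X_vals) - sum(X_vals) ** 2
D_right = n * sum((x - Xbar) ** 2 for x in X_vals)
print(D_left, D_right)  # both equal 171.0
```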
-
Regression Example (continued)

So

\[
(X^T X)^{-1} =
\begin{pmatrix}
\dfrac{\sum_{i=1}^{n} X_i^2}{n \sum_{i=1}^{n} (X_i - \bar{X})^2} &
\dfrac{-\sum_{i=1}^{n} X_i}{n \sum_{i=1}^{n} (X_i - \bar{X})^2} \\[2ex]
\dfrac{-\sum_{i=1}^{n} X_i}{n \sum_{i=1}^{n} (X_i - \bar{X})^2} &
\dfrac{n}{n \sum_{i=1}^{n} (X_i - \bar{X})^2}
\end{pmatrix}
\]
-
Use of Inverse Matrix

- Suppose we want to solve the two equations

\[
\begin{aligned}
2Y_1 + 4Y_2 &= 20 \\
3Y_1 + Y_2 &= 10
\end{aligned}
\]

Rewriting the equations in matrix notation,

\[
\begin{pmatrix} 2 & 4 \\ 3 & 1 \end{pmatrix}
\begin{pmatrix} Y_1 \\ Y_2 \end{pmatrix}
= \begin{pmatrix} 20 \\ 10 \end{pmatrix}
\]

the solution to the equations is

\[
\begin{pmatrix} Y_1 \\ Y_2 \end{pmatrix}
= \begin{pmatrix} 2 & 4 \\ 3 & 1 \end{pmatrix}^{-1}
\begin{pmatrix} 20 \\ 10 \end{pmatrix}
= \begin{pmatrix} -0.1 & 0.4 \\ 0.3 & -0.2 \end{pmatrix}
\begin{pmatrix} 20 \\ 10 \end{pmatrix}
= \begin{pmatrix} 2 \\ 4 \end{pmatrix}
\]

- Estimating a regression model requires solving linear equations, and the inverse matrix is very useful for this.
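The worked system above can be reproduced in a few lines, reusing the $2 \times 2$ inverse formula from earlier (an illustrative sketch; the helper names are my own):

```python
def inv2x2(A):
    # 2x2 inverse via D = ad - bc, as in the notes
    (a, b), (c, d) = A
    D = a * d - b * c
    return [[d / D, -b / D], [-c / D, a / D]]

def matvec(A, v):
    # Multiply a matrix by a column vector
    return [sum(a * x for a, x in zip(row, v)) for row in A]

# The system  2*Y1 + 4*Y2 = 20,  3*Y1 + Y2 = 10:
A = [[2, 4],
     [3, 1]]
b = [20, 10]
Y = matvec(inv2x2(A), b)
print(Y)  # [2.0, 4.0]
```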
-
Other basic facts for matrices

- $A + B = B + A$
- $C(A + B) = CA + CB$
- $(A^T)^T = A$
- $(AB)^T = B^T A^T$
- $(A^{-1})^{-1} = A$
- $(AB)^{-1} = B^{-1} A^{-1}$
- $(A^T)^{-1} = (A^{-1})^T$
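Identities like $(AB)^T = B^T A^T$ can be spot-checked numerically; the matrices below are arbitrary illustrative choices, not from the notes:

```python
def transpose(M):
    return [[M[i][j] for i in range(len(M))] for j in range(len(M[0]))]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# Spot-check (AB)^T = B^T A^T on small matrices with made-up entries.
A = [[1, 2], [3, 4], [5, 6]]   # 3 x 2
B = [[7, 8, 9], [0, 1, 2]]     # 2 x 3
lhs = transpose(matmul(A, B))
rhs = matmul(transpose(B), transpose(A))
print(lhs == rhs)  # True
```

Note the order reversal: $B^T$ is $3 \times 2$ and $A^T$ is $2 \times 3$, so $B^T A^T$ is defined while $A^T B^T$ would not be conformable here.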
-
Random vectors and matrices

- Random vector:

\[
Y = \begin{pmatrix} Y_1 \\ Y_2 \\ Y_3 \end{pmatrix}
\]

- Expectation of a random vector:

\[
E(Y) = \begin{pmatrix} E(Y_1) \\ E(Y_2) \\ E(Y_3) \end{pmatrix}
\]

- For any random vectors

\[
Y = \begin{pmatrix} Y_1 \\ Y_2 \\ Y_3 \end{pmatrix}, \quad
Z = \begin{pmatrix} Z_1 \\ Z_2 \\ Z_3 \end{pmatrix}
\]

we have

\[
E(Y + Z) = E(Y) + E(Z)
\]
-
Random vectors and matrices

- Variance-covariance matrix of a random vector:

\[
\mathrm{Var}(Y) = E\big[(Y - E(Y))(Y - E(Y))^T\big]
= \begin{pmatrix}
\mathrm{Var}(Y_1) & \mathrm{Cov}(Y_1, Y_2) & \mathrm{Cov}(Y_1, Y_3) \\
\mathrm{Cov}(Y_2, Y_1) & \mathrm{Var}(Y_2) & \mathrm{Cov}(Y_2, Y_3) \\
\mathrm{Cov}(Y_3, Y_1) & \mathrm{Cov}(Y_3, Y_2) & \mathrm{Var}(Y_3)
\end{pmatrix}
\]

- In the simple linear regression model the errors are uncorrelated, so $\mathrm{Var}(\mathcal{E}) = \sigma^2 I$. For example, for $n = 3$,

\[
\mathrm{Var}(\mathcal{E}) = \begin{pmatrix}
\sigma^2 & 0 & 0 \\ 0 & \sigma^2 & 0 \\ 0 & 0 & \sigma^2
\end{pmatrix}
\]
-
Some basic facts

- If a random vector $W$ equals a random vector $Y$ multiplied by a constant matrix $A$,

\[
W = AY,
\]

we have

\[
E(W) = A\,E(Y), \qquad
\mathrm{Var}(W) = \mathrm{Var}(AY) = A\,\mathrm{Var}(Y)\,A^T
\]

- If $c$ is a constant vector, then

\[
E(c + AY) = c + A\,E(Y)
\]

and

\[
\mathrm{Var}(c + AY) = \mathrm{Var}(AY) = A\,\mathrm{Var}(Y)\,A^T
\]
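The rule $\mathrm{Var}(AY) = A\,\mathrm{Var}(Y)\,A^T$ can be checked on concrete numbers. Below, $A$ is the $2 \times 2$ matrix used in the illustration that follows, and $S$ is a hypothetical variance-covariance matrix of my own choosing:

```python
def transpose(M):
    return [[M[i][j] for i in range(len(M))] for j in range(len(M[0]))]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[1, -1],
     [1, 1]]
S = [[4, 1],
     [1, 9]]   # hypothetical: Var(Y1)=4, Var(Y2)=9, Cov(Y1,Y2)=1

VarW = matmul(matmul(A, S), transpose(A))
print(VarW)  # [[11, -5], [-5, 15]]
```

The result agrees with the scalar rules: $\mathrm{Var}(Y_1 - Y_2) = 4 + 9 - 2 \cdot 1 = 11$ and $\mathrm{Var}(Y_1 + Y_2) = 4 + 9 + 2 \cdot 1 = 15$, and the output is symmetric, as a variance-covariance matrix must be.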
-
An illustration: Example

- Let $W = AY$ be such that

\[
\begin{pmatrix} W_1 \\ W_2 \end{pmatrix}
= \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}
\begin{pmatrix} Y_1 \\ Y_2 \end{pmatrix}
\]

Then

\[
E\left[\begin{pmatrix} W_1 \\ W_2 \end{pmatrix}\right]
= \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}
\begin{pmatrix} E(Y_1) \\ E(Y_2) \end{pmatrix}
= \begin{pmatrix} E(Y_1) - E(Y_2) \\ E(Y_1) + E(Y_2) \end{pmatrix}
\]

- Also

\[
\mathrm{Var}\left[\begin{pmatrix} W_1 \\ W_2 \end{pmatrix}\right]
= \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}
\begin{pmatrix}
\mathrm{Var}(Y_1) & \mathrm{Cov}(Y_1, Y_2) \\
\mathrm{Cov}(Y_2, Y_1) & \mathrm{Var}(Y_2)
\end{pmatrix}
\begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}
\]
-
An illustration: Example 2

- In the simple linear regression model $Y = X\beta + \mathcal{E}$, it follows from the above that

\[
\mathrm{Var}(Y) = \mathrm{Var}(X\beta + \mathcal{E}) = \mathrm{Var}(\mathcal{E})
= \begin{pmatrix}
\sigma^2 & 0 & \dots & 0 \\
0 & \sigma^2 & \dots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \dots & \sigma^2
\end{pmatrix}
= \sigma^2 I
\]
-
Simple linear regression model (matrix version)

The model

\[
\begin{aligned}
Y_1 &= \beta_0 + \beta_1 X_1 + \varepsilon_1 \\
Y_2 &= \beta_0 + \beta_1 X_2 + \varepsilon_2 \\
&\;\;\vdots \\
Y_n &= \beta_0 + \beta_1 X_n + \varepsilon_n
\end{aligned}
\]

with assumptions

1. $E(\varepsilon_i) = 0$, $i = 1, 2, \dots, n$
2. $\mathrm{Var}(\varepsilon_i) = \sigma^2$, $\mathrm{Cov}(\varepsilon_i, \varepsilon_j) = 0$ for all $1 \le i \ne j \le n$
3. $\varepsilon_i \sim N(0, \sigma^2)$, $i = 1, \dots, n$, are independent
-
Simple linear regression model (matrix version)

Recall, the model can be written as

\[
Y = X\beta + \mathcal{E}
\]

Note that

\[
E(\mathcal{E}) = \mathbf{0}, \quad
\mathrm{Var}(\mathcal{E}) = \begin{pmatrix}
\sigma^2 & 0 & \dots & 0 \\
0 & \sigma^2 & \dots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \dots & \sigma^2
\end{pmatrix}
= \sigma^2 I
\]

The assumptions can be rewritten as

1. $E(\mathcal{E}) = \mathbf{0}$
2. $\mathrm{Var}(\mathcal{E}) = \sigma^2 I$
3. $\mathcal{E} \sim N(\mathbf{0}, \sigma^2 I)$
-
Simple linear regression model (matrix version)

Thus the model

\[
Y = X\beta + \mathcal{E}
\]

is such that

\[
E(Y) = X\beta \quad \text{and} \quad \mathrm{Var}(Y) = \sigma^2 I
\]

The model (with assumptions 1, 2, and 3) can also be written as

\[
Y \sim N(X\beta, \sigma^2 I)
\]

or

\[
Y = X\beta + \mathcal{E}, \quad \mathcal{E} \sim N(\mathbf{0}, \sigma^2 I)
\]
-
Linear Dependence and Rank of Matrix

Consider the following matrix:

\[
A = \begin{pmatrix}
1 & 2 & 5 & 1 \\
2 & 2 & 10 & 6 \\
3 & 4 & 15 & 1
\end{pmatrix}
\]

Note that the third column vector is a multiple of the first column vector:

\[
\begin{pmatrix} 5 \\ 10 \\ 15 \end{pmatrix}
= 5 \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}
\]

We say that the columns of $A$ are linearly dependent. If no vector in the set can be so expressed, we define the set of vectors to be linearly independent.
Dr. Bisher M. Iqelan (Department of Math.)3: Simple Linear Regression (Matrix Version) 2010-2011, Semester 2 29 / 77
-
Linear Dependence and Rank of Matrix (Cont..)
Definition
When c scalars k1, ..., kc, not all zero, can be found such that

    k1·C1 + k2·C2 + ⋯ + kc·Cc = 0,

where 0 denotes the zero column vector, the c column vectors are linearly dependent. If the only set of scalars for which the equality holds is k1 = 0, ..., kc = 0, the set of c column vectors is linearly independent.

To illustrate for our example, k1 = 5, k2 = 0, k3 = −1, k4 = 0 leads to:

         1          2           5          1      0
    5 ·  2   + 0 ·  2   − 1 ·  10   + 0 ·  6   =  0
         3          4          15          1      0

Hence, the column vectors are linearly dependent. Note that some of the kj equal zero here. For linear dependence, it is only required that not all kj be zero.
Dr. Bisher M. Iqelan (Department of Math.)3: Simple Linear Regression (Matrix Version) 2010-2011, Semester 2 30 / 77
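The linear-dependence check above is easy to verify numerically. A minimal sketch, using NumPy (the library choice is mine, not the slides'): the combination k1·C1 + ⋯ + k4·C4 is just the matrix-vector product A·k.

```python
import numpy as np

# The matrix A from the slide above
A = np.array([[1, 2, 5, 1],
              [2, 2, 10, 6],
              [3, 4, 15, 1]])

# The scalars from the slide: k1 = 5, k2 = 0, k3 = -1, k4 = 0
k = np.array([5, 0, -1, 0])

# k1*C1 + k2*C2 + k3*C3 + k4*C4 is exactly the matrix-vector product A @ k
combo = A @ k
print(combo)  # -> [0 0 0]: the columns are linearly dependent
```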
-
Rank of Matrix
Definition (Rank of Matrix)
The rank of a matrix is defined to be the maximum number of linearly independent columns in the matrix.

We know that the rank of A in our earlier example cannot be 4, since the four columns are linearly dependent. We can, however, find three columns (1, 2, and 4) which are linearly independent: there are no scalars k1, k2, k4 such that k1C1 + k2C2 + k4C4 = 0 other than k1 = k2 = k4 = 0. Thus, the rank of A in our example is 3.

The rank of a matrix is unique and can equivalently be defined as the maximum number of linearly independent rows. It follows that the rank of an r × c matrix cannot exceed min(r, c), the minimum of the two values r and c.
Dr. Bisher M. Iqelan (Department of Math.)3: Simple Linear Regression (Matrix Version) 2010-2011, Semester 2 31 / 77
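The rank of the example matrix can be confirmed numerically; a quick sketch, again assuming NumPy:

```python
import numpy as np

A = np.array([[1, 2, 5, 1],
              [2, 2, 10, 6],
              [3, 4, 15, 1]])

# rank(A) cannot exceed min(r, c) = min(3, 4) = 3; here it is exactly 3,
# because columns 1, 2, and 4 are linearly independent
r = np.linalg.matrix_rank(A)
print(r)  # -> 3
```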
-
Least Squares Estimation of Regression Parameters
As we have shown, the normal equations are

    Σ eᵢ = 0      ≡    n β̂0 + β̂1 ΣXᵢ = ΣYᵢ

    Σ Xᵢeᵢ = 0    ≡    β̂0 ΣXᵢ + β̂1 ΣXᵢ² = ΣXᵢYᵢ

In matrix notation, the normal equations are

    XᵀX  β̂  =  XᵀY                                   (2)
    2×2  2×1    2×1

where β̂ is the vector of the least squares regression coefficients:

    β̂ = (β̂0, β̂1)ᵀ
Dr. Bisher M. Iqelan (Department of Math.)3: Simple Linear Regression (Matrix Version) 2010-2011, Semester 2 32 / 77
-
Least Squares Estimation of Regression Parameters
To see this, recall that we obtained

    XᵀX = ( n      ΣXᵢ  )         XᵀY = ( ΣYᵢ   )
          ( ΣXᵢ    ΣXᵢ² ),              ( ΣXᵢYᵢ )

Equation (2) thus states:

    ( n      ΣXᵢ  ) ( β̂0 )     ( ΣYᵢ   )
    ( ΣXᵢ    ΣXᵢ² ) ( β̂1 )  =  ( ΣXᵢYᵢ )

or equivalently:

    ( n β̂0 + β̂1 ΣXᵢ     )     ( ΣYᵢ   )
    ( β̂0 ΣXᵢ + β̂1 ΣXᵢ²  )  =  ( ΣXᵢYᵢ )

These are precisely the normal equations we derived before.
Dr. Bisher M. Iqelan (Department of Math.)3: Simple Linear Regression (Matrix Version) 2010-2011, Semester 2 33 / 77
-
Further, let X̄ = (1/n) ΣXᵢ and Ȳ = (1/n) ΣYᵢ, and define

    sxx = Σ (Xᵢ − X̄)² = ΣXᵢ² − nX̄² = ΣXᵢ² − (ΣXᵢ)²/n

    sxy = Σ (Xᵢ − X̄)(Yᵢ − Ȳ) = ΣXᵢYᵢ − nX̄Ȳ = ΣXᵢYᵢ − (ΣXᵢ)(ΣYᵢ)/n

- sxx is the corrected sum of squares of the X-values.
- sxy is the corrected sum of products of the X- and Y-values.
Dr. Bisher M. Iqelan (Department of Math.)3: Simple Linear Regression (Matrix Version) 2010-2011, Semester 2 34 / 77
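The two forms of sxx and sxy above (the definition and the "corrected sum" shortcut) can be checked against each other numerically. A sketch using NumPy on the Example 1 data from later in these slides:

```python
import numpy as np

# X- and Y-values of Example 1 from these slides (without the intercept column)
X = np.array([4., 1., 2., 3., 3., 4.])
Y = np.array([16., 5., 10., 15., 13., 22.])
n = len(X)

# Definition form vs. computational ("corrected sum") form of sxx and sxy
sxx_def = np.sum((X - X.mean())**2)
sxx_alt = np.sum(X**2) - X.sum()**2 / n
sxy_def = np.sum((X - X.mean()) * (Y - Y.mean()))
sxy_alt = np.sum(X * Y) - X.sum() * Y.sum() / n

print(sxx_def, sxx_alt)  # both forms agree
print(sxy_def, sxy_alt)
```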
-
With this notation we can write

    XᵀX = ( n      ΣXᵢ  )     ( n     nX̄        )
          ( ΣXᵢ    ΣXᵢ² )  =  ( nX̄    sxx + nX̄² )

    XᵀY = ( ΣYᵢ   )     ( nȲ         )
          ( ΣXᵢYᵢ )  =  ( sxy + nX̄Ȳ  )

Hence

    (XᵀX)⁻¹ = (1 / (n·sxx)) ( sxx + nX̄²    −nX̄ )
                            ( −nX̄            n  )

            = ( 1/n + X̄²/sxx    −X̄/sxx )
              ( −X̄/sxx           1/sxx  )
Dr. Bisher M. Iqelan (Department of Math.)3: Simple Linear Regression (Matrix Version) 2010-2011, Semester 2 35 / 77
-
Then, assuming that rank(X) = 2 (i.e., that the Xᵢ's are not all equal), we get

    β̂ = ( β̂0 ) = (XᵀX)⁻¹ XᵀY                         (3)
        ( β̂1 )

      = ( 1/n + X̄²/sxx    −X̄/sxx ) ( nȲ        )
        ( −X̄/sxx           1/sxx  ) ( sxy + nX̄Ȳ )

      = ( Ȳ − (sxy/sxx) X̄ )
        ( sxy/sxx          )

i.e.

    β̂0 = Ȳ − β̂1 X̄    and    β̂1 = sxy / sxx
Dr. Bisher M. Iqelan (Department of Math.)3: Simple Linear Regression (Matrix Version) 2010-2011, Semester 2 36 / 77
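Equation (3) and the closed-form expressions for β̂0 and β̂1 should give the same answer, which is easy to confirm numerically. A sketch with NumPy (the data are the Example 1 values from the next slide; any x-values that are not all equal would do):

```python
import numpy as np

# Any data with not-all-equal x-values works; these are the Example 1 values
x = np.array([4., 1., 2., 3., 3., 4.])
y = np.array([16., 5., 10., 15., 13., 22.])
n = len(x)

# Matrix route: solve the normal equations (X'X) beta = X'y
X = np.column_stack([np.ones(n), x])
beta = np.linalg.solve(X.T @ X, X.T @ y)

# Closed-form route from the slide: beta1 = sxy/sxx, beta0 = ybar - beta1*xbar
sxx = np.sum((x - x.mean())**2)
sxy = np.sum((x - x.mean()) * (y - y.mean()))
b1 = sxy / sxx
b0 = y.mean() - b1 * x.mean()

print(beta)      # the two routes agree
print(b0, b1)
```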
-
Estimated Regression Coefficients: An example
Example 1: In some study, consider the following information:

         16               1  4
          5               1  1
    Y =  10   ;     X =   1  2
         15               1  3
         13               1  3
         22               1  4

Now, let us do the required calculations:

    XᵀX = ( 6   17 )          XᵀY = (  81 )
          ( 17  55 );               ( 261 )

Finally,

    β̂ = ( 6   17 )⁻¹ (  81 )     ( 0.439 )
        ( 17  55 )   ( 261 )  =  ( 4.610 )

Hence, β̂0 = 0.439 and β̂1 = 4.610.
Dr. Bisher M. Iqelan (Department of Math.)3: Simple Linear Regression (Matrix Version) 2010-2011, Semester 2 37 / 77
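The arithmetic of Example 1 can be reproduced in a few lines; a sketch assuming NumPy:

```python
import numpy as np

# Example 1: design matrix (intercept column plus X-values) and response
X = np.array([[1, 4], [1, 1], [1, 2], [1, 3], [1, 3], [1, 4]], dtype=float)
Y = np.array([16, 5, 10, 15, 13, 22], dtype=float)

XtX = X.T @ X                     # [[6, 17], [17, 55]]
XtY = X.T @ Y                     # [81, 261]
beta = np.linalg.solve(XtX, XtY)  # exactly (18/41, 189/41)

print(beta)  # beta0 ~ 0.439, beta1 ~ 4.610, matching the slide
```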
-
Estimated Regression Coefficients: An example
Example 2: For the ozone data used before,

         242               1  0.02
    Y =  237   ;     X =   1  0.07
         231               1  0.11
         201               1  0.15

give

    XᵀX = ( 4       0.3500 )          XᵀY = ( 911   )
          ( 0.3500  0.0399 ),               ( 76.99 )

and then

    (XᵀX)⁻¹ = (  1.07547    −9.43396  )
              ( −9.43396   107.81671  )

Hence, the estimates of the regression coefficients are

    β̂ = (XᵀX)⁻¹ XᵀY = (  253.434 )
                      ( −293.531 )
Dr. Bisher M. Iqelan (Department of Math.)3: Simple Linear Regression (Matrix Version) 2010-2011, Semester 2 38 / 77
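Example 2 can be reproduced the same way; a NumPy sketch:

```python
import numpy as np

# Example 2 (ozone data)
X = np.array([[1, 0.02], [1, 0.07], [1, 0.11], [1, 0.15]])
Y = np.array([242., 237., 231., 201.])

XtX_inv = np.linalg.inv(X.T @ X)
beta = XtX_inv @ X.T @ Y

print(XtX_inv)  # ~ [[1.07547, -9.43396], [-9.43396, 107.81671]]
print(beta)     # ~ [253.434, -293.531], matching the slide
```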
-
Properties of β̂
β̂ has the following properties when XᵀX is invertible:

- E(β̂) = β (i.e., β̂ is unbiased).
- Var(β̂) = σ²(XᵀX)⁻¹
    E(β̂) = E((XᵀX)⁻¹XᵀY) = (XᵀX)⁻¹Xᵀ E(Y)
         = (XᵀX)⁻¹XᵀXβ = (XᵀX)⁻¹(XᵀX)β = β
    Var(β̂) = Var((XᵀX)⁻¹XᵀY)
            = (XᵀX)⁻¹Xᵀ Var(Y) ((XᵀX)⁻¹Xᵀ)ᵀ
            = (XᵀX)⁻¹Xᵀ (σ²I) ((XᵀX)⁻¹Xᵀ)ᵀ
            = σ²(XᵀX)⁻¹Xᵀ ((XᵀX)⁻¹Xᵀ)ᵀ
            = σ²(XᵀX)⁻¹XᵀX(XᵀX)⁻¹ = σ²(XᵀX)⁻¹
Dr. Bisher M. Iqelan (Department of Math.)3: Simple Linear Regression (Matrix Version) 2010-2011, Semester 2 39 / 77
Under normality, β̂ ∼ N₂(β, σ²(XᵀX)⁻¹).
Dr. Bisher M. Iqelan (Department of Math.)3: Simple Linear Regression (Matrix Version) 2010-2011, Semester 2 39 / 77
-
Quiz
Use matrix notation for the simple linear model to find Var(β̂0), Var(β̂1), and Cov(β̂0, β̂1).

Solution: We have shown that

    Var(β̂) = σ²(XᵀX)⁻¹ = σ² ( 1/n + X̄²/sxx    −X̄/sxx )
                             ( −X̄/sxx           1/sxx  )

Hence,

    Var(β̂0) = σ² (1/n + X̄²/sxx)

    Var(β̂1) = σ² · (1/sxx) = σ²/sxx

    Cov(β̂0, β̂1) = −(X̄/sxx) σ²
Dr. Bisher M. Iqelan (Department of Math.)3: Simple Linear Regression (Matrix Version) 2010-2011, Semester 2 40 / 77
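The quiz answer can be checked numerically by comparing the entries of (XᵀX)⁻¹ with the closed-form expressions. A NumPy sketch on the Example 1 x-values; since σ² only scales Var(β̂), we compare (XᵀX)⁻¹ directly:

```python
import numpy as np

# Example 1 x-values; sigma^2 only scales Var(beta-hat), so it drops out
x = np.array([4., 1., 2., 3., 3., 4.])
n = len(x)
X = np.column_stack([np.ones(n), x])

sxx = np.sum((x - x.mean())**2)
cov = np.linalg.inv(X.T @ X)  # equals Var(beta-hat) / sigma^2

print(cov[0, 0], 1/n + x.mean()**2 / sxx)  # Var(beta0-hat)/sigma^2, both ways
print(cov[1, 1], 1/sxx)                    # Var(beta1-hat)/sigma^2
print(cov[0, 1], -x.mean() / sxx)          # Cov(beta0-hat, beta1-hat)/sigma^2
```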
-
The Ŷ and Residuals Vectors
Let the vector of the fitted values Ŷᵢ be denoted by Ŷ:

    Ŷ = (Ŷ1, Ŷ2, ..., Ŷn)ᵀ

In matrix notation, we then have:

    Ŷ   =  X   β̂                                     (4)
    n×1    n×2 2×1

because:

    ( Ŷ1 )     ( 1  X1 )            ( β̂0 + β̂1X1 )
    ( Ŷ2 )  =  ( 1  X2 )  ( β̂0 ) =  ( β̂0 + β̂1X2 )
    ( ⋮  )     ( ⋮   ⋮ )  ( β̂1 )    (     ⋮     )
    ( Ŷn )     ( 1  Xn )            ( β̂0 + β̂1Xn )
Dr. Bisher M. Iqelan (Department of Math.)3: Simple Linear Regression (Matrix Version) 2010-2011, Semester 2 41 / 77
-
Hat Matrix
We can express the matrix result for Ŷ in (4) as follows, using the expression (3) for β̂:

    Ŷ = Xβ̂ = X(XᵀX)⁻¹XᵀY

or, equivalently:

    Ŷ   =  H   Y                                     (5)
    n×1    n×n n×1

i.e., H puts a hat on Y! Here

    H = X(XᵀX)⁻¹Xᵀ

The square n × n matrix H is called the hat matrix and plays an important role in the theory of linear models. It is clear that H involves only the observations on the predictor variable X. We see from (5) that the fitted values Ŷᵢ can be expressed as linear combinations of the response variable observations Yᵢ, with the coefficients being elements of the matrix H.
Dr. Bisher M. Iqelan (Department of Math.)3: Simple Linear Regression (Matrix Version) 2010-2011, Semester 2 42 / 77
-
Hat Matrix: Example
For the ozone data used in Example 2,

H = X(XTX)−1XT

  [ 1 0.02 ]
= [ 1 0.07 ]  [  1.0755   −9.4340 ]  [ 1    1    1    1    ]
  [ 1 0.11 ]  [ −9.4340  107.8167 ]  [ 0.02 0.07 0.11 0.15 ]
  [ 1 0.15 ]

  [  .741240  .377358  .086253  −.204852 ]
= [  .377358  .283019  .207547   .132075 ]
  [  .086253  .207547  .304582   .401617 ]
  [ −.204852  .132075  .401617   .671159 ]
Thus, for example,
Ŷ1 = .741Y1 + .377Y2 + .086Y3 − .205Y4.
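As a quick numerical check, the hat matrix above can be reproduced in a few lines of NumPy. The predictor values 0.02, 0.07, 0.11, 0.15 are taken from the example; everything else follows directly from H = X(XTX)−1XT.

```python
import numpy as np

# Design matrix for the ozone example: a column of ones plus the X values
X = np.column_stack([np.ones(4), [0.02, 0.07, 0.11, 0.15]])

# Hat matrix H = X (X^T X)^{-1} X^T
XtX_inv = np.linalg.inv(X.T @ X)
H = X @ XtX_inv @ X.T

print(np.round(XtX_inv, 4))  # the 2x2 inverse shown in the example
print(np.round(H, 6))        # the 4x4 hat matrix shown in the example
```

The first row of H gives exactly the coefficients .741, .377, .086, −.205 used to form Ŷ1 above. Note also that the trace of H equals 2, the number of columns of X.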
-
Properties of the Hat matrix
- H is symmetric and idempotent (the latter means H2 = H).
- I − H is symmetric and idempotent.
- HX = X.
- (I − H)X = 0.
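All four properties are easy to verify numerically for any full-rank design matrix. A minimal sketch with NumPy, using an arbitrary simulated design:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # arbitrary full-rank design
H = X @ np.linalg.inv(X.T @ X) @ X.T
I = np.eye(n)

assert np.allclose(H, H.T)                     # H is symmetric
assert np.allclose(H @ H, H)                   # H is idempotent
assert np.allclose((I - H) @ (I - H), I - H)   # I - H is idempotent
assert np.allclose(H @ X, X)                   # HX = X
assert np.allclose((I - H) @ X, 0)             # (I - H)X = 0
```

The last two properties say that H leaves the column space of X fixed, while I − H annihilates it; this is exactly what one expects of an orthogonal projection onto the column space of X.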
-
Properties of the fitted vector
Vector of fitted values: Ŷ = Xβ̂.

- EŶ = Xβ
- Var Ŷ = σ2H

E(Ŷ) = E(HY) = HE(Y) = HXβ = Xβ.
The variance-covariance matrix of Ŷ can be derived using either the relationship Ŷ = Xβ̂ or Ŷ = HY. Applying the rules for variances of linear functions to the first relationship gives

Var(Ŷ) = X[Var(β̂)]XT
       = X[σ2(XTX)−1]XT
       = X(XTX)−1XT σ2
       = Hσ2
The derivation using the second relationship gives

Var(Ŷ) = H[Var(Y)]HT = HHT σ2 = Hσ2,

since H is symmetric and idempotent, so HHT = HH = H.
When E is normally distributed, Ŷ ∼ Nn(Xβ, Hσ2).
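Both routes to Var(Ŷ) can be checked numerically. A small sketch using the ozone design matrix from the earlier example and an assumed (illustrative) error variance σ2 = 2.5:

```python
import numpy as np

sigma2 = 2.5  # assumed error variance, for illustration only
X = np.column_stack([np.ones(4), [0.02, 0.07, 0.11, 0.15]])
XtX_inv = np.linalg.inv(X.T @ X)
H = X @ XtX_inv @ X.T

# Route 1: Var(Y-hat) = X [sigma^2 (X^T X)^{-1}] X^T
var_route1 = X @ (sigma2 * XtX_inv) @ X.T

# Route 2: Var(Y-hat) = H [sigma^2 I] H^T, using H H^T = H
var_route2 = H @ (sigma2 * np.eye(4)) @ H.T

# Both reduce to sigma^2 H
assert np.allclose(var_route1, sigma2 * H)
assert np.allclose(var_route2, sigma2 * H)
```

The diagonal entries of σ2H are the variances of the individual fitted values Ŷi, so fitted values at high-leverage points (large Hii) are the most variable.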