7/21/2019 Matrix Algebra Notes
Econometrics - II
Indira Gandhi Institute of Development Research, January - May Semester 2013
© Subrata Sarkar
Elements Of Matrix Algebra
Start with an example
$$Y_t = \beta_1 + \beta_2 X_{2t} + \beta_3 X_{3t} + \cdots + \beta_k X_{kt} + U_t, \qquad t = 1, 2, \ldots, n$$

Writing for each observation:

$$Y_1 = \beta_1 + \beta_2 X_{21} + \beta_3 X_{31} + \cdots + \beta_k X_{k1} + U_1$$
$$Y_2 = \beta_1 + \beta_2 X_{22} + \beta_3 X_{32} + \cdots + \beta_k X_{k2} + U_2$$
$$\vdots$$
$$Y_n = \beta_1 + \beta_2 X_{2n} + \beta_3 X_{3n} + \cdots + \beta_k X_{kn} + U_n$$

Summarize these n equations in a convenient form:

$$\begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix} = \begin{bmatrix} 1 & X_{21} & \cdots & X_{k1} \\ 1 & X_{22} & \cdots & X_{k2} \\ \vdots & \vdots & & \vdots \\ 1 & X_{2n} & \cdots & X_{kn} \end{bmatrix} \begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_k \end{bmatrix} + \begin{bmatrix} U_1 \\ U_2 \\ \vdots \\ U_n \end{bmatrix}$$

or, $Y = X\beta + U$.

Here $Y$, $\beta$, $U$ are vectors and $X$ is a matrix.

Vector: An ordered sequence of numbers arranged in a row or a column.

$Y$, $\beta$, $U$ are arranged in columns: $Y$ and $U$ are $n$-element column vectors, and $\beta$ is a $k$-element column vector.
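A quick numeric sketch of the matrix form $Y = X\beta + U$; the data and coefficient values below are hypothetical, chosen only to illustrate the orders involved ($n = 4$, $k = 3$):

```python
import numpy as np

n, k = 4, 3
X = np.column_stack([np.ones(n),                      # intercept column of 1s
                     np.array([1.0, 2.0, 3.0, 4.0]),  # X_2 (hypothetical data)
                     np.array([2.0, 1.0, 4.0, 3.0])]) # X_3 (hypothetical data)
beta = np.array([1.0, 0.5, -0.25])                    # beta_1, beta_2, beta_3
U = np.zeros(n)                                       # disturbances, set to 0 here

Y = X @ beta + U                                      # all n equations at once
print(X.shape, Y.shape)                               # (4, 3) (4,)
```

One matrix product replaces the $n$ separate scalar equations written out above.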
$Y' = [Y_1, Y_2, \ldots, Y_n]$, the transpose of $Y$; $U' = [U_1, U_2, \ldots, U_n]$, the transpose of $U$; $\beta' = [\beta_1, \beta_2, \ldots, \beta_k]$, the transpose of $\beta$.

$$X = \begin{bmatrix} 1 & X_{21} & \cdots & X_{k1} \\ 1 & X_{22} & \cdots & X_{k2} \\ \vdots & \vdots & & \vdots \\ 1 & X_{2n} & \cdots & X_{kn} \end{bmatrix}$$

$X$ is a matrix.

Matrix: A rectangular array of elements. Order of a matrix = number of rows $\times$ number of columns $= n \times k$ (the number of rows being always written first).

Observations:

1. A column vector of $n$ elements, i.e.
$$Y = \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix}$$
is a matrix of order $n \times 1$.

2. A row vector of $k$ elements, i.e. $Z = [Z_1\ Z_2\ \cdots\ Z_k]$, is a matrix of order $1 \times k$.

3. Representing the $X$ matrix by its columns or by its rows:
$$X = [\underset{n\times 1}{X_1}\ \underset{n\times 1}{X_2}\ \cdots\ \underset{n\times 1}{X_k}] \qquad\text{or}\qquad X = \begin{bmatrix} S_1 \\ S_2 \\ \vdots \\ S_n \end{bmatrix}$$
where each $X_j$ is an $n \times 1$ column and each $S_i$ is a $1 \times k$ row.
4. Transpose of a matrix
$$\underset{n\times k}{X} = \{X_{ij}\} \implies \underset{k\times n}{X'} = \{X_{ji}\}$$
Example:
$$X = \begin{bmatrix} 1 & 6 & 4 \\ 3 & 2 & 2 \\ 4 & 1 & 1 \\ 5 & 3 & 5 \end{bmatrix} \qquad X' = \begin{bmatrix} 1 & 3 & 4 & 5 \\ 6 & 2 & 1 & 3 \\ 4 & 2 & 1 & 5 \end{bmatrix}$$
1. Operations on Vectors

(a) Multiplication by a scalar:
$$2 \begin{bmatrix} 2 \\ 3 \\ 4 \end{bmatrix} = \begin{bmatrix} 2\cdot 2 \\ 2\cdot 3 \\ 2\cdot 4 \end{bmatrix} = \begin{bmatrix} 4 \\ 6 \\ 8 \end{bmatrix}$$

(b) Addition of two vectors: $U + V$ = sum of corresponding elements; the orders have to be the same.

(c) Linear combination: $K_1U + K_2V$ where $K_1$ and $K_2$ are constants.

(d) Vector multiplication:
$$a'b = [1\ 2\ 3] \begin{bmatrix} 4 \\ 5 \\ 6 \end{bmatrix} = 1\cdot 4 + 2\cdot 5 + 3\cdot 6 = 32$$
Here $a'$ is $1 \times 3$ and $b$ is $3 \times 1$: the numbers of elements have to be the same.
A special vector: $S$, the sum vector,
$$\underset{n\times 1}{S} = \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix} \qquad\text{therefore}\qquad a'S = \sum_{i=1}^{n} a_i$$
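A minimal sketch of the sum vector in action; the numbers in $a$ are hypothetical:

```python
import numpy as np

a = np.array([3.0, 1.0, 4.0, 1.0, 5.0])
S = np.ones_like(a)            # the n x 1 sum vector of ones
total = a @ S                  # a'S = sum of the elements of a
print(total)                   # 14.0
```

Multiplying by $S$ is exactly the summation $\sum_i a_i$, which is why it is called the sum vector.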
2. Operations on Matrices

(a) Multiplication by a scalar: $KA = \{Ka_{ij}\}$

(b) Addition of two matrices: sum of corresponding elements.

(c) Equality of matrices: orders have to be the same.

(d) Matrix multiplication: $\underset{n\times k}{A}\ \underset{k\times m}{B} = \underset{n\times m}{AB}$

$$AB = \begin{bmatrix} a_{11} & \cdots & a_{1k} \\ a_{21} & \cdots & a_{2k} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nk} \end{bmatrix} \begin{bmatrix} b_{11} & \cdots & b_{1m} \\ b_{21} & \cdots & b_{2m} \\ \vdots & & \vdots \\ b_{k1} & \cdots & b_{km} \end{bmatrix} = \begin{bmatrix} a_1'b_1 & \cdots & a_1'b_m \\ a_2'b_1 & \cdots & a_2'b_m \\ \vdots & & \vdots \\ a_n'b_1 & \cdots & a_n'b_m \end{bmatrix}$$

where $a_i'$ is the $i$-th row of $A$ and $b_j$ is the $j$-th column of $B$. The two matrices have to be conformable.

An example ($3 \times 2$ times $2 \times 2$):
$$\begin{bmatrix} 2 & 3 \\ 3 & 1 \\ 4 & 2 \end{bmatrix} \begin{bmatrix} 6 & 3 \\ 2 & 2 \end{bmatrix} = \begin{bmatrix} 2\cdot 6 + 3\cdot 2 & 2\cdot 3 + 3\cdot 2 \\ 3\cdot 6 + 1\cdot 2 & 3\cdot 3 + 1\cdot 2 \\ 4\cdot 6 + 2\cdot 2 & 4\cdot 3 + 2\cdot 2 \end{bmatrix} = \begin{bmatrix} 18 & 12 \\ 20 & 11 \\ 28 & 16 \end{bmatrix}$$

A special case (matrix times column vector):
$$\begin{bmatrix} a_{11} & \cdots & a_{1m} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nm} \end{bmatrix} \begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_m \end{bmatrix} = \begin{bmatrix} a_1'\beta \\ a_2'\beta \\ \vdots \\ a_n'\beta \end{bmatrix}$$
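The worked $3 \times 2$ by $2 \times 2$ example above can be checked directly; note that the product in the other order would fail because the inner dimensions would not match:

```python
import numpy as np

A = np.array([[2, 3],
              [3, 1],
              [4, 2]])          # 3 x 2
B = np.array([[6, 3],
              [2, 2]])          # 2 x 2
C = A @ B                       # conformable: inner dimensions (2 and 2) match
print(C)                        # each entry is a row of A times a column of B
```

The result is $3 \times 2$: the outer dimensions of the two factors.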
3. Some Special Matrices

(a) Diagonal matrix (has to be square):
$$\underset{n\times n}{A} = \begin{bmatrix} a_{11} & & & \\ & a_{22} & & \\ & & \ddots & \\ & & & a_{nn} \end{bmatrix}$$

(b) The identity matrix:
$$\underset{n\times n}{I} = \begin{bmatrix} 1 & & & \\ & 1 & & \\ & & \ddots & \\ & & & 1 \end{bmatrix}$$

(c) Symmetric matrix: $A = A'$

(d) A scalar matrix:
$$\begin{bmatrix} \lambda & & & \\ & \lambda & & \\ & & \ddots & \\ & & & \lambda \end{bmatrix} = \lambda I$$

(e) Idempotent matrix (has to be square): $A = A^2$, and hence $A = A^2 = A^3 = \cdots$
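A small sketch of the idempotent property. The centering matrix $M = I - \frac{1}{n}SS'$ (with $S$ the sum vector) is a standard example of an idempotent matrix; it is my choice here, not one taken from the notes:

```python
import numpy as np

n = 4
S = np.ones((n, 1))
M = np.eye(n) - S @ S.T / n        # centering matrix: subtracts the mean
assert np.allclose(M @ M, M)       # A^2 = A (idempotent)
assert np.allclose(M @ M @ M, M)   # hence A^3 = A as well
```

Once $A^2 = A$ holds, every higher power collapses back to $A$, which is what the chain $A = A^2 = A^3 = \cdots$ expresses.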
4. Some Properties of Matrices

(a) $(AB)' = B'A'$, $(ABC)' = C'B'A'$
(b) $(A + B) + C = A + (B + C)$
(c) $(AB)C = A(BC)$
(d) $A(B + C) = AB + AC$
(e) $AI = A$
(f) $(A + B)' = A' + B'$

5. Trace of a Square Matrix
$$\operatorname{tr}(A) = \sum_{i=1}^{n} a_{ii}$$
Properties of trace:
$$\operatorname{tr}(ABC) = \operatorname{tr}(BCA) = \operatorname{tr}(CAB)$$
(the trace is invariant to cyclic permutations of a product).
6. Matrix Inverse

In algebra we have $ab = 1 \implies b = 1/a$.

In matrix algebra we ask: given $\underset{n\times n}{A}$, does there exist $B$ such that $AB = I_n$?

Answer: If the columns of $A$ are linearly independent, then there exists $B$ such that $AB = I$. In that case $B$ is denoted as $A^{-1}$, i.e. $AA^{-1} = I$.
Linear independence: vectors $a_1, a_2, \ldots, a_n$ are linearly independent if none of them can be written as a linear combination of the others; if they are not, then some $a_i$ can be written as a linear combination of the other $a_i$'s.

Theorem: If all columns of $A$ are linearly independent, then so are all the rows. Then there exists $C$ such that $CA = I$.

Now
$$C = CI = C(AB) = (CA)B = IB = B$$
Therefore $C = B = A^{-1}$.

Therefore if $A$ is a square matrix with all columns (rows) linearly independent, then there exists a unique matrix, called the inverse of $A$ and denoted by $A^{-1}$, such that
$$AA^{-1} = A^{-1}A = I$$
Such an $A$ is non-singular.
7. Properties of Inverse

(a) $[A^{-1}]^{-1} = A$
(b) $[A']^{-1} = [A^{-1}]'$
(c) $[AB]^{-1} = B^{-1}A^{-1}$
8. Calculation of the Inverse

$$A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$$

Replace each element by its minor: $\begin{bmatrix} a_{22} & a_{21} \\ a_{12} & a_{11} \end{bmatrix}$
Sign the minors by $(-1)^{i+j}$, i.e. get the cofactors: $\begin{bmatrix} a_{22} & -a_{21} \\ -a_{12} & a_{11} \end{bmatrix}$

Transpose to get the adjoint: $\operatorname{Adj}(A) = \begin{bmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{bmatrix}$

Get the determinant: $|A| = a_{11}a_{22} - a_{12}a_{21}$

Divide each element of $\operatorname{Adj}(A)$ by $|A|$. Therefore
$$A^{-1} = \frac{1}{|A|}\operatorname{Adj}(A)$$

For the $3 \times 3$ case,
$$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$$

Step 1: Minors (the first row of minors shown):
$$(a_{22}a_{33} - a_{23}a_{32}) \qquad (a_{21}a_{33} - a_{23}a_{31}) \qquad (a_{21}a_{32} - a_{22}a_{31})$$

Step 2: Cofactors: apply the sign pattern
$$\begin{bmatrix} + & - & + \\ - & + & - \\ + & - & + \end{bmatrix}$$

Step 3: Transpose to get the adjoint.
Step 4: Determinant (expansion along the first row):
$$|A| = a_{11}(a_{22}a_{33} - a_{23}a_{32}) - a_{12}(a_{21}a_{33} - a_{23}a_{31}) + a_{13}(a_{21}a_{32} - a_{22}a_{31})$$

Step 5: Inverse: divide every element of the adjoint (Step 3) by the determinant (Step 4).
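Steps 1 through 5 can be sketched for the $2 \times 2$ case and checked against NumPy's built-in inverse; the numeric matrix is hypothetical:

```python
import numpy as np

def inverse_via_adjugate_2x2(A):
    """Steps 1-5 above for the 2 x 2 case: minors -> cofactors -> adjoint -> /|A|."""
    a11, a12 = A[0]
    a21, a22 = A[1]
    det = a11 * a22 - a12 * a21          # Step 4: the determinant
    adj = np.array([[a22, -a12],
                    [-a21, a11]])        # Steps 1-3 collapsed into one array
    return adj / det                     # Step 5: divide the adjoint by |A|

A = np.array([[4.0, 7.0],
              [2.0, 6.0]])
Ainv = inverse_via_adjugate_2x2(A)
assert np.allclose(Ainv, np.linalg.inv(A))
assert np.allclose(A @ Ainv, np.eye(2))  # AA^{-1} = I
```

For larger matrices the adjugate route is numerically poor; it is shown only because it mirrors the hand calculation above.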
9. The Rank of a Matrix

The rank of a matrix $A$, not necessarily square, is the maximum number of linearly independent columns (or rows).

The maximum number of linearly independent columns of $A$ = the maximum number of linearly independent rows of $A$.

The rank is unique and is denoted by $\rho(A)$, with
$$\rho(A) \le \min[m, n]$$

When $\rho(A) = m < n$, $A$ has full row rank. When $\rho(A) = n < m$, $A$ has full column rank.

If $A$ is a square matrix of order $n$ with full row (column) rank, then $A$ is non-singular.

Example:
$$A = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 1 & 0 & 1 & 1 \\ 2 & 2 & 4 & 5 \\ 3 & 6 & 7 & 4 \end{bmatrix}$$
Here the third row is the sum of the first two, so the rows are linearly dependent and $\rho(A) < 4$.
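The rank of the example matrix can be verified numerically:

```python
import numpy as np

A = np.array([[1, 2, 3, 4],
              [1, 0, 1, 1],
              [2, 2, 4, 5],
              [3, 6, 7, 4]])
# Row 3 equals row 1 plus row 2, so the rows are linearly dependent.
assert np.array_equal(A[2], A[0] + A[1])
print(np.linalg.matrix_rank(A))   # 3
```

Since only one such dependency exists among the four rows, the rank is 3, not 4, and $A$ is singular.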
Summary of Basic Matrix Algebra
1. Matrix: A rectangular array of elements.
$$A = \begin{bmatrix} 1 & 2 & 3 & 2 \\ 4 & 5 & 6 & 7 \\ 7 & 8 & 9 & 2 \end{bmatrix}_{3\times 4} = \{a_{ij}\}$$
$A$ is a 3 (rows) $\times$ 4 (columns) matrix.

2. Row vector: $x = [1\ 2\ 3\ 4]_{1\times 4}$

3. Column vector: $y = \begin{bmatrix} 7 \\ 8 \\ 9 \end{bmatrix}_{3\times 1}$

4. Diagonal matrix: $D = \begin{bmatrix} 1 & & \\ & 2 & \\ & & 3 \end{bmatrix}$

5. Symmetric matrix: $\{a_{ij}\} = \{a_{ji}\}$, e.g. $A = \begin{bmatrix} 1 & 2 \\ 2 & 7 \end{bmatrix}$

6. Transpose of a matrix:
$$A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}_{2\times 3} \qquad A' = \begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{bmatrix}_{3\times 2}$$
Symmetric matrix: $A = A'$.
7. Rank of a matrix: the number of linearly independent rows (columns).
$$\operatorname{Rank}\,\underset{m\times n}{A} \le \min[m, n]$$

8. Square matrix $\underset{n\times n}{A}$: if $\operatorname{Rank}\,A = n$ then $A$ has an inverse:
$$AA^{-1} = I = A^{-1}A \qquad\text{where}\qquad \underset{n\times n}{I} = \begin{bmatrix} 1 & & & \\ & 1 & & \\ & & \ddots & \\ & & & 1 \end{bmatrix}$$

9. Addition of matrices: $\underset{n\times m}{A}$ and $\underset{n\times m}{B}$ must have the same order;
$$A + B = \{a_{ij}\} + \{b_{ij}\} = \{a_{ij} + b_{ij}\}$$

10. Multiplication: $\underset{n\times m}{A}\ \underset{m\times p}{B} = \underset{n\times p}{AB}$; the matrices must be conformable.

11. $(AB)' = B'A'$

12. $(AB)^{-1} = B^{-1}A^{-1}$, assuming $A$ and $B$ are square matrices with full rank.
Quadratic Form and Matrix Derivatives
1. Quadratic Form

Consider the expression $q_1 = 2X_1^2 + X_1X_2 + X_3^2$. Calling $X$ the column vector of the $X$'s, i.e. $X = [X_1, X_2, \ldots, X_n]'$, a quadratic form can be put in the form $q = X'AX$ with $A$ symmetric. $A$ is unique once the order of $X$ is chosen. $A$ has

in the diagonal, $a_{ii}$, the coefficient attached to $X_i^2$;

in the off-diagonal, $a_{ij}$, $\frac{1}{2}$ of the coefficient attached to $X_iX_j$.

In our example:
$$A = \begin{bmatrix} 2 & 1/2 & 0 \\ 1/2 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

Example 1: $A = \begin{bmatrix} 2 & 1 \\ 1 & 1 \end{bmatrix}$
$$X'AX = 2X_1^2 + 2X_1X_2 + X_2^2 = (X_1 + X_2)^2 + X_1^2 > 0 \quad \forall\, X \ne 0$$

Example 2: $A = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}$
$$X'AX = X_1^2 + 2X_1X_2 + X_2^2 = (X_1 + X_2)^2 \ge 0 \quad \forall\, X$$
but there exists $X \ne 0$ such that $X'AX = 0$.
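Examples 1 and 2 can be checked numerically. One standard definiteness test, not used in the notes but equivalent for symmetric matrices, is via eigenvalues: all positive means p.d., all non-negative with a zero means p.s.d.:

```python
import numpy as np

def quad(A, x):
    return x @ A @ x                           # X'AX

A1 = np.array([[2.0, 1.0], [1.0, 1.0]])        # Example 1: positive definite
A2 = np.array([[1.0, 1.0], [1.0, 1.0]])        # Example 2: positive semi-definite

assert all(np.linalg.eigvalsh(A1) > 0)         # all eigenvalues positive
assert np.linalg.eigvalsh(A2).min() >= -1e-12  # eigenvalues 0 and 2: non-negative

x = np.array([1.0, -1.0])                      # a vector with X1 + X2 = 0
assert quad(A1, x) > 0                         # (X1+X2)^2 + X1^2 = 1 > 0
assert abs(quad(A2, x)) < 1e-12                # a nonzero X with X'AX = 0
```

The vector $x = (1, -1)$ is exactly the kind of nonzero $X$ that kills the semi-definite form in Example 2 but not the definite form in Example 1.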
Definition:
A quadratic form is said to be positive definite (P.D.) if $X'AX > 0$ for all $X \ne 0$.

A quadratic form is said to be positive semi-definite (P.S.D.) if $X'AX \ge 0$ for all $X$, and there exists $X \ne 0$ such that $X'AX = 0$.

Remarks:

(a) A matrix is said to be non-negative definite (n.n.d.) if it is either P.D. or P.S.D.
(b) The concepts of n.d. and n.s.d. can be defined similarly (by reversing the signs).
(c) A symmetric matrix $A$ is said to be P.D. (P.S.D.) if the associated quadratic form is P.D. (P.S.D.).

There are three equivalent conditions for a symmetric matrix $A$ to be P.D. These are iff conditions:

(a) The matrix $A$ is non-singular.
(b) There exists a non-singular matrix $P$ such that $P'P = A$.
(c) There exists a non-singular matrix $Q$ such that $Q'AQ = I$.

Some more properties related to quadratic forms:

(a) Let $B$ be any $n \times k$ matrix. Then
i. $B'B$ (of order $k \times k$) is n.n.d.
ii. $B'B$ is p.d. if $\operatorname{rank}(B) = k$
iii. $B'B$ is p.s.d. if $\operatorname{rank}(B) < k$
(b) $A$ p.d. and $B$ n.n.d. $\implies A + B$ is p.d.
(c) Let $A$ be p.d. of order $n \times n$ and $B$ any $n \times k$ matrix. Then
i. $B'AB$ (of order $k \times k$) is n.n.d.
ii. $B'AB$ is p.d. if $\operatorname{rank}(B) = k$
iii. $B'AB$ is p.s.d. if $\operatorname{rank}(B) < k$
2. Matrix Derivatives

(a) Scalar function: $Y = f(X)$ where $X$ is a vector. For example,
$$Y = AX_1^{\alpha}X_2^{\beta} = f(X_1, X_2), \qquad X = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix}$$

Definition:
$$\frac{\partial Y}{\partial X} = \begin{bmatrix} \partial Y/\partial X_1 \\ \partial Y/\partial X_2 \\ \vdots \\ \partial Y/\partial X_n \end{bmatrix} \quad\text{(column gradient vector)} \qquad \frac{\partial Y}{\partial X'} = \begin{bmatrix} \dfrac{\partial Y}{\partial X_1} & \cdots & \dfrac{\partial Y}{\partial X_n} \end{bmatrix} \quad\text{(row gradient vector)}$$

In our example:
$$\frac{\partial Y}{\partial X} = \begin{bmatrix} A\alpha X_1^{\alpha-1}X_2^{\beta} \\ A\beta X_1^{\alpha}X_2^{\beta-1} \end{bmatrix}$$

i. Special case: linear function
$$Y = P_1X_1 + P_2X_2 + \cdots + P_nX_n = P'X = X'P$$
$$\frac{\partial Y}{\partial X'} = [P_1\ P_2\ \cdots\ P_n] = P' \qquad \frac{\partial Y}{\partial X} = P$$

ii. Special case: quadratic form
$$Y = X'AX, \quad A \text{ symmetric} \implies \frac{\partial Y}{\partial X} = 2AX$$
Example: $A = \begin{bmatrix} 2 & 1 \\ 1 & 1 \end{bmatrix}$
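The rule $\partial(X'AX)/\partial X = 2AX$ can be checked against central finite differences, using the example matrix above and a hypothetical evaluation point:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 1.0]])                  # symmetric, as in the example
x = np.array([0.7, -1.3])                   # hypothetical evaluation point

def q(x):
    return x @ A @ x                        # Y = X'AX

# Central finite differences as an independent check of dY/dX = 2AX
h = 1e-6
grad_fd = np.array([(q(x + h * e) - q(x - h * e)) / (2 * h)
                    for e in np.eye(2)])
assert np.allclose(grad_fd, 2 * A @ x, atol=1e-5)
```

The symmetry of $A$ matters: for a non-symmetric $A$ the gradient would instead be $(A + A')X$.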
(b) Vector function: $\underset{m\times 1}{Y} = \underset{m\times 1}{F}(\underset{n\times 1}{X})$, i.e.
$$Y_1 = F_1(X), \quad Y_2 = F_2(X), \ \ldots, \ Y_m = F_m(X)$$

Then
$$\frac{\partial Y}{\partial X'} = \begin{bmatrix} \dfrac{\partial Y_1}{\partial X_1} & \dfrac{\partial Y_1}{\partial X_2} & \cdots & \dfrac{\partial Y_1}{\partial X_n} \\ \vdots & & & \vdots \\ \dfrac{\partial Y_m}{\partial X_1} & \dfrac{\partial Y_m}{\partial X_2} & \cdots & \dfrac{\partial Y_m}{\partial X_n} \end{bmatrix}$$
the Jacobian matrix.

Special case: linear vector functions
$$Y_1 = P_1'X, \quad Y_2 = P_2'X, \ \ldots, \ Y_m = P_m'X$$
so that $Y = PX$ where
$$P = \begin{bmatrix} P_1' \\ P_2' \\ \vdots \\ P_m' \end{bmatrix}$$
Then $\dfrac{\partial Y}{\partial X'} = P$. In particular, if $Y = IX$ then $\dfrac{\partial Y}{\partial X'} = I$.
(c) Application of the derivative: minimize
$$q = \frac{1}{2}X'AX + b'X + c \qquad\text{where } A \text{ is p.d.}$$

$$\frac{\partial q}{\partial X} = \frac{1}{2}\,2AX + b = AX + b = 0 \quad\text{(F.O.C.)}$$

Therefore $AX^* = -b$, so
$$X^* = -A^{-1}b \qquad [A^{-1}\text{ exists since } A \text{ is p.d.}]$$

Also
$$\frac{\partial^2 q}{\partial X\,\partial X'} = A \quad\text{(the Hessian matrix)}$$
is p.d., so $X^*$ defines a minimum of $q$.

Proof: Let $X = X^* + Z$. Then
$$X'AX = (X^* + Z)'A(X^* + Z) = X^{*\prime}AX^* + 2X^{*\prime}AZ + Z'AZ$$
Therefore
$$q = \frac{1}{2}X^{*\prime}AX^* + X^{*\prime}AZ + \frac{1}{2}Z'AZ + b'(X^* + Z) + c$$
$$= \underbrace{\frac{1}{2}X^{*\prime}AX^* + b'X^* + c}_{q^*} + X^{*\prime}AZ + \frac{1}{2}Z'AZ + b'Z$$
$$= q^* + (-A^{-1}b)'AZ + \frac{1}{2}Z'AZ + b'Z = q^* - b'Z + \frac{1}{2}Z'AZ + b'Z$$
$$q = q^* + \frac{1}{2}Z'AZ > q^* \quad\text{for } Z \ne 0$$
Therefore $q > q^*$ for all $X \ne X^*$.
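A numeric sketch of this minimization; $A$, $b$, $c$ below are hypothetical values with $A$ positive definite:

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 2.0]])                  # a p.d. matrix (hypothetical)
b = np.array([1.0, -1.0])
c = 0.5

def q(x):
    return 0.5 * x @ A @ x + b @ x + c

x_star = -np.linalg.solve(A, b)             # X* = -A^{-1} b from the F.O.C.

# q at X* lies below q at any perturbed point, since q = q* + (1/2) Z'AZ
rng = np.random.default_rng(0)
for _ in range(100):
    z = rng.normal(size=2)
    assert q(x_star + z) > q(x_star)
```

Using `solve` rather than forming $A^{-1}$ explicitly is the usual numerical practice, but the two agree with the formula $X^* = -A^{-1}b$.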
Matrix Statistics
(a) Random Vectors and Matrices

If $X_1, X_2, \ldots, X_n$ are random variables then
$$X = \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_n \end{bmatrix}$$
is a random vector: its elements are random variables.

Likewise, $W = \{W_{ij}\}$ is a random matrix when the $W_{ij}$'s are all random variables.

(b) Expectation
$$E(X) = [E(X_i)] = \begin{bmatrix} E(X_1) \\ E(X_2) \\ \vdots \\ E(X_n) \end{bmatrix}$$

Let $E(X_i) = \mu$ for all $i$. Then
$$E(X) = \begin{bmatrix} E(X_1) \\ \vdots \\ E(X_n) \end{bmatrix} = \begin{bmatrix} \mu \\ \vdots \\ \mu \end{bmatrix} = \mu\,S_n$$
where $S_n$ is the sum vector.

Properties of expectation (let $A$, $B$, $C$, $U$ be constants):

i. $E(U) = U$
ii. $E(AX) = AE(X)$
iii. $E(X + Y) = E(X) + E(Y)$
iv. $E(BXC) = BE(X)C$
v. $E(W_1W_2) = E(W_1)E(W_2)$ when $W_1$ and $W_2$ are independent
(c) Variance and Covariance Matrix

$$X = \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_n \end{bmatrix} \qquad\text{Let } E(X) = \begin{bmatrix} E(X_1) \\ \vdots \\ E(X_n) \end{bmatrix} = \begin{bmatrix} \mu_1 \\ \vdots \\ \mu_n \end{bmatrix} = \mu, \quad\text{so } E(X_i) = \mu_i$$

i. There are $n$ expectations.
ii. There are variances and covariances:
there are $n$ variances, $E[X_i - \mu_i]^2 = \sigma_{ii} > 0$;
there are $n(n-1)/2$ distinct covariances, $E[X_i - \mu_i][X_j - \mu_j] = \sigma_{ij}$, $i \ne j$, with $\sigma_{ij} = \sigma_{ji}$.

The variance-covariance matrix of $X$ can be written as
$$V(X) = E[(X - \mu)(X - \mu)'] = \{E(X_i - \mu_i)(X_j - \mu_j)\} = \{\sigma_{ij}\} = V$$
where $E(X) = \mu$. In the diagonal we have variances and in the off-diagonal we have covariances. The matrix is symmetric.

Remarks:

i. If the $X_i$'s are uncorrelated then $\sigma_{ij} = 0$ for all $i \ne j$, and $V = \operatorname{diag}\{\sigma_{ii}\}$.
ii. In addition, if there is homoskedasticity, i.e. $\sigma_{ii} = \sigma^2$ for all $i$, then $V = \sigma^2 I_n$.
iii. If $E(X) = 0$ then $V(X) = E(XX')$.

(d) Linear Transformation

Consider $X$ with $E(X) = \mu$, $V(X) = V$. Define $Y = AX$, a linear transformation of $X$.
Then
$$E(Y) = A\mu, \qquad V(Y) = AVA'$$

Proof: $Y = AX$, so
$$E(Y) = E(AX) = AE(X) = A\mu$$
$$V(Y) = E[(Y - E(Y))(Y - E(Y))'] = E[(AX - A\mu)(AX - A\mu)'] = E[A(X - \mu)(X - \mu)'A'] = A\,E[(X - \mu)(X - \mu)']\,A' = AVA'$$

Now consider the scalar linear transformation $Y = Z'X$:
$V(Y) > 0$ if $Y$ is not a constant, i.e. if $Z'X$ is not a constant;
$V(Y) = 0$ if $Y$ is a constant, i.e. if $Z'X$ is a constant, i.e. the $X_i$ are linearly dependent.

Thus $V(Y) = Z'VZ > 0$ for all $Z \ne 0$ if the $X_i$ are linearly independent, and $Z'VZ = 0$ for some $Z \ne 0$ if the $X_i$ are linearly dependent.

Conclusion: The variance-covariance matrix is always p.d., except in cases where the $X_i$'s are linearly dependent, in which case it is a p.s.d. matrix.

Corollary: Let $E(X) = \mu$, $V(X) = V$ with $V$ positive definite. Then it is possible to get a standardized vector through a linear transformation, i.e. one with $E(Y) = 0$ and $V(Y) = I$.

Define $Y = Q[X - \mu]$. Then
$$E(Y) = QE(X - \mu) = Q[E(X) - \mu] = 0$$
$$V(Y) = QVQ' = I \quad\text{for some } Q$$
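A Monte Carlo sketch of $E(Y) = A\mu$ and $V(Y) = AVA'$; the particular $\mu$, $V$, and $A$ are hypothetical values chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
mu = np.array([1.0, -2.0, 0.5])
V = np.array([[2.0, 0.5, 0.0],
              [0.5, 1.0, 0.3],
              [0.0, 0.3, 1.5]])             # a valid (p.d.) covariance matrix
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, -1.0]])            # a 2 x 3 transformation

X = rng.multivariate_normal(mu, V, size=200_000)  # rows are draws of X
Y = X @ A.T                                       # Y = AX, draw by draw

assert np.allclose(Y.mean(axis=0), A @ mu, atol=0.05)
assert np.allclose(np.cov(Y.T), A @ V @ A.T, atol=0.15)
```

The sample mean and sample covariance of the transformed draws settle on $A\mu$ and $AVA'$, as the proof predicts.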
(e) Expectation of a Quadratic Form

Let $E(X) = 0$, $V(X) = V$, and $q = X'AX$ where $A$ is a symmetric matrix of constants. Then
$$E(q) = E(X'AX) = \operatorname{tr}(AV)$$

(Recall: the trace of a square matrix is the sum of its diagonal elements.)

Proof: $X'AX$ is a scalar, and so equal to its trace:
$$X'AX = \operatorname{tr}(X'AX)$$
$$E(X'AX) = E[\operatorname{tr}(X'AX)] = E[\operatorname{tr}(AXX')] \quad\text{(trace is invariant to cyclic permutations)}$$
$$= \operatorname{tr}\,E(AXX') = \operatorname{tr}\,AE(XX') = \operatorname{tr}(AV) \quad\text{(trace is a linear operator)}$$

Example: $V = \sigma^2 I_n$ and $A$ idempotent of rank $K$. Therefore
$$E(X'AX) = \operatorname{tr}(A\,\sigma^2 I) = \sigma^2\operatorname{tr}(A) = \sigma^2 K$$
$$[\operatorname{rank}(A) = \operatorname{tr}(A) \text{ since } A \text{ is idempotent}]$$
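A Monte Carlo sketch of $E(X'AX) = \sigma^2 K$. The centering matrix is used here as the idempotent $A$; it is my choice of example, not one from the notes:

```python
import numpy as np

rng = np.random.default_rng(2)
n, sigma2 = 5, 4.0
S = np.ones((n, 1))
A = np.eye(n) - S @ S.T / n        # idempotent (hypothetical choice)
K = int(round(np.trace(A)))        # rank(A) = tr(A) for idempotent A
assert K == n - 1

X = rng.normal(0.0, np.sqrt(sigma2), size=(100_000, n))
q = np.einsum('ij,jk,ik->i', X, A, X)      # X'AX for each draw
assert abs(q.mean() - sigma2 * K) < 0.3    # should be near sigma^2 * K = 16
```

The sample average of the quadratic form converges to $\sigma^2 K$, matching the trace formula $\operatorname{tr}(A\,\sigma^2 I) = \sigma^2\operatorname{tr}(A)$.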
(f) Multivariate Normal Distribution and Related Distributions
i. Introduction

Let $X_i$, $i = 1, 2, \ldots, n$ be $n$ independent normal random variables with $E(X_i) = \mu_i$ and $V(X_i) = \sigma_i^2$, i.e. $X_i \sim N(\mu_i, \sigma_i^2)$.

A. Density of $X_i$:
$$f(X_i) = \frac{1}{\sqrt{2\pi\sigma_i^2}}\,e^{-\frac{1}{2\sigma_i^2}(X_i - \mu_i)^2}$$

B. The joint density of $X_1, X_2, \ldots, X_n$, when the $X_i$'s are independent, is the product of the individual densities:
$$f(X_1, X_2, \ldots, X_n) = (2\pi)^{-\frac{n}{2}}\left(\prod_{i=1}^{n}\sigma_i^2\right)^{-\frac{1}{2}} e^{-\frac{1}{2}\sum_i \frac{1}{\sigma_i^2}(X_i - \mu_i)^2}$$

Let us write the above in vector-matrix notation:
$$X = \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_n \end{bmatrix} \qquad E(X) = \begin{bmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_n \end{bmatrix} = \mu \qquad V(X) = \begin{bmatrix} \sigma_1^2 & & \\ & \ddots & \\ & & \sigma_n^2 \end{bmatrix} = V$$

Note:
$$|V| = \sigma_1^2\,\sigma_2^2\cdots\sigma_n^2 \qquad V^{-1} = \begin{bmatrix} 1/\sigma_1^2 & & \\ & \ddots & \\ & & 1/\sigma_n^2 \end{bmatrix}$$
and
$$\sum_i \frac{1}{\sigma_i^2}(X_i - \mu_i)^2 = \frac{1}{\sigma_1^2}(X_1 - \mu_1)^2 + \frac{1}{\sigma_2^2}(X_2 - \mu_2)^2 + \cdots + \frac{1}{\sigma_n^2}(X_n - \mu_n)^2$$
is the quadratic form $(X - \mu)'V^{-1}(X - \mu)$.

Therefore
$$f(X_1, \ldots, X_n) = (2\pi)^{-\frac{n}{2}}|V|^{-\frac{1}{2}}\,e^{-\frac{1}{2}(X - \mu)'V^{-1}(X - \mu)}$$
ii. Formal definition: The random vector $X$, with $E(X) = \mu$ and $V(X) = V$, is said to be normally distributed iff
$$f(X) = (2\pi)^{-\frac{n}{2}}|V|^{-\frac{1}{2}}\,e^{-\frac{1}{2}(X - \mu)'V^{-1}(X - \mu)}$$
We then write $X \sim N(\mu, V)$. When $X \sim N(0, I_n)$ we say $X$ is a standard normal vector.

iii. Properties of the normal distribution

A. If $X \sim N(\mu, V)$ and $\underset{m\times 1}{Y} = \underset{m\times n}{A}\,\underset{n\times 1}{X} + \underset{m\times 1}{\gamma}$, then
$$Y \sim N(A\mu + \gamma,\ AVA')$$
since $E(Y) = A\mu + \gamma$ and $V(Y) = AVA'$, which is p.d. since $\rho(A) = m$.

B. The orthogonal transformation of a standard normal vector is also a standard normal vector. Let
$$\underset{n\times 1}{Z} = \underset{n\times n}{C}\,\underset{n\times 1}{X}$$
with $C$ orthogonal: $C^{-1} = C'$, so $CC' = C'C = I$. Then
$$E(Z) = CE(X) = 0$$
$$V(Z) = CVC' = CIC' = CC' = I$$

Corollary: If $X \sim N(\mu, V)$ then by a suitable transformation we can get a standard normal vector. Since $V$ is p.d., there exists $Q$ such that $QVQ' = I$. Define
$$Y = Q(X - \mu) \qquad E(Y) = 0 \qquad V(Y) = QVQ' = I$$
Therefore $Y \sim N(0, I)$.
C. For normal variables, zero covariance $\implies$ independence.

Let $X \sim N(\mu, V)$ and partition
$$X = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix} \begin{matrix} S\times 1 \\ (n-S)\times 1 \end{matrix} \qquad E(X) = \begin{bmatrix} E(X_1) \\ E(X_2) \end{bmatrix} = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix} \qquad V(X) = \begin{bmatrix} V_{11} & V_{12} \\ V_{21} & V_{22} \end{bmatrix}$$

Zero covariance between $X_1$ and $X_2$ means $V_{12} = V_{21}' = 0$. Then

$$f(X) = (2\pi)^{-\frac{n}{2}}(|V_{11}||V_{22}|)^{-\frac{1}{2}} \exp\left\{-\frac{1}{2}\begin{bmatrix} (X_1 - \mu_1)' & (X_2 - \mu_2)' \end{bmatrix} \begin{bmatrix} V_{11}^{-1} & 0 \\ 0 & V_{22}^{-1} \end{bmatrix} \begin{bmatrix} X_1 - \mu_1 \\ X_2 - \mu_2 \end{bmatrix}\right\}$$
$$= (2\pi)^{-\frac{n}{2}}|V_{11}|^{-\frac{1}{2}}|V_{22}|^{-\frac{1}{2}}\, e^{-\frac{1}{2}\left[(X_1 - \mu_1)'V_{11}^{-1}(X_1 - \mu_1) + (X_2 - \mu_2)'V_{22}^{-1}(X_2 - \mu_2)\right]}$$
$$= \underbrace{(2\pi)^{-\frac{S}{2}}|V_{11}|^{-\frac{1}{2}}\, e^{-\frac{1}{2}(X_1 - \mu_1)'V_{11}^{-1}(X_1 - \mu_1)}}_{\text{density of } X_1} \cdot \underbrace{(2\pi)^{-\frac{n-S}{2}}|V_{22}|^{-\frac{1}{2}}\, e^{-\frac{1}{2}(X_2 - \mu_2)'V_{22}^{-1}(X_2 - \mu_2)}}_{\text{density of } X_2}$$

so $X_1$ and $X_2$ are independent.

iv. The chi-squared distribution

If $X \sim N(0, I_n)$, so that the $X_i$ are independent standard normals, then
$$X_1^2 + X_2^2 + \cdots + X_n^2 \sim \chi^2(n)$$
Characterizations

A. $X_1^2 + X_2^2 + \cdots + X_n^2 = X'X$. Therefore, if $X \sim N(0, I_n)$, then
$$X'X \sim \chi^2(n)$$

B. If $Y \sim N(\mu, V)$ then
$$(Y - \mu)'V^{-1}(Y - \mu) \sim \chi^2(n)$$
Since $V$ is p.d., there exists $Q$ such that $QVQ' = I$. Let $X = Q(Y - \mu)$; $X$ is normal with
$$E(X) = 0 \qquad V(X) = QVQ' = I$$
Therefore $X \sim N(0, I)$ and $X'X \sim \chi^2(n)$, i.e.
$$(Y - \mu)'Q'Q(Y - \mu) \sim \chi^2(n)$$
From $QVQ' = I$:
$$Q^{-1}QVQ' = Q^{-1} \implies VQ' = Q^{-1} \implies VQ'(Q')^{-1} = Q^{-1}(Q')^{-1} \implies V = (Q'Q)^{-1}$$
so $V^{-1} = Q'Q$. Therefore
$$(Y - \mu)'V^{-1}(Y - \mu) \sim \chi^2(n)$$

C. Let $Z \sim N(0, I_n)$ and partition
$$Z = \begin{bmatrix} Z_1 \\ Z_2 \end{bmatrix} \begin{matrix} S\times 1 \\ (n-S)\times 1 \end{matrix}$$
Then $Z_1 \sim N(0, I_S)$ and $Z_2 \sim N(0, I_{n-S})$ are independent, so
$$Z_1'Z_1 \sim \chi^2(S) \qquad Z_2'Z_2 \sim \chi^2(n-S) \qquad\text{independent}$$
Now observe
$$\underset{S\times 1}{Z_1} = \underset{S\times n}{[I_S\ \ 0]}\,\underset{n\times 1}{Z} = AZ$$
$$Z_1'Z_1 = Z'A'AZ = Z'\begin{bmatrix} I_S \\ 0 \end{bmatrix}[I_S\ \ 0]\,Z = Z'\begin{bmatrix} I_S & 0 \\ 0 & 0 \end{bmatrix}Z$$
Therefore $Z_1'Z_1 = Z'MZ$ with $M$ idempotent of rank $\rho(M) = S$, and
$$Z'MZ \sim \chi^2(S)$$
Similarly
$$Z_2 = [0\ \ I_{n-S}]\begin{bmatrix} Z_1 \\ Z_2 \end{bmatrix} \implies Z_2'Z_2 = Z'\begin{bmatrix} 0 & 0 \\ 0 & I_{n-S} \end{bmatrix}Z = Z'M^*Z$$
with $M^*$ idempotent of rank $\rho(M^*) = n - S$. Therefore
$$Z'M^*Z \sim \chi^2(n-S)$$
Also $M^* = I - M$, and
$$MM^* = M(I - M) = M - M\cdot M = M - M = 0$$

Theorem: If $Z \sim N(0, I_n)$ and $M$ is an idempotent matrix of rank $S$, then
$$Z'MZ \sim \chi^2(S) \qquad Z'(I - M)Z \sim \chi^2(n-S)$$
and the $\chi^2(S)$ and $\chi^2(n-S)$ variables are independent since $M(I - M) = 0$.
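The theorem can be illustrated by simulation, using the block-diagonal $M$ from the derivation above (the values of $n$ and $S$ are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(3)
n, S = 6, 2
M = np.zeros((n, n))
M[:S, :S] = np.eye(S)               # M = diag(I_S, 0): idempotent, rank S
assert np.allclose(M @ M, M)
assert np.allclose(M @ (np.eye(n) - M), 0)   # M(I - M) = 0

Z = rng.normal(size=(200_000, n))
q1 = np.einsum('ij,jk,ik->i', Z, M, Z)               # Z'MZ per draw
q2 = np.einsum('ij,jk,ik->i', Z, np.eye(n) - M, Z)   # Z'(I-M)Z per draw

assert abs(q1.mean() - S) < 0.05              # E[chi2(S)] = S
assert abs(q2.mean() - (n - S)) < 0.05        # E[chi2(n-S)] = n-S
assert abs(np.corrcoef(q1, q2)[0, 1]) < 0.02  # independence: zero correlation
```

The sample means match the chi-squared degrees of freedom, and the near-zero correlation between the two quadratic forms reflects the independence delivered by $M(I - M) = 0$.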