7. Introduction to Large Sample Theory
Hayashi p. 88-97/109-133
Advanced Econometrics I, Autumn 2010, Large-Sample Theory 1
Introduction
We looked at finite-sample properties of the OLS estimator and its associated test statistics.
These results rest on assumptions that are often violated in practice.
The finite-sample theory breaks down if one of the following three assumptions is violated:
- the exogeneity of the regressors,
- the normality of the error term, and
- the linearity of the regression equation.
Introduction (cont’d)
Asymptotic (large-sample) theory provides an alternative approach when these assumptions are violated.
It derives an approximation to the distribution of the estimator and its associated statistics under the assumption that the sample size is sufficiently large.
Rather than making assumptions about a sample of a given size, large-sample theory makes assumptions about the stochastic process that generates the sample.
Introduction (cont’d)
The two main concepts in asymptotics are consistency and asymptotic normality.
Some intuition:
Consistency: the more data we get, the closer we get to knowing the truth (or we eventually know the truth).
Asymptotic normality: as we get more and more data, averages of random variables behave like normally distributed random variables.
Example: establishing consistency and asymptotic normality of an i.i.d. random sample X1, ..., XN with E(Xi) = µ and Var(Xi) = σ².
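As a minimal Monte Carlo sketch of this example (numpy-based; the normal data-generating process, the specific parameter values, and the seed are illustrative choices, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 2.0, 3.0   # E(X_i) and sd(X_i) of the simulated sample

# Consistency: the sample mean drifts toward mu as N grows.
for n in (10, 1_000, 100_000):
    x = rng.normal(mu, sigma, size=n)
    print(n, x.mean())

# Asymptotic normality: across many replications, sqrt(N)*(Xbar - mu)
# has mean near 0 and variance near sigma^2, as the CLT predicts.
reps, n = 5_000, 500
draws = rng.normal(mu, sigma, size=(reps, n))
z = np.sqrt(n) * (draws.mean(axis=1) - mu)
print(z.mean(), z.var())   # roughly 0 and sigma**2
```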
Introduction (cont’d)
The main probability theory tools for asymptotics:
The probability theory tools for establishing consistency of estimators are:
• Laws of Large Numbers (LLNs)
– A LLN is a result that states the conditions under which a sample average of random variables converges to a population expectation.
– LLNs give conditions under which the sequence of sample means converges either in probability or almost surely.
– There are many LLN results (e.g. Chebychev's LLN, Kolmogorov's/Khinchine's LLN, Markov's LLN).
Introduction (cont’d)
The probability tools for establishing asymptotic normality are:
• Central Limit Theorems (CLTs)
– CLTs are about the limiting behaviour of the difference between a sample mean and its expected value.
– There are many CLTs (e.g. the Lindeberg-Levy CLT, the Lindeberg-Feller CLT, Liapounov's CLT).
Basic concepts of large sample theory
Using large sample theory, we can dispense with basic assumptions from finite sample theory:
1.2 E(εi|X) = 0: strict exogeneity
1.4 Var(ε|X) = σ²I: homoscedasticity
1.5 ε|X ∼ N(0, σ²In): normality of the error term
The approximate (asymptotic) distributions of b and of the t- and F-statistics can then be obtained.
Modes of convergence - Convergence in probability
Notation: {zn} denotes a sequence of random scalars, and {zn} (set in bold) a sequence of random vectors.

Convergence in probability: a sequence {zn} converges in probability to a constant α if for any ε > 0

$$\lim_{n \to \infty} P(|z_n - \alpha| > \varepsilon) = 0$$

In short-hand we write

$$\operatorname*{plim}_{n \to \infty} z_n = \alpha \quad\text{or}\quad z_n \to_p \alpha \quad\text{or}\quad z_n - \alpha \to_p 0$$

This extends to random vectors: if

$$\lim_{n \to \infty} P(|z_{nk} - \alpha_k| > \varepsilon) = 0 \quad \forall\, k = 1, 2, \dots, K, \quad\text{then}\quad \mathbf{z}_n \to_p \boldsymbol{\alpha},$$

where $z_{nk}$ is the k-th element of $\mathbf{z}_n$ and $\alpha_k$ the k-th element of $\boldsymbol{\alpha}$.
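The definition can be checked numerically; a small sketch (numpy, with illustrative parameters) estimates P(|zn − α| > ε) by simulation:

```python
import numpy as np

rng = np.random.default_rng(1)
alpha, eps, reps = 0.5, 0.05, 10_000

# z_n = mean of n Bernoulli(alpha) draws. Estimate P(|z_n - alpha| > eps)
# across replications; it should shrink toward 0 as n grows.
tail = {}
for n in (50, 500, 5_000):
    z = rng.binomial(n, alpha, size=reps) / n
    tail[n] = np.mean(np.abs(z - alpha) > eps)
print(tail)
```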
Modes of convergence - Almost Sure Convergence
Almost sure convergence: a sequence of random scalars {zn} converges almost surely to a constant α if

$$P\left( \lim_{n \to \infty} z_n = \alpha \right) = 1$$
We write this as "zn →a.s. α". The extension to random vectors is analogous to that for convergence in probability.

Note: this concept of convergence is stronger than convergence in probability ⇒ if a sequence converges a.s., then it converges in probability.
Modes of convergence - Convergence in mean square
Convergence in mean square:

$$\lim_{n \to \infty} E\left[ (z_n - \alpha)^2 \right] = 0 \quad\text{or}\quad z_n \to_{m.s.} \alpha$$

The extension to random vectors is analogous to that for convergence in probability: $\mathbf{z}_n \to_{m.s.} \boldsymbol{\alpha}$ if each element of $\mathbf{z}_n$ converges in mean square to the corresponding element of $\boldsymbol{\alpha}$.
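For the i.i.d. sample mean, mean-square convergence can be verified directly, and Chebyshev's inequality then links it to convergence in probability (a standard argument, spelled out here for completeness):

```latex
E\big[(\bar{z}_n - \mu)^2\big] = \operatorname{Var}(\bar{z}_n) = \frac{\sigma^2}{n} \longrightarrow 0
\quad\Rightarrow\quad \bar{z}_n \to_{m.s.} \mu,
\qquad
P(|\bar{z}_n - \mu| > \varepsilon)
\le \frac{E\big[(\bar{z}_n - \mu)^2\big]}{\varepsilon^2}
= \frac{\sigma^2}{n\varepsilon^2} \longrightarrow 0,
```

so convergence in mean square implies convergence in probability.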
Modes of convergence - Convergence to a Random Variable
In the above definitions of convergence, the limit is a constant. However, the limit can also be a random variable.

We say that a sequence of K-dimensional random vectors {zn} converges in probability to a K-dimensional random vector z, written zn →p z, if {zn − z} converges in probability to 0:

$$\mathbf{z}_n \to_p \mathbf{z} \quad\text{is the same as}\quad \mathbf{z}_n - \mathbf{z} \to_p \mathbf{0}$$

Similarly,

$$\mathbf{z}_n \to_{a.s.} \mathbf{z} \quad\text{is the same as}\quad \mathbf{z}_n - \mathbf{z} \to_{a.s.} \mathbf{0},$$

$$\mathbf{z}_n \to_{m.s.} \mathbf{z} \quad\text{is the same as}\quad \mathbf{z}_n - \mathbf{z} \to_{m.s.} \mathbf{0}.$$
Modes of convergence - Convergence in distribution
Convergence in distribution:
Let {zn} be a sequence of random scalars and Fn the cumulative distribution function (c.d.f.) of zn.
We say that {zn} converges in distribution to a random scalar z if the c.d.f. Fn of zn converges to the c.d.f. F of z at every continuity point of F.
We write "zn →d z" or "zn →L z" and call F the asymptotic (or limit or limiting) distribution of zn.
Modes of convergence - Convergence in distribution (cont'd)
Convergence in probability is stronger than convergence in distribution, i.e.,

$$z_n \to_p z \;\Rightarrow\; z_n \to_d z$$
A special case of convergence in distribution is that z is a constant (a trivial random variable).
The extension to a sequence of random vectors is immediate: zn →d z if the joint c.d.f. Fn of the random vector zn converges to the joint c.d.f. F of z at every continuity point of F.
Note: for convergence in distribution, unlike the other concepts of convergence, element-by-element convergence does not necessarily imply convergence of the vector sequence.
Weak Law of Large Numbers (WLLN) according to Khinchine
Let {zi} be i.i.d. with E(zi) = µ and let $\bar{z}_n = \frac{1}{n}\sum_{i=1}^n z_i$. Then

$$\bar{z}_n \to_p \mu, \quad\text{i.e.}\quad \lim_{n \to \infty} P(|\bar{z}_n - \mu| > \varepsilon) = 0 \quad\text{or}\quad \operatorname{plim} \bar{z}_n = \mu$$
Extensions of the Weak Law of Large Numbers (WLLN)
The WLLN holds for:
Extension (1): Multivariate Extension (sequence of random vectors {zi})
Extension (2): Relaxation of independence
Extension (3): Functions of random variables h(zi)
Extension (4): Vector valued functions f(zi)
Central Limit Theorems (Lindeberg-Levy)
Let {zi} be i.i.d. with E(zi) = µ and Var(zi) = σ². Then for $\bar{z}_n = \frac{1}{n}\sum_{i=1}^n z_i$:

$$\sqrt{n}\,(\bar{z}_n - \mu) \to_d N(0, \sigma^2)$$

or, equivalently,

$$\bar{z}_n - \mu \overset{a}{\sim} N\!\left(0, \frac{\sigma^2}{n}\right) \quad\text{or}\quad \bar{z}_n \overset{a}{\sim} N\!\left(\mu, \frac{\sigma^2}{n}\right)$$

Remark: read $\overset{a}{\sim}$ as 'approximately distributed as'.
CLT also holds for multivariate extension: sequence of random vectors {zi}
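A quick simulation of the CLT (a numpy sketch, not from the slides; the exponential distribution is an illustrative choice, picked to show that normality of the data is not required):

```python
import numpy as np

rng = np.random.default_rng(2)
reps, n = 20_000, 400
mu = 1.0   # mean (and sd) of the Exponential(1) distribution

# Even though each z_i is strongly skewed, sqrt(n)*(zbar - mu) behaves
# like N(0, sigma^2), with sigma^2 = 1 here.
draws = rng.exponential(mu, size=(reps, n))
z = np.sqrt(n) * (draws.mean(axis=1) - mu)

# Tail mass beyond +/-1.96 should each be near 2.5%, as for N(0,1).
print(np.mean(z > 1.96), np.mean(z < -1.96))
```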
Useful lemmas of large sample theory
Lemma 1 (continuous mapping): if zn →p α and a(·) is a continuous function that does not depend on n, then

$$a(z_n) \to_p a(\alpha), \quad\text{i.e.}\quad \operatorname*{plim}_{n\to\infty} a(z_n) = a\!\left(\operatorname*{plim}_{n\to\infty} z_n\right)$$

Examples:

$$x_n \to_p \alpha \;\Rightarrow\; \ln(x_n) \to_p \ln(\alpha)$$

$$x_n \to_p \beta \text{ and } y_n \to_p \gamma \;\Rightarrow\; x_n + y_n \to_p \beta + \gamma$$

$$\mathbf{Y}_n \to_p \boldsymbol{\Gamma} \;\Rightarrow\; \mathbf{Y}_n^{-1} \to_p \boldsymbol{\Gamma}^{-1}$$
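A numeric check of the first example (a numpy sketch with illustrative values, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(3)
alpha = 4.0   # E(x_i); ln is continuous at alpha > 0

# xbar_n ->p alpha, so by Lemma 1 ln(xbar_n) ->p ln(alpha).
for n in (100, 10_000, 1_000_000):
    xbar = rng.normal(alpha, 2.0, size=n).mean()
    print(n, np.log(xbar))
print(np.log(alpha))   # the probability limit
```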
Useful lemmas of large sample theory (continued)
Lemma 2: if zn →d z and a(·) is a continuous function, then

$$a(z_n) \to_d a(z)$$

Example: $z_n \to_d z$ with $z \sim N(0,1)$ implies $z_n^2 \to_d z^2 \sim \chi^2(1)$; in short,

$$z_n \to_d N(0,1) \;\Rightarrow\; z_n^2 \to_d \chi^2(1)$$
Useful lemmas of large sample theory (continued)
Lemma 3: if xn →d x and yn →p α, then

$$x_n + y_n \to_d x + \alpha$$

Examples:

$$x_n \to_d N(0,1),\; y_n \to_p \alpha \;\Rightarrow\; x_n + y_n \to_d N(\alpha, 1)$$

$$x_n \to_d x,\; y_n \to_p 0 \;\Rightarrow\; x_n + y_n \to_d x$$

Lemma 4: if xn →d x and yn →p 0, then

$$x_n \cdot y_n \to_p 0$$
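Lemmas of this kind justify the studentized mean: since sn →p σ, replacing σ by sn does not change the limit distribution. A numpy sketch (the shifted-exponential design is an illustrative choice, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(4)
mu, reps, n = 5.0, 10_000, 300

# Data with mean mu and sd 1 (shifted exponential, deliberately non-normal).
draws = rng.exponential(1.0, size=(reps, n)) + (mu - 1.0)

# sqrt(n)(zbar - mu)/sigma ->d N(0,1); s_n ->p sigma = 1, so the
# studentized ratio has the same N(0,1) limit.
t = np.sqrt(n) * (draws.mean(axis=1) - mu) / draws.std(axis=1, ddof=1)
print(np.mean(np.abs(t) > 1.96))   # near 0.05
```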
Useful lemmas of large sample theory (continued)
Lemma 5: if xn →d x and An →p A, then

$$\mathbf{A}_n \mathbf{x}_n \to_d \mathbf{A}\mathbf{x}$$

Example:

$$\mathbf{x}_n \to_d MVN(\mathbf{0}, \boldsymbol{\Sigma}) \;\Rightarrow\; \mathbf{A}_n \mathbf{x}_n \to_d MVN(\mathbf{0}, \mathbf{A}\boldsymbol{\Sigma}\mathbf{A}')$$

Lemma 6: if xn →d x and An →p A (A nonsingular), then

$$\mathbf{x}_n' \mathbf{A}_n^{-1} \mathbf{x}_n \to_d \mathbf{x}' \mathbf{A}^{-1} \mathbf{x}$$
8. Time Series Basics (Stationarity and Ergodicity)
Hayashi p. 97-107
Dependence in the data
In time series analysis the data exhibit a certain degree of dependence, and only one realization of the data-generating process is observed.
The CLT and WLLN above rely on i.i.d. data, but real-world data are dependent.
Examples:
Inflation rate
Stock market returns
Stochastic process: a sequence of random variables indexed by time, {z1, z2, z3, ...} or {zi} with i = 1, 2, ...
A realization/sample path: One possible outcome of the process
Dependence in the data - theoretical consideration
If we were able to 'run the world several times', we would have different realizations of the process at each point in time.
⇒ We could compute ensemble means and apply the WLLN.
As such repetition is not possible, we take the mean over the one realization of the process.
Key question: does

$$\frac{1}{T}\sum_{t=1}^{T} x_t \to_p E(x_t)$$

hold?
Condition: stationarity of the process
Definition of stationarity
Strict stationarity:

The joint distribution of $(z_i, z_{i_1}, z_{i_2}, \dots, z_{i_r})$ depends only on the relative positions $i_1 - i, i_2 - i, \dots, i_r - i$, but not on i itself.

In other words: the joint distribution of $(z_i, z_{i_r})$ is the same as the joint distribution of $(z_j, z_{j_r})$ if $i - i_r = j - j_r$.

Weak stationarity:

- E(zi) does not depend on i
- Cov(zi, zi−j) depends on j (the distance), but not on i (the absolute position)
Ergodicity
A stationary process is called ergodic if

$$\lim_{n \to \infty} E\left[ f(z_i, z_{i+1}, \dots, z_{i+k}) \cdot g(z_{i+n}, z_{i+n+1}, \dots, z_{i+n+l}) \right] = E\left[ f(z_i, z_{i+1}, \dots, z_{i+k}) \right] \cdot E\left[ g(z_{i+n}, z_{i+n+1}, \dots, z_{i+n+l}) \right]$$

Ergodic Theorem: if the sequence {zi} is stationary and ergodic with E(zi) = µ, then

$$\bar{z}_n \equiv \frac{1}{n}\sum_{i=1}^n z_i \to_{a.s.} \mu$$
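The Ergodic Theorem in action for a single realization of a stationary process (a numpy sketch; the AR(1) specification and its parameters are illustrative choices, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(5)
mu, rho, T = 2.0, 0.8, 200_000

# Stationary, ergodic AR(1): z_t = mu + rho*(z_{t-1} - mu) + e_t.
z = np.empty(T)
z[0] = mu
for t in range(1, T):
    z[t] = mu + rho * (z[t - 1] - mu) + rng.normal()

# The time average over ONE path converges to the ensemble mean mu.
print(z.mean())
```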
Martingale difference sequence
Stationarity and ergodicity alone are not enough for applying the CLT. To derive the CAN property of the OLS estimator we assume that

$$\mathbf{g}_i = \mathbf{x}_i \varepsilon_i$$

is a stationary and ergodic martingale difference sequence (m.d.s.):

$$E(\mathbf{g}_i \mid \mathbf{g}_{i-1}, \mathbf{g}_{i-2}, \dots, \mathbf{g}_1) = \mathbf{0} \quad\Rightarrow\quad E(\mathbf{g}_i) = \mathbf{0}$$

Implication of the m.d.s. assumption when 1 ∈ xi: εi and εi−j are uncorrelated, i.e. Cov(εi, εi−j) = 0.
Large sample assumptions for the OLS estimator
(2.1) Linearity: yi = x′iβ + εi for all i = 1, 2, ..., n

(2.2) Ergodic stationarity: the (K + 1)-dimensional vector stochastic process {yi, xi} is jointly stationary and ergodic

(2.3) Orthogonality/predetermined regressors: E(xik · εi) = 0 for all k. If xik = 1, then E(εi) = 0 and hence Cov(xik, εi) = 0. This can be written as E[xi · (yi − x′iβ)] = 0 or E(gi) = 0, where gi ≡ xi · εi.

(2.4) Rank condition: the K × K matrix E(xix′i) ≡ Σxx is nonsingular
Large sample assumptions for the OLS estimator (cont’d)
(2.5) Martingale difference sequence (m.d.s.): {gi} is a martingale difference sequence with finite second moments. It follows that:

i. E(gi) = 0,

ii. the K × K matrix of cross moments E(gig′i) is nonsingular,

iii. $\mathbf{S} \equiv \operatorname{Avar}(\bar{\mathbf{g}}) = E(\mathbf{g}_i\mathbf{g}_i')$, where $\bar{\mathbf{g}} \equiv \frac{1}{n}\sum_i \mathbf{g}_i$ and $\operatorname{Avar}(\bar{\mathbf{g}})$ is the variance of the asymptotic distribution of $\sqrt{n}\,\bar{\mathbf{g}}$.
See Hayashi pp. 109-113
Large sample distribution of the OLS estimator
For b = (X′X)−1X′y we can write

$$\mathbf{b}_n = \left[ \frac{1}{n}\sum_{i=1}^n \mathbf{x}_i\mathbf{x}_i' \right]^{-1} \frac{1}{n}\sum_{i=1}^n \mathbf{x}_i y_i$$

where the subscript n indicates the dependence on the sample size.

Under the WLLN and Lemma 1:

$$\mathbf{b}_n \to_p \boldsymbol{\beta}$$

$$\sqrt{n}(\mathbf{b}_n - \boldsymbol{\beta}) \to_d MVN(\mathbf{0}, \operatorname{Avar}(\mathbf{b})) \quad\text{or}\quad \mathbf{b}_n \overset{a}{\sim} MVN\!\left(\boldsymbol{\beta}, \frac{\operatorname{Avar}(\mathbf{b})}{n}\right)$$

⇒ bn is consistent and asymptotically normal (CAN)
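A simulation sketch of the CAN property (numpy; the design, parameter values, and seed are illustrative choices, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(6)
beta = np.array([1.0, 2.0])
reps, n = 2_000, 500

# y_i = x_i'beta + eps_i with E(x_i eps_i) = 0; compute b in each replication.
b = np.empty((reps, 2))
for r in range(reps):
    X = np.column_stack([np.ones(n), rng.normal(size=n)])
    y = X @ beta + rng.normal(size=n)
    b[r] = np.linalg.solve(X.T @ X, X.T @ y)

print(b.mean(axis=0))              # centered at beta (consistency)
print(np.sqrt(n) * b.std(axis=0))  # spread of sqrt(n)(b - beta)
```

With this design Σxx = I and σ² = 1, so both scaled standard deviations should come out near 1.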
How to estimate Avar(b)
$$\operatorname{Avar}(\mathbf{b}) = \boldsymbol{\Sigma}_{xx}^{-1} E(\mathbf{g}_i\mathbf{g}_i') \boldsymbol{\Sigma}_{xx}^{-1} \quad\text{with}\quad \mathbf{g}_i = \mathbf{x}_i\varepsilon_i$$

First ingredient: $\frac{1}{n}\sum_{i=1}^n \mathbf{x}_i\mathbf{x}_i' \to_p E(\mathbf{x}_i\mathbf{x}_i')$

Estimation of E(gig′i):

$$\hat{\mathbf{S}} = \frac{1}{n}\sum_{i=1}^n e_i^2\, \mathbf{x}_i\mathbf{x}_i' \to_p E(\mathbf{g}_i\mathbf{g}_i')$$

$$\Rightarrow \widehat{\operatorname{Avar}}(\mathbf{b}) = \left[ \frac{1}{n}\sum_{i=1}^n \mathbf{x}_i\mathbf{x}_i' \right]^{-1} \hat{\mathbf{S}} \left[ \frac{1}{n}\sum_{i=1}^n \mathbf{x}_i\mathbf{x}_i' \right]^{-1} \to_p \operatorname{Avar}(\mathbf{b}) = E(\mathbf{x}_i\mathbf{x}_i')^{-1} E(\mathbf{g}_i\mathbf{g}_i') E(\mathbf{x}_i\mathbf{x}_i')^{-1}$$
Developing a test statistic under the assumption of conditional homoskedasticity
Assumption: $E(\varepsilon_i^2 \mid \mathbf{x}_i) = \sigma^2$. Then

$$\widehat{\operatorname{Avar}}(\mathbf{b}) = \left[ \frac{1}{n}\sum_{i=1}^n \mathbf{x}_i\mathbf{x}_i' \right]^{-1} \sigma^2\, \frac{1}{n}\sum_{i=1}^n \mathbf{x}_i\mathbf{x}_i' \left[ \frac{1}{n}\sum_{i=1}^n \mathbf{x}_i\mathbf{x}_i' \right]^{-1} = \sigma^2 \left[ \frac{1}{n}\sum_{i=1}^n \mathbf{x}_i\mathbf{x}_i' \right]^{-1}$$

with $\hat{\mathbf{S}} = \left( \frac{1}{n}\sum_{i=1}^n e_i^2 \right) \frac{1}{n}\sum_{i=1}^n \mathbf{x}_i\mathbf{x}_i'$

Note: $\frac{1}{n}\sum_{i=1}^n e_i^2$ is a biased (though consistent) estimate of σ².
White standard errors
Adjusting the test statistics to make them robust against violations of conditional homoskedasticity:
t-ratio:

$$t_k = \frac{b_k - \bar\beta_k}{\sqrt{ \left[ \frac{1}{n} \left[ \frac{1}{n}\sum_{i=1}^n \mathbf{x}_i\mathbf{x}_i' \right]^{-1} \left( \frac{1}{n}\sum_{i=1}^n e_i^2\, \mathbf{x}_i\mathbf{x}_i' \right) \left[ \frac{1}{n}\sum_{i=1}^n \mathbf{x}_i\mathbf{x}_i' \right]^{-1} \right]_{kk} }} \overset{a}{\sim} N(0,1)$$

This holds under $H_0: \beta_k = \bar\beta_k$.

Wald statistic (replacing the F-ratio):

$$W = (\mathbf{R}\mathbf{b} - \mathbf{r})' \left[ \mathbf{R}\, \frac{\widehat{\operatorname{Avar}}(\mathbf{b})}{n}\, \mathbf{R}' \right]^{-1} (\mathbf{R}\mathbf{b} - \mathbf{r}) \overset{a}{\sim} \chi^2(\#\mathbf{r})$$

This holds under $H_0: \mathbf{R}\boldsymbol{\beta} - \mathbf{r} = \mathbf{0}$; the Wald principle also extends to nonlinear restrictions on β.
We show that bn = (X′X)−1X′y is consistent
$$\mathbf{b}_n = \left[ \frac{1}{n}\sum_{i=1}^n \mathbf{x}_i\mathbf{x}_i' \right]^{-1} \frac{1}{n}\sum_{i=1}^n \mathbf{x}_i y_i$$

$$\Rightarrow \underbrace{\mathbf{b}_n - \boldsymbol{\beta}}_{\text{sampling error}} = \left[ \frac{1}{n}\sum \mathbf{x}_i\mathbf{x}_i' \right]^{-1} \frac{1}{n}\sum \mathbf{x}_i \varepsilon_i$$

We show: bn →p β.

When the sequence {yi, xi} allows application of the WLLN:

$$\frac{1}{n}\sum_{i=1}^n \mathbf{x}_i\mathbf{x}_i' \to_p E(\mathbf{x}_i\mathbf{x}_i'), \qquad \frac{1}{n}\sum_{i=1}^n \mathbf{x}_i\varepsilon_i \to_p E(\mathbf{x}_i\varepsilon_i) = \mathbf{0}$$
We show that bn = (X′X)−1X′y is consistent (continued)
Lemma 1 then implies:

$$\mathbf{b}_n - \boldsymbol{\beta} = \left[ \frac{1}{n}\sum \mathbf{x}_i\mathbf{x}_i' \right]^{-1} \frac{1}{n}\sum \mathbf{x}_i\varepsilon_i \to_p E(\mathbf{x}_i\mathbf{x}_i')^{-1} E(\mathbf{x}_i\varepsilon_i) = E(\mathbf{x}_i\mathbf{x}_i')^{-1} \cdot \mathbf{0} = \mathbf{0}$$

Hence bn = (X′X)−1X′y is consistent.
We show that bn = (X′X)−1X′y is asymptotically normal
The sequence {gi} = {xiεi} allows applying the CLT to $\bar{\mathbf{g}} = \frac{1}{n}\sum \mathbf{x}_i\varepsilon_i$:

$$\sqrt{n}\,(\bar{\mathbf{g}} - E(\mathbf{g}_i)) \to_d MVN(\mathbf{0}, E(\mathbf{g}_i\mathbf{g}_i'))$$

$$\sqrt{n}(\mathbf{b}_n - \boldsymbol{\beta}) = \left[ \frac{1}{n}\sum \mathbf{x}_i\mathbf{x}_i' \right]^{-1} \sqrt{n}\,\bar{\mathbf{g}}$$

Applying Lemma 5 with

$$\mathbf{A}_n = \left[ \frac{1}{n}\sum \mathbf{x}_i\mathbf{x}_i' \right]^{-1} \to_p \mathbf{A} = \boldsymbol{\Sigma}_{xx}^{-1}, \qquad \mathbf{x}_n = \sqrt{n}\,\bar{\mathbf{g}} \to_d MVN(\mathbf{0}, E(\mathbf{g}_i\mathbf{g}_i'))$$

$$\Rightarrow \sqrt{n}(\mathbf{b}_n - \boldsymbol{\beta}) \to_d MVN(\mathbf{0}, \boldsymbol{\Sigma}_{xx}^{-1} E(\mathbf{g}_i\mathbf{g}_i') \boldsymbol{\Sigma}_{xx}^{-1})$$

⇒ bn is CAN
9. Generalized Least Squares
Hayashi p. 54-59
Assumptions of GLS
Linearity: yi = x′iβ + εi
Full rank: rank(X) = K
Strict exogeneity: E(εi|X) = 0
⇒ E(εi) = 0 and Cov(εi, xik) = E(εixik) = 0
NOT assumed: Var(ε|X) = σ²In
Instead:
$$\operatorname{Var}(\boldsymbol{\varepsilon}|\mathbf{X}) = E(\boldsymbol{\varepsilon}\boldsymbol{\varepsilon}'|\mathbf{X}) = \begin{pmatrix} \operatorname{Var}(\varepsilon_1|\mathbf{X}) & \operatorname{Cov}(\varepsilon_1,\varepsilon_2|\mathbf{X}) & \cdots & \operatorname{Cov}(\varepsilon_1,\varepsilon_n|\mathbf{X}) \\ \operatorname{Cov}(\varepsilon_1,\varepsilon_2|\mathbf{X}) & \operatorname{Var}(\varepsilon_2|\mathbf{X}) & & \vdots \\ \vdots & & \ddots & \vdots \\ \operatorname{Cov}(\varepsilon_1,\varepsilon_n|\mathbf{X}) & \cdots & & \operatorname{Var}(\varepsilon_n|\mathbf{X}) \end{pmatrix} = \sigma^2 \mathbf{V}(\mathbf{X})$$
Deriving the GLS estimator
The GLS estimator is derived under the assumption that V(X) is known, symmetric and positive definite, so that

$$\mathbf{V}(\mathbf{X})^{-1} = \mathbf{C}'\mathbf{C}$$

Transformation: let $\tilde{\mathbf{y}} = \mathbf{C}\mathbf{y}$ and $\tilde{\mathbf{X}} = \mathbf{C}\mathbf{X}$. Then

$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon} \;\Rightarrow\; \mathbf{C}\mathbf{y} = \mathbf{C}\mathbf{X}\boldsymbol{\beta} + \mathbf{C}\boldsymbol{\varepsilon} \;\Rightarrow\; \tilde{\mathbf{y}} = \tilde{\mathbf{X}}\boldsymbol{\beta} + \tilde{\boldsymbol{\varepsilon}}$$
Least squares estimation of β using transformed data
$$\hat{\boldsymbol{\beta}}_{GLS} = (\tilde{\mathbf{X}}'\tilde{\mathbf{X}})^{-1}\tilde{\mathbf{X}}'\tilde{\mathbf{y}} = (\mathbf{X}'\mathbf{C}'\mathbf{C}\mathbf{X})^{-1}\mathbf{X}'\mathbf{C}'\mathbf{C}\mathbf{y} = \left( \mathbf{X}'\tfrac{1}{\sigma^2}\mathbf{V}^{-1}\mathbf{X} \right)^{-1} \mathbf{X}'\tfrac{1}{\sigma^2}\mathbf{V}^{-1}\mathbf{y} = \left[ \mathbf{X}'[\operatorname{Var}(\boldsymbol{\varepsilon}|\mathbf{X})]^{-1}\mathbf{X} \right]^{-1} \mathbf{X}'[\operatorname{Var}(\boldsymbol{\varepsilon}|\mathbf{X})]^{-1}\mathbf{y}$$
The GLS estimator is the best linear unbiased estimator (BLUE).

Problems:

- It is difficult to work out the asymptotic properties of β̂GLS.
- In real-world applications Var(ε|X) is not known.
- If Var(ε|X) is estimated, the BLUE property of β̂GLS is lost.
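A numpy sketch of the transformation approach for a known V(X) (Cholesky-based; the design, variance specification, and seed are illustrative choices, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(8)
n, beta = 400, np.array([1.0, 2.0])

X = np.column_stack([np.ones(n), rng.normal(size=n)])
V = np.diag(0.2 + X[:, 1]**2)              # assumed KNOWN Var(eps|X) (sigma^2 = 1)
eps = np.linalg.cholesky(V) @ rng.normal(size=n)
y = X @ beta + eps

# Choose C with C'C = V^{-1}: for the Cholesky factor V = L L', take C = L^{-1}.
C = np.linalg.inv(np.linalg.cholesky(V))
Xt, yt = C @ X, C @ y                      # transformed data are homoskedastic
b_gls = np.linalg.solve(Xt.T @ Xt, Xt.T @ yt)
print(b_gls)
```

For a diagonal V this reduces to the weighted least squares of the next slide; in production code the explicit inverse of L would be replaced by a triangular solve.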
Special case of GLS - weighted least squares
$$E(\boldsymbol{\varepsilon}\boldsymbol{\varepsilon}'|\mathbf{X}) = \operatorname{Var}(\boldsymbol{\varepsilon}|\mathbf{X}) = \sigma^2 \begin{pmatrix} V_1(\mathbf{X}) & 0 & \cdots & 0 \\ 0 & V_2(\mathbf{X}) & & \vdots \\ \vdots & & \ddots & 0 \\ 0 & \cdots & 0 & V_n(\mathbf{X}) \end{pmatrix} = \sigma^2 \mathbf{V}$$

Since $\mathbf{V}(\mathbf{X})^{-1} = \mathbf{C}'\mathbf{C}$:

$$\mathbf{C} = \begin{pmatrix} \frac{1}{\sqrt{V_1(\mathbf{X})}} & 0 & \cdots & 0 \\ 0 & \frac{1}{\sqrt{V_2(\mathbf{X})}} & & \vdots \\ \vdots & & \ddots & 0 \\ 0 & \cdots & 0 & \frac{1}{\sqrt{V_n(\mathbf{X})}} \end{pmatrix} = \begin{pmatrix} \frac{1}{s_1} & 0 & \cdots & 0 \\ 0 & \frac{1}{s_2} & & \vdots \\ \vdots & & \ddots & 0 \\ 0 & \cdots & 0 & \frac{1}{s_n} \end{pmatrix}$$

$$\Rightarrow \arg\min_{\boldsymbol{\beta}} \sum_{i=1}^n \left( \frac{y_i}{s_i} - \beta_1 \frac{1}{s_i} - \beta_2 \frac{x_{i2}}{s_i} - \dots - \beta_K \frac{x_{iK}}{s_i} \right)^2$$

Observations are weighted by their standard deviation.
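Equivalently, in code: divide yi and every regressor by si and run OLS on the weighted data (a numpy sketch; the choice of si and the design are illustrative, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(9)
n, beta = 2_000, np.array([1.0, 2.0])

X = np.column_stack([np.ones(n), rng.normal(size=n)])
s = 0.5 + np.abs(X[:, 1])            # known conditional std. dev. s_i
y = X @ beta + s * rng.normal(size=n)

# WLS = OLS on rows divided by s_i.
Xw, yw = X / s[:, None], y / s
b_wls = np.linalg.solve(Xw.T @ Xw, Xw.T @ yw)
print(b_wls)
```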
10. Multicollinearity
Exact multicollinearity
One regressor can be expressed as a linear combination of the other regressor(s):

rank(X) ≠ K: no full rank

⇒ Assumption 1.3 or 2.4 is violated

⇒ (X′X)−1 does not exist
In practice, economic variables are often correlated to some degree (near multicollinearity). In that case:

- the BLUE result is not affected
- large sample results are not affected
- relative results are unaffected, but Var(b|X) is affected in absolute terms
Effects of Multicollinearity and solutions to the problem
Effects:
- Coefficients may have high standard errors and low significance levels
- Estimates may have the wrong sign
- Small changes in the data produce wide swings in the parameter estimates
Solutions:
- Increasing precision by collecting more data. (Costly!)
- Building a better fitting model that leaves less unexplained.
- Excluding some regressors. (Dangerous! Omitted variable bias!)