lecture 11 - pkumwfy.gsm.pku.edu.cn/miao_files/probstat/lecture11.pdf2 covariance and...
Lecture 11
- Covariance and correlation
- The Sample Mean
Covariance and Correlation
- Covariance: a measure of the association between two random variables.
Let $X$ and $Y$ be random variables having a specified joint distribution, and let $E(X)=\mu_X$, $E(Y)=\mu_Y$. The covariance of $X$ and $Y$ is defined as
$$\operatorname{Cov}(X,Y) = E[(X-\mu_X)(Y-\mu_Y)],$$
if the expectation exists.
- It can be shown (as an exercise) that if both $X$ and $Y$ have finite variance, then the expectation exists.
Correlation
- If $0<\sigma_X^2<\infty$ and $0<\sigma_Y^2<\infty$, the correlation of $X$ and $Y$ is defined as
$$\rho(X,Y) = \frac{\operatorname{Cov}(X,Y)}{\sigma_X\,\sigma_Y}.$$
- The range of possible values of the correlation is
$$-1 \le \rho(X,Y) \le 1.$$
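As a small numerical illustration of the two definitions above, the following R sketch computes $\operatorname{Cov}(X,Y)$ and $\rho(X,Y)$ directly from a joint p.f. The four-point probability table is a made-up assumption chosen only for illustration; it does not come from the lecture.

```r
# Hypothetical joint p.f. of (X, Y): one row per support point, probabilities sum to 1.
joint <- data.frame(
  x    = c(0, 0, 1, 1),
  y    = c(0, 1, 0, 1),
  prob = c(0.4, 0.1, 0.1, 0.4)
)

mu_x <- sum(joint$x * joint$prob)                    # E(X)
mu_y <- sum(joint$y * joint$prob)                    # E(Y)

# Cov(X, Y) = E[(X - mu_X)(Y - mu_Y)]
cov_xy <- sum((joint$x - mu_x) * (joint$y - mu_y) * joint$prob)

sd_x <- sqrt(sum((joint$x - mu_x)^2 * joint$prob))   # sigma_X
sd_y <- sqrt(sum((joint$y - mu_y)^2 * joint$prob))   # sigma_Y

# rho(X, Y) = Cov(X, Y) / (sigma_X * sigma_Y)
rho <- cov_xy / (sd_x * sd_y)

print(c(cov = cov_xy, rho = rho))
```

For this table the covariance works out to 0.15 and the correlation to 0.6, which lies inside the range $[-1, 1]$ as expected.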
Theorem (Schwarz inequality)
For any random variables $U$ and $V$,
$$[E(UV)]^2 \le E(U^2)\,E(V^2).$$
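- Let $U = X-\mu_X$ and $V = Y-\mu_Y$; then
$$[\operatorname{Cov}(X,Y)]^2 = [E(UV)]^2 \le E(U^2)\,E(V^2) = \sigma_X^2\,\sigma_Y^2 \;\Rightarrow\; [\rho(X,Y)]^2 \le 1 \;\Rightarrow\; -1 \le \rho(X,Y) \le 1.$$
- $X$ and $Y$ are positively correlated: $\rho(X,Y) > 0$.
  $X$ and $Y$ are negatively correlated: $\rho(X,Y) < 0$.
  $X$ and $Y$ are uncorrelated: $\rho(X,Y) = 0$.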
Properties of Covariance and Correlation
Theorem. For any random variables $X$ and $Y$ such that $\sigma_X^2<\infty$ and $\sigma_Y^2<\infty$,
$$\operatorname{Cov}(X,Y) = E(XY) - E(X)E(Y).$$
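Proof.
$$\begin{aligned}
\operatorname{Cov}(X,Y) &= E[(X-\mu_X)(Y-\mu_Y)] \\
&= E(XY - \mu_X Y - \mu_Y X + \mu_X\mu_Y) \\
&= E(XY) - \mu_X E(Y) - \mu_Y E(X) + \mu_X\mu_Y \\
&= E(XY) - E(X)E(Y).
\end{aligned}$$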
- Theorem. If $X$ and $Y$ are independent random variables with $0<\sigma_X^2<\infty$ and $0<\sigma_Y^2<\infty$, then
$$\operatorname{Cov}(X,Y) = \rho(X,Y) = 0.$$
Proof. If $X$ and $Y$ are independent, then $E(XY)=E(X)E(Y)$. Therefore,
$$\operatorname{Cov}(X,Y) = E(XY) - E(X)E(Y) = 0.$$
It follows that $\rho(X,Y) = 0$.
- Remark: Two uncorrelated random variables can be dependent.
Example:
Suppose that $X$ can take only the three values $-1$, $0$, and $1$, and that each of these three values has the same probability. Let $Y$ be defined by $Y = X^2$.
Show that $X$ and $Y$ are dependent but uncorrelated.
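Proof. (1) $X$ and $Y$ are dependent. Let $f$ denote the joint p.f. of $X$ and $Y$, and let $f_1$ and $f_2$ denote the marginal p.f.'s of $X$ and $Y$. Then
$$\begin{aligned}
f(1,1) &= \Pr(X=1 \text{ and } Y=1) = \Pr(X=1 \text{ and } X^2=1) \\
&= \Pr(X=1 \text{ and } (X=1 \text{ or } X=-1)) = \Pr(X=1) = \tfrac{1}{3}, \\
f_1(1) &= \Pr(X=1) = \tfrac{1}{3}, \\
f_2(1) &= \Pr(Y=1) = \Pr(X^2=1) = \Pr(X=1 \text{ or } X=-1) = \tfrac{2}{3}.
\end{aligned}$$
Thus $f(1,1) \ne f_1(1)\,f_2(1)$; that is, $X$ and $Y$ are not independent.
(2) $X$ and $Y$ are uncorrelated. Since $X^3 = X$ for each possible value of $X$,
$$E(XY) = E(X^3) = E(X) = 0 \quad\text{and}\quad E(X)E(Y) = 0 \;\Rightarrow\; \operatorname{Cov}(X,Y) = 0.$$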
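The example above can be checked numerically. The following R sketch enumerates the three equally likely values of $X$, sets $Y = X^2$, and evaluates the covariance together with the joint and marginal probabilities at the point $(1,1)$. The variable names are illustrative choices, not notation from the slides.

```r
x <- c(-1, 0, 1)       # possible values of X
p <- rep(1/3, 3)       # each value has probability 1/3
y <- x^2               # Y = X^2, evaluated at each value of X

# Cov(X, Y) = E(XY) - E(X)E(Y)
cov_xy <- sum(x * y * p) - sum(x * p) * sum(y * p)

f_joint_11 <- sum(p[x == 1 & y == 1])   # f(1,1)  = Pr(X = 1 and Y = 1)
f1_1       <- sum(p[x == 1])            # f_1(1)  = Pr(X = 1)
f2_1       <- sum(p[y == 1])            # f_2(1)  = Pr(Y = 1)

print(cov_xy)                           # covariance of X and Y
print(c(f_joint_11, f1_1 * f2_1))       # joint probability vs. product of marginals
```

The covariance is zero while the joint probability $1/3$ differs from the product of the marginals $2/9$, so the pair is uncorrelated but dependent.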
- Theorem. Suppose $X$ is a random variable with $0<\sigma_X^2<\infty$, and suppose that $Y=aX+b$ where $a \ne 0$. If $a>0$, then $\rho(X,Y)=1$. If $a<0$, then $\rho(X,Y)=-1$.
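Proof. If $Y=aX+b$, then $\mu_Y = a\mu_X + b$ and $Y-\mu_Y = a(X-\mu_X)$. Therefore,
$$\begin{aligned}
\operatorname{Cov}(X,Y) &= a\,E[(X-\mu_X)^2] = a\,\sigma_X^2, \\
\sigma_Y^2 &= a^2\sigma_X^2 \;\Rightarrow\; \sigma_Y = |a|\,\sigma_X,
\end{aligned}$$
and hence
$$\rho(X,Y) = \frac{\operatorname{Cov}(X,Y)}{\sigma_X\,\sigma_Y} = \frac{a\,\sigma_X^2}{|a|\,\sigma_X^2} = \operatorname{sign}(a).$$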
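- Theorem. If $X$ and $Y$ are random variables such that $\operatorname{Var}(X)<\infty$ and $\operatorname{Var}(Y)<\infty$, then
$$\operatorname{Var}(X+Y) = \operatorname{Var}(X) + \operatorname{Var}(Y) + 2\operatorname{Cov}(X,Y).$$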
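Proof. Since $E(X+Y) = \mu_X + \mu_Y$, then
$$\begin{aligned}
\operatorname{Var}(X+Y) &= E[(X+Y-\mu_X-\mu_Y)^2] \\
&= E[(X-\mu_X)^2 + (Y-\mu_Y)^2 + 2(X-\mu_X)(Y-\mu_Y)] \\
&= \operatorname{Var}(X) + \operatorname{Var}(Y) + 2\operatorname{Cov}(X,Y).
\end{aligned}$$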
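- Remark. For any constants $a$ and $b$, we can show that $\operatorname{Cov}(aX,bY) = ab\operatorname{Cov}(X,Y)$. It follows that
$$\begin{aligned}
\operatorname{Var}(aX+bY+c) &= \operatorname{Var}(aX) + \operatorname{Var}(bY) + 2\operatorname{Cov}(aX,bY) \\
&= a^2\operatorname{Var}(X) + b^2\operatorname{Var}(Y) + 2ab\operatorname{Cov}(X,Y).
\end{aligned}$$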
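To see the formula at work, here is a worked instance with made-up numbers. The values $\operatorname{Var}(X)=4$, $\operatorname{Var}(Y)=9$, $\operatorname{Cov}(X,Y)=-2$, $a=2$, $b=-1$, $c=5$ are assumptions for illustration only.

```latex
\begin{aligned}
\operatorname{Var}(2X - Y + 5)
  &= 2^2\operatorname{Var}(X) + (-1)^2\operatorname{Var}(Y) + 2(2)(-1)\operatorname{Cov}(X,Y) \\
  &= 4\cdot 4 + 1\cdot 9 + (-4)(-2) \\
  &= 16 + 9 + 8 = 33.
\end{aligned}
```

Note that the additive constant $c$ plays no role in the result, and that a negative covariance combined with a negative coefficient increases the variance.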
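- Theorem. If $X_1,\dots,X_n$ are random variables such that $\operatorname{Var}(X_i)<\infty$ for $i=1,\dots,n$, then
$$\operatorname{Var}\!\left(\sum_{i=1}^{n} X_i\right) = \sum_{i=1}^{n}\operatorname{Var}(X_i) + 2\mathop{\sum\sum}_{i<j}\operatorname{Cov}(X_i,X_j).$$
Proof. For any random variable $Y$, $\operatorname{Cov}(Y,Y)=\operatorname{Var}(Y)$. Hence
$$\begin{aligned}
\operatorname{Var}\!\left(\sum_{i=1}^{n} X_i\right)
&= \operatorname{Cov}\!\left(\sum_{i=1}^{n} X_i,\ \sum_{j=1}^{n} X_j\right)
 = \sum_{i=1}^{n}\sum_{j=1}^{n}\operatorname{Cov}(X_i,X_j) \\
&= \sum_{i=1}^{n}\operatorname{Cov}(X_i,X_i) + \mathop{\sum\sum}_{i\ne j}\operatorname{Cov}(X_i,X_j) \\
&= \sum_{i=1}^{n}\operatorname{Var}(X_i) + 2\mathop{\sum\sum}_{i<j}\operatorname{Cov}(X_i,X_j).
\end{aligned}$$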
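For concreteness, the case $n=3$ of the theorem above can be written out in full (a direct specialization, with each pair $i<j$ appearing exactly once):

```latex
\operatorname{Var}(X_1+X_2+X_3)
  = \operatorname{Var}(X_1)+\operatorname{Var}(X_2)+\operatorname{Var}(X_3)
  + 2\left[\operatorname{Cov}(X_1,X_2)+\operatorname{Cov}(X_1,X_3)+\operatorname{Cov}(X_2,X_3)\right].
```

In general there are $n$ variance terms and $n(n-1)/2$ distinct covariance terms.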
- Remark. If $X_1,\dots,X_n$ are uncorrelated random variables, then
$$\operatorname{Var}\!\left(\sum_{i=1}^{n} X_i\right) = \sum_{i=1}^{n}\operatorname{Var}(X_i).$$
Markov Inequality
- Theorem (Markov inequality). Suppose that $X$ is a random variable such that $\Pr(X \ge 0) = 1$. Then for any given number $t>0$,
$$\Pr(X \ge t) \le \frac{E(X)}{t}.$$
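Markov Inequality
Proof. Assume for convenience that $X$ has a discrete distribution with p.f. $f$. Then,
$$\begin{aligned}
E(X) &= \sum_{x} x f(x) = \sum_{x<t} x f(x) + \sum_{x\ge t} x f(x) \\
\Rightarrow\; E(X) &\ge \sum_{x\ge t} x f(x) \ge \sum_{x\ge t} t f(x) = t\,\Pr(X\ge t) \\
\Rightarrow\; \Pr(X\ge t) &\le \frac{E(X)}{t}.
\end{aligned}$$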
- Remark. The Markov inequality is primarily of interest for large values of $t$. For example, for any nonnegative random variable $X$ whose mean is 1, the maximum possible value of $\Pr(X \ge 100)$ is 0.01.
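One way to see that the value 0.01 is actually attained is to exhibit a distribution that meets the bound. The two-point distribution below is an illustrative assumption, not part of the slides: let $X$ equal 100 with probability 0.01 and 0 with probability 0.99. Then

```latex
E(X) = 100\,(0.01) + 0\,(0.99) = 1,
\qquad
\Pr(X \ge 100) = 0.01 = \frac{E(X)}{100}.
```

So the Markov bound holds with equality here and cannot be improved without further assumptions on $X$.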
- Theorem (Chebyshev inequality). Let $X$ be a random variable for which $\operatorname{Var}(X)$ exists. Then for any given number $t>0$,
$$\Pr(|X - E(X)| \ge t) \le \frac{\operatorname{Var}(X)}{t^2}.$$
Proof. Let $Y = [X - E(X)]^2$. Then $\Pr(Y \ge 0) = 1$ and $E(Y) = \operatorname{Var}(X)$. By applying the Markov inequality to $Y$, we have
$$\Pr(|X - E(X)| \ge t) = \Pr(Y \ge t^2) \le \frac{E(Y)}{t^2} = \frac{\operatorname{Var}(X)}{t^2}.$$
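- Remark. Suppose $\operatorname{Var}(X) = \sigma^2$. Then
$$\begin{aligned}
\Pr(|X - E(X)| \ge 2\sigma) &\le \frac{\sigma^2}{(2\sigma)^2} = \frac{1}{4}, \\
\Pr(|X - E(X)| \ge 3\sigma) &\le \frac{\sigma^2}{(3\sigma)^2} = \frac{1}{9}.
\end{aligned}$$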
Properties of the Sample Mean
- Suppose that $X_1,\dots,X_n$ form a random sample of size $n$ from some distribution for which the mean is $\mu$ and the variance is $\sigma^2$. Let
$$\bar{X}_n = \frac{1}{n}(X_1 + \cdots + X_n).$$
This random variable is called the sample mean.
- Question: $E(\bar{X}_n) = {?}$ $\quad \operatorname{Var}(\bar{X}_n) = {?}$ $\quad \Pr(|\bar{X}_n - \mu| \ge t) \le {?}$
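$$\begin{aligned}
E(\bar{X}_n) &= \frac{1}{n}\sum_{i=1}^{n} E(X_i) = \frac{1}{n}\cdot n\mu = \mu, \\
\operatorname{Var}(\bar{X}_n) &= \frac{1}{n^2}\operatorname{Var}\!\left(\sum_{i=1}^{n} X_i\right) = \frac{1}{n^2}\sum_{i=1}^{n}\operatorname{Var}(X_i) = \frac{1}{n^2}\cdot n\sigma^2 = \frac{\sigma^2}{n}, \\
\Pr(|\bar{X}_n - \mu| \ge t) &\le \frac{\sigma^2}{n t^2}.
\end{aligned}$$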
Example
- Suppose a random sample is to be taken from a distribution for which the mean $\mu$ is unknown, but the standard deviation is known to be 2. How large a sample must be taken in order to make the probability at least 0.99 that $|\bar{X}_n - \mu|$ will be less than 1 unit?
By the Chebyshev inequality with $t=1$ and $\sigma^2 = 4$,
$$\Pr(|\bar{X}_n - \mu| \ge 1) \le \frac{\sigma^2}{n} = \frac{4}{n}.$$
In order for $\Pr(|\bar{X}_n - \mu| < 1) \ge 0.99$, it suffices that
$$\frac{4}{n} \le 0.01 \iff n \ge 400.$$
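The sample-size calculation above follows a general pattern: Chebyshev gives $\Pr(|\bar{X}_n - \mu| \ge t) \le \sigma^2/(nt^2)$, so it is enough to pick $n$ with $\sigma^2/(nt^2) \le \alpha$. A minimal R sketch of that arithmetic, where the names `sigma`, `tol`, `alpha`, and `n_min` are illustrative choices rather than notation from the slides:

```r
sigma <- 2      # known standard deviation (from the example)
tol   <- 1      # required accuracy: |sample mean - mu| < tol
alpha <- 0.01   # allowed failure probability, i.e. 1 - 0.99

# Smallest n satisfying the Chebyshev condition sigma^2 / (n * tol^2) <= alpha.
n_min <- ceiling(sigma^2 / (tol^2 * alpha))

print(n_min)
```

This reproduces the bound $n \ge 400$. Because Chebyshev's inequality holds for every distribution with this variance, the resulting sample size is sufficient but may be much larger than necessary for a particular distribution.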
Example
- Suppose a fair coin is to be tossed $n$ times. For $i=1,\dots,n$, let $X_i=1$ if a head is obtained on the $i$th toss and let $X_i=0$ if a tail is obtained on the $i$th toss. Then the sample mean $\bar{X}_n$ is the proportion of heads that are obtained on the $n$ tosses. What is the number of times the coin must be tossed in order to make
$$\Pr(0.4 \le \bar{X}_n \le 0.6) \ge 0.70?$$
Solution: Let $T = \sum_{i=1}^{n} X_i$ denote the total number of heads obtained on the $n$ tosses; then $\bar{X}_n = T/n$. $T$ has a binomial distribution with parameters $n$ and $p=1/2$. So $E(T)=n/2$ and $\operatorname{Var}(T)=n/4$.
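1. Use the Chebyshev inequality:
$$\begin{aligned}
\Pr(0.4 \le \bar{X}_n \le 0.6) &= \Pr(0.4n \le T \le 0.6n) \\
&= \Pr\!\left(\left|T - \tfrac{n}{2}\right| \le 0.1n\right) = 1 - \Pr\!\left(\left|T - \tfrac{n}{2}\right| > 0.1n\right) \\
&\ge 1 - \frac{n}{4(0.1n)^2} = 1 - \frac{25}{n}.
\end{aligned}$$
When $n \ge 84$, $1 - \dfrac{25}{n} \ge 0.70$.
2. Use the binomial distribution: for $n = {?}$,
$$\Pr(0.4 \le \bar{X}_n \le 0.6) = \Pr(0.4n \le T \le 0.6n) > 0.70.$$
So ? tosses are sufficient.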
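The threshold in the Chebyshev step can be double-checked numerically. The R sketch below searches for the smallest $n$ whose Chebyshev lower bound $1 - 25/n$ reaches 0.70. The search range `1:200` and the variable names are arbitrary illustrative choices.

```r
n <- 1:200                        # candidate numbers of tosses (arbitrary search range)
cheb_lower <- 1 - 25 / n          # Chebyshev lower bound on Pr(0.4 <= Xbar_n <= 0.6)

# Smallest n for which the Chebyshev bound guarantees probability at least 0.70.
n_cheb <- min(n[cheb_lower >= 0.70])

print(n_cheb)
```

This confirms that the inequality alone guarantees the requirement from $n = 84$ onward. It says nothing about smaller $n$, which is why the exact binomial calculation is worth carrying out separately.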
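Download R

R code
# For each n, compute Pr(0.4n <= T <= 0.6n) where T ~ Binomial(n, 0.5).
for (n in 1:20) {
  start <- ceiling(0.4 * n)   # smallest number of heads that is >= 0.4n
  end <- floor(0.6 * n)       # largest number of heads that is <= 0.6n
  # Use the CDF difference: it is 0 when start > end (e.g. n = 1 or n = 3),
  # whereas start:end would count downward and sum the wrong terms.
  prob <- pbinom(end, n, 0.5) - pbinom(start - 1, n, 0.5)
  print(c(n, start, end, prob))
}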