Download - 283 Chi2 f13-Handout
-
8/13/2019 283 Chi2 f13-Handout
1/40
Chi-squared (2
) (1.10.5) and F-tests (9.5.2) for thevariance of a normal distribution
2 tests for goodness of fit and indepdendence
(3.5.43.5.5)
Prof. Tesler
Math 283
Nov. 13, 2013
Prof. Tesler 2 andFtests Math 283 / Nov. 13, 2013 1 / 40
-
8/13/2019 283 Chi2 f13-Handout
2/40
Tests of means vs. tests of variances
Datax1, . . . ,xn, sample mean x, sample var.sX2
Datay1, . . . ,ym, sample mean y, sample var.sY2
Tests for mean
One-sample tests:
H0 :=0 vs.H1 :0
test statistic:
z= x0/
n
ort= x0s/
n
(df= n1)
Tests for variance
One-sample test:
H0 :2 =
0
2 vs.H1 :2
0
2
test statistic: chi-squared
2 = (n 1)s2/0
2 (df= n1)
Two-sample tests:
H0 :X
= Y
vs. H1 :X
Y
test statistic:
z= xy
X2
n +
Y
2
m
ort= xy
sp1
n+ 1
m
(df =n+m 2)
Two-sample test:
H0 :X
2 =Y
2 vs.H1 :X
2
Y
2
test statistic:
F=sY
2/sX
2
(withm 1andn 1 d.f.)
Prof. Tesler 2 andFtests Math 283 / Nov. 13, 2013 2 / 40
-
8/13/2019 283 Chi2 f13-Handout
3/40
Application: The fine print in the Zandt-tests
One-sample z-test, H0: =0 vs.H1: 0
This assumes that you know the value of2
, say2
=02
.A2 test could be used to verify that the data is consistent withH0 :
2 =02 instead ofH1 :
20
2.
Two-sample z-test, H0: X= Yvs. H1: XY
This assumes that you know the values of X2
andY2
.Separate2 tests forX
2 andY
2 could be performed to verifyconsistency with the assumed values.
Two-sample t-test, H0: X= Yvs. H1: XY
This assumesX2
=Y2
(but doesnt assume that this commonvalue is known to you).AnF-test could be used to verify that the data is consistent withH0 :X
2 =Y
2 instead ofH1 :X2
Y2.
If the variances are unequal, Welchs t-test can be used instead ofthe regular two-samplet-test (Ewens & Grant pp. 127128).
Prof. Tesler 2 andFtests Math 283 / Nov. 13, 2013 3 / 40
-
8/13/2019 283 Chi2 f13-Handout
4/40
The2 (Chi-squared) distribution
Used for confidence intervals and hypothesis tests on the
unknown parameter2 of the normal distribution, based on the
test statistics2
(sample variance).
It has the same degrees of freedom as for the tdistribution.
Point these out on the graphs:The chi-squared distribution withkdegrees of freedom has
Range [0,)Mean =k
Mode 2 =k 2(fork 2, the pdf is maximum for2 =k 2)
2 = 1(fork= 1)
Median
k(1 2
9k)3
Between kandk 23 .Asymptotically decreases k 2
3 ask.
Variance 2 = 2k
PDF x(k/2)1ex/2
2k/2(k/2) : distrib. with shaper= k
2, rate= 1
2
Unlike zandt, the pdf for2
is NOT symmetric.Prof. Tesler 2 andFtests Math 283 / Nov. 13, 2013 4 / 40
-
8/13/2019 283 Chi2 f13-Handout
5/40
The graphs for 1and 2degrees of freedom are decreasing:
0 1 2 3 40
1
2
3
4
5
!2
1
pd
f
0 1 2 3 4 5 60
0.1
0.2
0.3
0.4
0.5
!2
2
pdf
mean @ = 1 mean @ = 2
mode @ 2
= 0 mode @
2= 0
median @ 2 = chi2inv(.5,1) = qchisq(.5,1) = 0.4549 median @ 2 =chi2inv(.5,2) =qchisq(.5,2) = 1.3863
The rest are hump shaped and skewed to the right:
0 2 4 6 80
0.05
0.1
0.15
0.2
0.25
!2
3
pdf
0 2 4 6 8 10 12 14 160
0.02
0.04
0.06
0.08
0.1
0.12
!2
8
pdf
mean @ = 3 mean @ = 8
mode @ 2
=
1 mode @ 2
=
6median @ 2 =chi2inv(.5,3) =qchisq(.5,3) = 2.3660 median @ 2 =chi2inv(.5,8) =qchisq(.5,8) = 7.3441
Prof. Tesler 2 andFtests Math 283 / Nov. 13, 2013 5 / 40
-
8/13/2019 283 Chi2 f13-Handout
6/40
2 (Chi-squared) distribution Cutoffs
0 2 4 6 8 10 12
0.
00
0.
05
0.
10
0.
15
!!2
5
pdf
!!2"",df
left!sided critical region
0 5 10 15
0.
00
0.
05
0.
10
0.
15
!!2
5
pdf
!!2
0.025,5 == 0.8312116
!!20.975,5 == 12.83250
2!sided acceptance region: df=5, "" == 0.05
Define2,dfas the number where the cdf (area leftof it) is: P(2
df 2
,df) =
Different notation thanz andt,df (areaon right) since pdf isnt symmetric.
Matlab R
20.025,5 = chi2inv(.025,5) = qchisq(.025,5) = 0.83122
0.975,5 = chi2inv(.975,5) = qchisq(.975,5) = 12.8325
chi2cdf(0.8312,5) = pchisq(0.8312,5) = 0.025
chi2cdf(12.8325,5)= pchisq(12.8325,5)= 0.975
chi2pdf(0.8312,5) = dchisq(0.8312,5) = 0.0665
chi2pdf(12.8325,5)= dchisq(12.8325,5)= 0
.0100
Prof. Tesler 2 andFtests Math 283 / Nov. 13, 2013 6 / 40
-
8/13/2019 283 Chi2 f13-Handout
7/40
Two-sided cutoff
0 5 10 15
0.
00
0.
05
0.
10
0.
15
!!25
pdf
!!2
0.025,5 == 0.8312116
!!20.975,5 == 12.83250
2!sided acceptance region: df=5, "" == 0.05
The mean, median, and mode are different, so it may not be
obvious what values of 2 are more consistent with
the nullH0: 2
= 10000vs. the alternative2
10000.Closer to the median of 2 is "more consistent" with H0.
For 2-sided hypothesis tests or confidence intervals with = 5%,
we still put 95%of the area in the middle and 2.5%at each end,
but the lower and upper cutoffs are determined separately instead
of each other.Prof. Tesler 2 andFtests Math 283 / Nov. 13, 2013 7 / 40
-
8/13/2019 283 Chi2 f13-Handout
8/40
Two-sided hypothesis test for varianceTest H0 :
2 = 10000 vs. H1: 2 10000 at sig. level = .05
(In general, replace 10000 by02; here,0 = 100)
Decision procedure
1 Get a samplex1, . . . ,xn.
650, 510, 470, 570, 410, 370 with n= 6
2 Calculate m= x1++xn
n
and s2 = 1
n1
ni=1(xi m)
2.
m= 496.67, s2 = 10666.67, s= 103.28
3 Calculate the test-statistic2 = (n1)s2
0 2 =n
i=1(xim)
2
0 2
2 = (n1)s2
0 2 =
(61)(10666.67)
10000 = 5.33
4 AcceptH0 if2
is between2
/2,n1 and2
1/2,n1.RejectH0 otherwise.
2.025,5 =.8312, 2.975,5 = 12.8325
Since2 = 5.33is between these, we accept H0.
(Or, there is insufficient evidence to reject2 = 10000.)
Prof. Tesler 2 andFtests Math 283 / Nov. 13, 2013 8 / 40
-
8/13/2019 283 Chi2 f13-Handout
9/40
P-value
We can do the same test with a P-value:
P(25 5.33) = 0.6231is the area left of 5.33 on the2 curve with
5 degrees of freedom.
Matlab: chi2cdf(5.33,5) R: pchisq(5.33,5)
Values at least as extreme as this (compared to the median) are
those at the 62.31th percentile or higher, OR at the 37.69th
percentile or lower:
P= 2 min(0.6231, 1 0.6231) = 2(0.3769) = 0.7539
P> (0.75> 0.05) so acceptH0.
Prof. Tesler 2 andFtests Math 283 / Nov. 13, 2013 9 / 40
-
8/13/2019 283 Chi2 f13-Handout
10/40
Two-sided 95%confidence interval for the variance
Continue with data 650, 510, 470, 570, 410, 370
which has n= 6, m= 496.67, s2 = 10666.67, s= 103.28.
Get bounds on2 in terms ofs2 for the two-sided test:
0.95 = P(2.025,5 < 2 < 2.975,5)
= P(0.8312< 2 < 12.8325)
= P
0.8312< (61)S2
2 < 12.8325
= P(61)S2
0.8312 > 2 >
(61)S2
12.8325
A two-sided 95%confidence interval for the variance 2 is(
6
1)S
2
12.8325 , (6
1)S
2
0.8312
= (4156.11, 64164.26)
A two-sided 95%confidence interval for is(61)S2
12.8325,
(61)S2
0.8312
= (64.47, 253.31)
Prof. Tesler 2 andFtests Math 283 / Nov. 13, 2013 10 / 40
-
8/13/2019 283 Chi2 f13-Handout
11/40
Properties of Chi-squared distribution
1 Definition of Chi-squared distribution:
LetZ1, . . . ,Zkbe independent standard normal variables.Let2k=Z1
2 + +Zk2.The pdf of the random variable2k is the chi-squared distribution
withkdegrees of freedom.
2 Pooling property: IfUandVare independent2 random
variables withqandrdegrees of freedom respectively, thenU+V
is a2 random variable withq+rdegrees of freedom.
3
Sample variance: PickX1, . . . ,X
n from a normal distributionN(, 2). It turns out thatn
i=1
(XiX)2
2 =
SS
2 =
(n 1)S2
2
has a2 distribution withdf =n 1, so we test on 2 = (n1)s2
02 .
Prof. Tesler 2 andFtests Math 283 / Nov. 13, 2013 11 / 40
-
8/13/2019 283 Chi2 f13-Handout
12/40
Fdistribution
LetUandVbe independent
2
random variables withqandrdegreesof freedom, respectively. Then the random variable
F=Fq,r= U/q
V/r
is called the Fdistribution withqandrdegrees of freedom.
Range [0,)Mean r
r2 ifr> 2
Mode r(q2)
q(r+2) ifq> 2
Median 1 ifq=r
> 1ifq>r
< 1ifq 4
PDF messy
Prof. Tesler 2 andFtests Math 283 / Nov. 13, 2013 12 / 40
-
8/13/2019 283 Chi2 f13-Handout
13/40
Fdistribution
0 1 2 30
0.5
1
1.5
2
2.5
0 1 2 30
0.2
0.4
0.6
0.8
1
0 1 2 30
0.2
0.4
0.6
0.8
1
0 1 2 30
0.5
1
1.5
2
F1,1
F1,2
F1,5F
1,10
F1,30
F1,100
F1,!
F3,1
F3,2
F3,5F
3,10
F3,30
F3,100
F3,!
F10,1
F10,2
F10,5
F10,10
F10,30
F10,100
F10,!
F30,1
F30,2
F30,5
F30,10
F30,30
F30,100
F30,!
Prof. Tesler 2 andFtests Math 283 / Nov. 13, 2013 13 / 40
-
8/13/2019 283 Chi2 f13-Handout
14/40
FdistributionDefineF,q,ras the number where P(F F,q,r) =(left-hand area)
! " # $ % & ' ( )!
!*"
!*#
!*$
!*%
!*&
!*'
+!*!#&,(,&
-!*"). +
!*.(&,(,&-'*)&$
#!/0121 4552674852 92:0;8
-
8/13/2019 283 Chi2 f13-Handout
15/40
Two-sided hypothesis test for F
Test at significance level = 5%:
H0 :X2 =Y2 vs. H1 :X2 Y2
Data for X: 650, 510, 470, 570, 410, 370
x= 496.67, sX
2 = 10666.67, sX
= 103.28, df = 6 1 = 5
Data for Y: 510, 420, 520, 360, 470, 530, 550, 490
y= 481.25, sY
2 = 4012.5, sY
= 63.3443, df = 8 1= 7
Test statistic: F=F7,5 =sY2/sX2 = 4012.510666.67 = 0.3762.
Since our test statistic 0.3762lies between the cutoffs
F.025,7,5 = 0.1892and F.975,7,5 = 6.8531, we acceptH0 / reject H1.
Prof. Tesler 2 andFtests Math 283 / Nov. 13, 2013 15 / 40
-
8/13/2019 283 Chi2 f13-Handout
16/40
P-values
The CDF is P(F7,5 0.3762) = 0.1174
Matlab: fcdf(.3762,7,5)
R: pf(.3762,7,5)
To make it two sided,
P= 2 min(0.1174, 1 0.1174) = 2(0.1174) = 0.2348
SinceP> (0.2348> 0.05), we accept the null hypothesis.
Prof. Tesler 2 andFtests Math 283 / Nov. 13, 2013 16 / 40
-
8/13/2019 283 Chi2 f13-Handout
17/40
Fstatistic to compare variances (two-sample data)Theoretical setup
First sample: x1, . . . ,xn
V=(n 1)s
X2
X
2 =
ni=1
(xi x)2
X
2 withn 1 d.f.
Second sample: y1, . . . ,ym
U=
(m 1)sY
2
Y
2 =
m
j=1
(yiy)2
Y
2 withm 1d.f.
Assuming the null hypothesisX
2 =Y
2, the variances cancel:
F=
U/(m 1)
V/(n 1) =
sY
2
sX2 =m
i=1(yiy)2
/(m 1)
nj=1(xix)
2
/(n 1)
withm 1andn 1 degrees of freedom.
For a null hypothesisX
2 =CY
2 whereC> 0is constant, use
F=CsY2
/sX2
instead.Prof. Tesler 2 andFtests Math 283 / Nov. 13, 2013 17 / 40
-
8/13/2019 283 Chi2 f13-Handout
18/40
k-sample experiments
ANOVA(Analysis of Variance) is a procedure to compare the
meansofk-sample data (analagous to two-sample data, but for k
independent sets of data).
It involves theFdistribution with a formula for Fthat takes into
account allksamples instead of just two samples.
Prof. Tesler 2 andFtests Math 283 / Nov. 13, 2013 18 / 40
-
8/13/2019 283 Chi2 f13-Handout
19/40
2 tests for goodness of fit and
independence (3.5.43.5.5)
Prof. Tesler 2 andFtests Math 283 / Nov. 13, 2013 19 / 40
-
8/13/2019 283 Chi2 f13-Handout
20/40
Multinomial test
Consider ak-sided die with faces 1, 2, . . . , k.
We want to simultaneously test that the probabilities p1,p2, . . . ,pkof rolling 1, 2, . . . , kare specified values.
To test if a 6-sided die is fair,
H0: (p1, . . . ,p6) = (1/6, . . . , 1/6)
H1: At least one pi 1/6
Decision rule is based counting # 1s, 2s, etc. onnindependent
rolls of the die.
For the fair coin problem, the exact distribution was binomial, andwe approximated it with a normal distribution.
For this problem, the exact distribution is multinomial.
We will combine the separate counts of 1, 2, . . .into a single test
statistic whose distribution is approximately a 2 distribution.Prof. Tesler 2 andFtests Math 283 / Nov. 13, 2013 20 / 40
-
8/13/2019 283 Chi2 f13-Handout
21/40
Goodness of fit tests for Mendels experiments
In Mendels pea plant experiments,
yellow seeds (Y) are dominant and green (y) recessive;round seeds (R) are dominant and wrinkled (r) are recessive.
Consider the phenotypes of the offspring in a dihybrid cross
YyRr YyRr:
Expected ObservedType fraction number
yellow & round 9/16 315yellow & wrinkled 3/16 101green & round 3/16 108
green & wrinkled 1/16 32Total: n=556
Hypothesis test:
H0: (p1,p2,p3,p4) = ( 9
16, 3
16, 3
16, 1
16)
H1: At least one pi disagreesProf. Tesler 2 andFtests Math 283 / Nov. 13, 2013 21 / 40
-
8/13/2019 283 Chi2 f13-Handout
22/40
Does the data fit the expected distribution?
Expected Observed
Type fraction numberyellow & round 9/16 315yellow & wrinkled 3/16 101green & round 3/16 108green & wrinkled 1/16 32
Total: n=556
The observed number of yellow & round plants is O= 315.
(Dont confuse the letter Owith the number 0.)
The expected number isE= (9/16) 556= 312.75.
The goodness of fit test requires that we convert all the expected
proportions into expected numbers.
Prof. Tesler 2 andFtests Math 283 / Nov. 13, 2013 22 / 40
-
8/13/2019 283 Chi2 f13-Handout
23/40
Goodness of fit test
Observed number Expected number
Type O E OE (OE)2/E
yellow & round 315 (9/16)556= 312.75 2.25 0.0161871yellow & wrinkled 101 (3/16)556= 104.25 3.25 0.1013189green & round 108 (3/16)556= 104.25 3.75 0.1348921green & wrinkled 32 (1/16)556= 34.75 2.75 0.2176259
Total 556 556 0 0.4700240
k= 4categories givek 1 = 3degrees of freedom.
(TheOandEcolumns both total 556, so the OEcolumn totals
0; thus, any 3 of the(OE)s dictate the fourth.)
The test statistic is the total of the last column, 23
= 0.4700240.
The general formula is2k1 =k
i=1
(OiEi)2
Ei.
Warning: Technically, that formula only has an approximate
chi-squared distribution. When E 5in all categories, the
approximation is pretty good.Prof. Tesler 2 andFtests Math 283 / Nov. 13, 2013 23 / 40
-
8/13/2019 283 Chi2 f13-Handout
24/40
Goodness of fit testSmaller values of2 indicate better agreement between theOand
Evalues (so supportH0 better). Larger values supportH1 better.
Its a one-sided test.
!!
!
!
!
!
!
!
!
!
!!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!!
!!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!!
!!
!!
!!
!!
!!
!!
!!
!!
!!
!!
! !!
! !!
! ! !! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! !!
!
0 5 10 15
0.
00
0.
05
0.
10
0.
15
0.
20
0.
25
Supports H0better
Supports H1better
Observed !2
!2
pdf
TheP-value is the probability, under H0, of a test statistic that
supports H1 as well as or better than the observed value:
P= P(23 0.4700240) =.9254259
Matlab:1-chi2cdf(.4700240,3)
R:1-pchisq(.4700240,3)Prof. Tesler 2 andFtests Math 283 / Nov. 13, 2013 24 / 40
-
8/13/2019 283 Chi2 f13-Handout
25/40
Goodness of fit test
P=.9254259is not too extreme.
It means that if H0 is true and the experiment is repeated a lot,
about 7.5%of the time, a 23
value supportingH0 better (lower
values of23
) will be obtained, and about 92.5%of the time, values
supporting H1 better (higher values of 2
3) will be obtained.
Prof. Tesler 2 andFtests Math 283 / Nov. 13, 2013 25 / 40
-
8/13/2019 283 Chi2 f13-Handout
26/40
Ronald Fisher (18901962)
He made important contributions to both statistics and genetics.
Connection: he invented statistical methods while working on
genetics problems.
Our way of using the normal, Student t,2, andFdistributions in
the same framework, plus ANOVA, is due to him.
In genetics, he reconciled continuous variations (heights andweights) with Mendelian genetics (discrete traits), and developed
much of population genetics.
Prof. Tesler 2 andFtests Math 283 / Nov. 13, 2013 26 / 40
-
8/13/2019 283 Chi2 f13-Handout
27/40
Did Mendel fudge his data?
Forindependent experiments,the values of2 may be pooled by
adding the2
values and adding the degrees of freedom.
Fisher pooled the data from Mendels experiments and got
2 = 41.6056with 84 degrees of freedom.
Assuming Mendels laws are true, how often would we get 284
supporting H0/H1 better than this?
SupportH0 better:
P(284 41.6056) = 0.00002873
SupportH1 better:
P-valueP=P(2
84 41.6056) = 1 0.00002873=.99997127.
So if Mendels laws hold and 1 million researchers independently
conducted the same experiments as Mendel, about 29 of them
would get data with as little or even less variation than Mendel had.
Prof. Tesler 2 andFtests Math 283 / Nov. 13, 2013 27 / 40
-
8/13/2019 283 Chi2 f13-Handout
28/40
Did Mendel fudge his data?
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!
!!
!!
!!
!
!!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!!
!
!!
!
!!!
!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!
!!!
!!!!
!!
!!!
!!
!!
!!!
!!
!!
!!
!!
!!
!!!
!!
!!
!!!
!!
!
!!
!!!
!!!
!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
0 50 100 150
0.
00
0
0.
010
0.
020
0.
030
Supports H0better
Prob. =0.00002873
Supports H1better
Prob. =0.99997127
!84
2=41.6056
!84
2
pd
f
Prof. Tesler 2 andFtests Math 283 / Nov. 13, 2013 28 / 40
-
8/13/2019 283 Chi2 f13-Handout
29/40
Did Mendel fudge his data?
Based on this and similar tests, Fisher believed that something was
fishy with Mendels data:
The values are too good in the sense that they are too close to
what was expected.
At the same time, they are bad in the sense that there is too little
random variation.
Some people have accused Mendel of faking data.
Others speculate that he only reported his best data.
Other people defend Mendel by speculating on biologicalexplanations for why his results would be better than expected.
All pro and con arguments have later been rebutted by someone
else.
Prof. Tesler 2 andFtests Math 283 / Nov. 13, 2013 29 / 40
-
8/13/2019 283 Chi2 f13-Handout
30/40
Tests of independence (contingency tables)
A study in 1899 examined 6800 German men to see if hair color and
eye color are related.
Observed countsO: Hair color
Brown Black Fair Red Total
Eye Brown 438 288 115 16 857
Color Gray/Green 1387 746 946 53 3132
Blue 807 189 1768 47 2811Total 2632 1223 2829 116 6800
Hypothesis test (at= 0.05)
H0: eye color and hair color are independent, vs.
H1: eye color and hair color are correlated
Meaning of independence
For all eye colors xand all hair colors y:
P(eye color=xand hair color=y) =P(eye color=x)
P(hair color=y)
Prof. Tesler 2 andFtests Math 283 / Nov. 13, 2013 30 / 40
-
8/13/2019 283 Chi2 f13-Handout
31/40
ComputingEtable
Hypothesis test (at= 0.05)
H0: eye color and hair color are independent, vs.H1: eye color and hair color are correlated
The fraction of people with red hair is 116/6800.
The fraction with blue eyes is 2811/6800.
Use these as point estimates: P(hair color=red) 116/6800andP(eye color=blue) 2811/6800.Under the null hypothesis, the fraction with red hair and blue eyes
would be (116 2811)/68002.The expected number of people with red hair and blue eyes is
6800(116 2811)/68002 = (116 2811)/6800= 47.95.(Row total times column total divided by grand total.)
ComputeEthis way for all combinations of hair and eye color.
As long asE 5in every cell (here it is) and the data is normally
distributed (an assumption), the 2
test is valid.Prof. Tesler 2 andFtests Math 283 / Nov. 13, 2013 31 / 40
-
8/13/2019 283 Chi2 f13-Handout
32/40
ComputingEandOEtables
Expected countsE: Hair color
Brown Black Fair Red
Eye Brown 331.71 154.13 356.54 14.62
Color Gray/Green 1212.27 563.30 1303.00 53.43
Blue 1088.02 505.57 1169.46 47.95
In each position, computeOE.
For red hair and blue eyes, this is OE= 47 47.95= .95:
OE: Hair color
Brown Black Fair Red
Eye Brown 106.29 133.87 241.54 1.38
Color Gray/Green 174.73 182.70 357.00 0.43Blue 281.02 316.57 598.54 0.95
Note all the row and column sums in the OEtable are 0, so if we hid
the last row and column, we could deduce what they are. Thus, this
3 4table has(3 1)(4 1) = 6degrees of freedom.Prof. Tesler 2 andFtests Math 283 / Nov. 13, 2013 32 / 40
-
8/13/2019 283 Chi2 f13-Handout
33/40
Computing test statistic2
Compute(OE)2/Ein each position.
For red hair and blue eyes, this is (.95)2
/47.95= 0.0189.(You could go directly to this computation after the Ecomputation,
without doingOEfirst.)
(OE)2/E: Hair color
Brown Black Fair RedEye Brown 34.0590 116.2632 163.6301 0.1304
Color Gray/Green 25.1852 59.2571 97.8139 0.0034
Blue 72.5845 198.2220 306.3398 0.0189
Add all twelve of these to get
2 = 34.0590+ +0.0189= 1073.5076
There are 6 degrees of freedom, so 26
= 1073.5076.
Prof. Tesler 2 andFtests Math 283 / Nov. 13, 2013 33 / 40
-
8/13/2019 283 Chi2 f13-Handout
34/40
Performing the test of independence
2 would be 0 if the traits were truly independent.
Smaller values supportH0 better (traits independent).
Larger values supportH1 better (traits correlated).Its a one-sided test.
At the 0.05level of significance, we reject H0 if
26 2
0.95,6 = 12.5916
Indeed, 1073.5076> 12.5916so we reject H0 and conclude thathair color and eye color are linked in this data.
This doesnt prove that a particular hair color causes one to have
a particular eye color, or vice-versa; it just says theres a
correlation in this data.Using P-values: P=P(2
6 1073.5076) 1.1 10228
soP = 0.05and we rejectH0.Matlab: cant compute this (givesP= 0).
R: pchisq(1073.5076,6,lower.tail=FALSE)
Prof. Tesler 2 andFtests Math 283 / Nov. 13, 2013 34 / 40
-
8/13/2019 283 Chi2 f13-Handout
35/40
Performing the test of independence
!!!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!!
!! ! ! ! !
!!
!!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!!
!
!!
!!
!!
!!
!!
!!
!!
!!
!!
!!
!!
!!
!!
!! !
!! !
! ! ! ! ! ! ! ! ! ! ! ! ! ! !! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! !!!!! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! !!!
0 5 10 15 20 25 30
0
.00
0.
06
0.
12
!6
2
pdf
!!!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!!!!!!!!!!!!!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!
!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
0 200 400 600 800 1000 1200 1400
0.
00
0.
06
0.
12
Supports H0better
Prob. = 1 !"
Supports H1better
Prob. = "= 1.1e!228
!6
2= 1073.508
!6
2
pdf
Prof. Tesler 2 andFtests Math 283 / Nov. 13, 2013 35 / 40
-
8/13/2019 283 Chi2 f13-Handout
36/40
Mendels pea plants revisited: Are lociYandRlinked?
We will use the same data as in the goodness-of-fit test but for a
different purpose. Consider the phenotypes of the offspring in adihybrid crossYyRr YyRr:
Observed countsO: Seed Shape
Round (R) Wrinkled (r) Total
Seed Yellow (Y) 315 101 416
Color Green (y) 108 32 140
Total 423 133 556
Hypothesis test (at= 0.05)
H0: Seed color and seed shape are independent, vs.
H1: Seed color and seed shape are correlated
Prof. Tesler 2 andFtests Math 283 / Nov. 13, 2013 36 / 40
-
8/13/2019 283 Chi2 f13-Handout
37/40
Mendels pea plants revisited: Are lociYandRlinked?
Seed Color Seed ShapeRound (R) Wrinkled (r) Total
O: Yellow (Y) 315 101 416(Observed #) Green (y) 108 32 140
Total 423 133 556
E: Yellow (Y) 316.4892 99.5108 416
(Expected #) Green (y) 106.5108 33.4892 140Total 423 133 556
OE: Yellow (Y) -1.4892 1.4982 0(Deviation) Green (y) 1.4892 -1.4892 0
Total 0 0 0
(OE)2/E: Yellow (Y) 0.0070 0.0223 0.0293(2 contrib.) Green (y) 0.0208 0.0662 0.0870
Total 0.0278 0.0885 0.1163
Prof. Tesler 2 andFtests Math 283 / Nov. 13, 2013 37 / 40
?
-
8/13/2019 283 Chi2 f13-Handout
38/40
Mendels pea plants revisited: Are lociYandRlinked?
Using 2 as the test statistic:
df = (2 1)(2 1) = 1
21
=.0070+.0223+.0208+.0662= 0.1163
cutoff: 20.95,1 = chi2inv(.95,1)= qchisq(.95,1)= 3.8415
0.1163< 3.8415so its not significant
Using P-values:
P=P(21
> 0.1163) = 1-chi2cdf(0.1163,1)= 0.7331
1-pchisq(0.1163,1)
P> 0.05so AcceptH0 (genes not linked)
Prof. Tesler 2 andFtests Math 283 / Nov. 13, 2013 38 / 40
C i f h
-
8/13/2019 283 Chi2 f13-Handout
39/40
Comparison of the two tests
At fertilization, if genesRandYare not linked, then in an
RrYyRrYycross, the expected proportions areRY:Ry : rY: ry= 1 : 1 : 1 : 1.
If linked, it would be different.
Some genotypes may not survive to the points at which the
phenotype counts are made; e.g., hypothetically, 40%of
individuals withRrmight not be born, might die before reproducing
(affecting multigenerational experiments), etc.This would change the ratio of
RR :Rr: rr from 1 : 2 : 1 to 1 : 1.2 : 1= 5 : 6 : 5,
and round:wrinkled from 3 : 1to 2.2 : 1= 11 : 5.
Prof. Tesler 2 andFtests Math 283 / Nov. 13, 2013 39 / 40
C i f h
-
8/13/2019 283 Chi2 f13-Handout
40/40
Comparison of the two tests
The goodness-of-fit test assumed all genotypes are equally viable.
Whether the genes are linked or not should be a separate matter.
If you know the yellow:green and round:wrinkled viability ratios,you can use the goodness-of-fit test on 4 phenotypes with 3
degrees of freedom by adjusting the proportions.
If you dont know these viability ratios, you can estimate the ratios
from data via contingency tables, at the cost of dropping to 1degree of freedom.