Post on 21-Dec-2015
Lecture 2
The analysis of cross-tabulations
Cross-tabulations
• Tables of countable entities or frequencies
• Made to analyze the association, relationship, or connection between two variables
• This association is difficult to describe statistically
• The null hypothesis “There is no association between the two variables” can be tested
• Analysis of cross-tabulations with large samples
Delivery and housing tenure
Housing tenure Preterm Term Total
Owner-occupier 50 849 899
Council tenant 29 229 258
Private tenant 11 164 175
Lives with parents 6 66 72
Other 3 36 39
Total 99 1344 1443
Delivery and housing tenure
• Expected numbers if there were no association between delivery and housing tenure
Housing tenure Pre Term Total
Owner-occupier - - 899
Council tenant - - 258
Private tenant - - 175
Lives with parents - - 72
Other - - 39
Total 99 1344 1443
Delivery and housing tenure: if the null hypothesis is true
• 899/1443 = 62.3% are owner-occupiers.
• 62.3% of the preterm deliveries should be to owner-occupiers: 99*899/1443 = 61.7
Housing tenure Pre Term Total
Owner-occupier - - 899
Council tenant - - 258
Private tenant - - 175
Lives with parents - - 72
Other - - 39
Total 99 1344 1443
Delivery and housing tenure: if the null hypothesis is true
• 899/1443 = 62.3% are owner-occupiers.
• 62.3% of the term deliveries should be to owner-occupiers: 1344*899/1443 = 837.3
Housing tenure Pre Term Total
Owner-occupier 61.7 - 899
Council tenant - - 258
Private tenant - - 175
Lives with parents - - 72
Other - - 39
Total 99 1344 1443
Delivery and housing tenure: if the null hypothesis is true
• 258/1443 = 17.9% are council tenants.
• 17.9% of the preterm deliveries should be to council tenants: 99*258/1443 = 17.7
Housing tenure Pre Term Total
Owner-occupier 61.7 837.3 899
Council tenant - - 258
Private tenant - - 175
Lives with parents - - 72
Other - - 39
Total 99 1344 1443
Delivery and housing tenure: if the null hypothesis is true
• In general
Housing tenure Pre Term Total
Owner-occupier 61.7 837.3 899
Council tenant 17.7 240.3 258
Private tenant 12.0 163.0 175
Lives with parents 4.9 67.1 72
Other 2.7 36.3 39
Total 99 1344 1443
$$E = \frac{\text{row total} \times \text{column total}}{\text{grand total}}$$
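The rule "row total times column total, divided by grand total" can be sketched in a few lines of Python. This is only an illustration: the dictionary layout and the function name `expected_counts` are assumptions made for the example, not part of the lecture.

```python
# Observed counts (preterm, term) per housing tenure, taken from the table.
observed = {
    "Owner-occupier": (50, 849),
    "Council tenant": (29, 229),
    "Private tenant": (11, 164),
    "Lives with parents": (6, 66),
    "Other": (3, 36),
}

def expected_counts(table):
    """Expected counts under no association: row total * column total / grand total."""
    rows = list(table.values())
    col_totals = [sum(col) for col in zip(*rows)]
    grand_total = sum(col_totals)
    return {
        label: tuple(sum(row) * c / grand_total for c in col_totals)
        for label, row in table.items()
    }

expected = expected_counts(observed)
for label, (pre, term) in expected.items():
    print(f"{label:20s} {pre:6.1f} {term:7.1f}")
```

Because every expected count is built from the margins, each row of expected counts adds up to the observed row total.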
Delivery and housing tenure: if the null hypothesis is true
• In general
Housing tenure Pre Term Total
Owner-occupier 50(61.7) 849(837.3) 899
Council tenant 29(17.7) 229(240.3) 258
Private tenant 11(12.0) 164(163.0) 175
Lives with parents 6(4.9) 66(67.1) 72
Other 3(2.7) 36(36.3) 39
Total 99 1344 1443
$$E = \frac{\text{row total} \times \text{column total}}{\text{grand total}}$$
Delivery and housing tenure: if the null hypothesis is true
Housing tenure Pre Term Total
Owner-occupier 50(61.7) 849(837.3) 899
Council tenant 29(17.7) 229(240.3) 258
Private tenant 11(12.0) 164(163.0) 175
Lives with parents 6(4.9) 66(67.1) 72
Other 3(2.7) 36(36.3) 39
Total 99 1344 1443
$$\chi^2 = \sum_{\text{all cells}} \frac{(O - E)^2}{E} = 10.5$$
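As a rough check of the sum over all cells, the statistic can be computed directly from the observed counts. The function name `chi_squared` and the list-of-rows layout are assumptions for this sketch; expected counts are computed from the margins rather than copied from the rounded values in the table.

```python
def chi_squared(rows):
    """Pearson chi-squared statistic for a table given as a list of rows of counts."""
    row_totals = [sum(r) for r in rows]
    col_totals = [sum(c) for c in zip(*rows)]
    n = sum(row_totals)
    stat = 0.0
    for row, rt in zip(rows, row_totals):
        for o, ct in zip(row, col_totals):
            e = rt * ct / n          # expected count for this cell
            stat += (o - e) ** 2 / e  # contribution (O - E)^2 / E
    return stat

# Rows: owner-occupier, council tenant, private tenant, lives with parents, other.
housing = [[50, 849], [29, 229], [11, 164], [6, 66], [3, 36]]
stat = chi_squared(housing)
dof = (len(housing) - 1) * (len(housing[0]) - 1)
print(f"chi-squared = {stat:.2f} on {dof} degrees of freedom")
```

Most of the total comes from the council tenant row, where 29 preterm deliveries were observed against about 17.7 expected.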
Delivery and housing tenure: test for association
• If the numbers are large, this statistic follows a chi-squared distribution.
• The degrees of freedom are (r-1)(c-1) = (5-1)(2-1) = 4
• From Table 13.3, 0.01 < p < 0.05: if delivery and housing tenure were not associated, a value this large would arise in only 1 to 5% of samples
$$\chi^2 = \sum_{\text{all cells}} \frac{(O - E)^2}{E} = 10.5$$
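Instead of reading a printed table, the upper-tail probability can be computed. The sketch below uses only the standard library; `chi2_sf` is a hypothetical helper name, and it relies on the closed-form recurrence for the chi-squared tail with whole-number degrees of freedom.

```python
import math

def chi2_sf(x, df):
    """P(X >= x) for a chi-squared variable with integer df >= 1."""
    z = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)              # df = 2: tail is exp(-x/2)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))   # df = 1: tail is erfc(sqrt(x/2))
    # Each step raises the degrees of freedom by 2.
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q

p = chi2_sf(10.5, 4)
print(f"p = {p:.3f}")
```

For 10.5 on 4 degrees of freedom this gives a p-value between 0.01 and 0.05, in line with the table lookup.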
Chi Squared Table
Delivery and housing tenure: if the null hypothesis is true
• It is difficult to say anything about the nature of the association.
Housing tenure Pre Term Total
Owner-occupier 50(61.7) 849(837.3) 899
Council tenant 29(17.7) 229(240.3) 258
Private tenant 11(12.0) 164(163.0) 175
Lives with parents 6(4.9) 66(67.1) 72
Other 3(2.7) 36(36.3) 39
Total 99 1344 1443
2 by 2 tables
Bronchitis No bronchitis Total
Cough 26 44 70
No Cough 247 1002 1249
Total 273 1046 1319
2 by 2 tables
Bronchitis No bronchitis Total
Cough 26 (14.49) 44 (55.51) 70
No Cough 247 (258.51) 1002 (990.49) 1249
Total 273 1046 1319
$$\chi^2 = \sum_{\text{all cells}} \frac{(O - E)^2}{E} = 12.2$$
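For a 2 by 2 table the same statistic can also be obtained from a shortcut, n(ad - bc)^2 / (r1 r2 c1 c2), a standard identity that does not appear on the slides. The sketch below applies it to the cough and bronchitis table; the function name is an assumption for the example.

```python
def chi_squared_2x2(a, b, c, d):
    """Chi-squared for the table [[a, b], [c, d]] via n(ad - bc)^2 / (r1 r2 c1 c2)."""
    n = a + b + c + d
    r1, r2 = a + b, c + d   # row totals
    c1, c2 = a + c, b + d   # column totals
    return n * (a * d - b * c) ** 2 / (r1 * r2 * c1 * c2)

stat = chi_squared_2x2(26, 44, 247, 1002)
print(f"chi-squared = {stat:.2f} on 1 degree of freedom")
```

The result agrees with the cell-by-cell sum to rounding, which makes the shortcut a convenient cross-check.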
Chi Squared Table
Chi-squared test for small samples
• Expected values
– more than 80% must be > 5
– all must be > 1
Streptomycin Control Total
Improvement 13 (8.4) 5 (9.6) 18
Deterioration 2 (4.2) 7 (4.8) 9
Death 0 (2.3) 5 (2.7) 5
Total 15 17 32
Chi-squared test for small samples
• Expected values
– more than 80% must be > 5
– all must be > 1
Streptomycin Control Total
Improvement 13 (8.4) 5 (9.6) 18
Deterioration and death 2 (6.6) 12 (7.4) 14
Total 15 17 32
$$\chi^2 = \sum_{\text{all cells}} \frac{(O - E)^2}{E} = 10.8$$
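The rule on expected values, and the effect of merging the two sparse rows, can be checked with a small sketch. The function names are assumptions for the example, and the 80% threshold is applied as a strict "more than", as written on the slide.

```python
def expected_counts(rows):
    """Expected counts from the margins: row total * column total / grand total."""
    row_totals = [sum(r) for r in rows]
    col_totals = [sum(c) for c in zip(*rows)]
    n = sum(row_totals)
    return [[rt * ct / n for ct in col_totals] for rt in row_totals]

def large_sample_rule_ok(rows):
    """True if more than 80% of expected counts exceed 5 and all exceed 1."""
    cells = [e for row in expected_counts(rows) for e in row]
    share_above_5 = sum(e > 5 for e in cells) / len(cells)
    return share_above_5 > 0.8 and min(cells) > 1

three_outcomes = [[13, 5], [2, 7], [0, 5]]   # improvement, deterioration, death
merged = [[13, 5], [2, 12]]                  # deterioration and death combined

print(large_sample_rule_ok(three_outcomes))  # False: only 2 of 6 expected counts exceed 5
print(large_sample_rule_ok(merged))          # True: all 4 expected counts exceed 5
```

Merging categories only makes sense when the combined category is still meaningful, as "deterioration and death" is here.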
Fisher’s exact test
• An example
S D T
A 3 1 4
B 2 2 4
5 3 8
S D T
A 4 0 4
B 1 3 4
5 3 8
S D T
A 1 3 4
B 4 0 4
5 3 8
S D T
A 2 2 4
B 3 1 4
5 3 8
Fisher’s exact test
• Survivors:
– a, b, c, d, e
• Deaths:
– f, g, h
• Table 1 can be made in 5 ways
• Table 2: 30
• Table 3: 30
• Table 4: 5
• 70 ways in total
• The probability of finding table 2 or a more extreme one is:
$$\frac{5}{70} + \frac{30}{70} = \frac{1}{2}$$
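The counting argument can be reproduced by brute force: list every way of choosing the four members of group A from the eight people and tally how many survivors land in A. This is an illustrative sketch; the variable names are assumptions, and the tables are keyed by the number of survivors in group A rather than by table number.

```python
from collections import Counter
from itertools import combinations

people = "abcdefgh"
survivors = set("abcde")  # f, g and h are the deaths

# Each choice of 4 people for group A fixes the whole table (the rest form group B).
ways = Counter(
    len(survivors.intersection(group_a)) for group_a in combinations(people, 4)
)
total = sum(ways.values())

for n_surv in sorted(ways, reverse=True):
    print(f"A has {n_surv} survivors: {ways[n_surv]} ways")
print("total:", total)

# Probability of A having 3 or 4 survivors, i.e. 30/70 + 5/70.
tail = (ways[3] + ways[4]) / total
print("tail probability:", tail)
```

The tally gives 5, 30, 30 and 5 ways for 4, 3, 2 and 1 survivors in group A, 70 in total.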
Fisher’s exact test
S D T
A f11 f12 r1
B f21 f22 r2
c1 c2 n
S D T
A 3 1 4
B 2 2 4
5 3 8
$$p = \frac{r_1!\, r_2!\, c_1!\, c_2!}{n!\, f_{11}!\, f_{12}!\, f_{21}!\, f_{22}!} = \frac{4!\, 4!\, 5!\, 3!}{8!\, 3!\, 1!\, 2!\, 2!} = 0.4286$$
S D T
A f11 f12 r1
B f21 f22 r2
c1 c2 n
S D T
A 4 0 4
B 1 3 4
5 3 8
$$p = \frac{r_1!\, r_2!\, c_1!\, c_2!}{n!\, f_{11}!\, f_{12}!\, f_{21}!\, f_{22}!} = \frac{4!\, 4!\, 5!\, 3!}{8!\, 4!\, 0!\, 1!\, 3!} = 0.0714$$
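The factorial formula translates directly into code. The sketch below uses exact fractions so that the results can be compared with 30/70 and 5/70; the function name `table_probability` is an assumption for the example.

```python
from fractions import Fraction
from math import factorial

def table_probability(f11, f12, f21, f22):
    """Probability of one 2x2 table with fixed margins: r1! r2! c1! c2! / (n! f11! f12! f21! f22!)."""
    r1, r2 = f11 + f12, f21 + f22
    c1, c2 = f11 + f21, f12 + f22
    n = r1 + r2
    numerator = factorial(r1) * factorial(r2) * factorial(c1) * factorial(c2)
    denominator = (factorial(n) * factorial(f11) * factorial(f12)
                   * factorial(f21) * factorial(f22))
    return Fraction(numerator, denominator)

p_31 = table_probability(3, 1, 2, 2)  # A: 3 survivors, 1 death
p_40 = table_probability(4, 0, 1, 3)  # A: 4 survivors, 0 deaths
print(p_31, float(p_31))
print(p_40, float(p_40))
```

The two values are 3/7 and 1/14, that is 30/70 and 5/70, matching the counts of ways obtained by enumeration.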
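Yates’ correction for 2x2
• Yates’ correction:
$$\chi^2 = \sum_{\text{all cells}} \frac{\left(|O - E| - \tfrac{1}{2}\right)^2}{E}$$
Streptomycin Control Total
Improvement 13 (8.4) 5 (9.6) 18
Deterioration and death 2 (6.6) 12 (7.4) 14
Total 15 17 32
With Yates’ correction:
$$\chi^2 = \sum_{\text{all cells}} \frac{\left(|O - E| - \tfrac{1}{2}\right)^2}{E} = 8.6$$
Without correction:
$$\chi^2 = \sum_{\text{all cells}} \frac{(O - E)^2}{E} = 10.8$$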
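A minimal sketch of the corrected and uncorrected statistics for the streptomycin table follows. The function name and the `yates` flag are assumptions for the example. Expected counts are computed exactly from the margins, so the results (about 10.6 and 8.4) come out slightly below the slide values, which appear to be based on expected counts rounded to one decimal.

```python
def chi_squared_2x2(a, b, c, d, yates=False):
    """Chi-squared for [[a, b], [c, d]]; with yates=True, 1/2 is subtracted from |O - E|."""
    r1, r2 = a + b, c + d
    c1, c2 = a + c, b + d
    n = r1 + r2
    stat = 0.0
    for o, rt, ct in ((a, r1, c1), (b, r1, c2), (c, r2, c1), (d, r2, c2)):
        e = rt * ct / n
        diff = abs(o - e)
        if yates:
            diff -= 0.5
        stat += diff ** 2 / e
    return stat

plain = chi_squared_2x2(13, 5, 2, 12)
corrected = chi_squared_2x2(13, 5, 2, 12, yates=True)
print(f"uncorrected: {plain:.2f}, with Yates' correction: {corrected:.2f}")
```

The correction always pulls the statistic towards zero, making the test more conservative.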
Chi Squared Table
Yates’ correction for 2x2
• Table 13.7
– Fisher: p = 0.0015
– ‘Two-sided’: p = 0.0029
– χ2: p = 0.0011
– Yates: p = 0.0037
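These four p-values can be reproduced for the 2 by 2 streptomycin table (13, 5 / 2, 12) with the standard library alone. The sketch assumes that the Fisher value is the one-sided tail and that the 'two-sided' value is simply twice that; the helper names are hypothetical, and the chi-squared statistics use the usual 2 by 2 shortcut formulas.

```python
from math import comb, erfc, sqrt

def fisher_one_sided(a, b, c, d):
    """P(top-left cell >= a) with all margins fixed (hypergeometric upper tail)."""
    r1, r2 = a + b, c + d
    c1 = a + c
    n = r1 + r2
    tail = sum(comb(r1, k) * comb(r2, c1 - k) for k in range(a, min(r1, c1) + 1))
    return tail / comb(n, c1)

def chi2_p_1df(stat):
    """Upper-tail probability of a chi-squared statistic on 1 degree of freedom."""
    return erfc(sqrt(stat / 2))

a, b, c, d = 13, 5, 2, 12
n = a + b + c + d
margins = (a + b) * (c + d) * (a + c) * (b + d)
chi2_plain = n * (a * d - b * c) ** 2 / margins
chi2_yates = n * (abs(a * d - b * c) - n / 2) ** 2 / margins

p_fisher = fisher_one_sided(a, b, c, d)
print(f"Fisher one-sided: {p_fisher:.6f}")
print(f"Fisher doubled:   {2 * p_fisher:.4f}")
print(f"chi-squared:      {chi2_p_1df(chi2_plain):.5f}")
print(f"Yates:            {chi2_p_1df(chi2_yates):.4f}")
```

The Yates p-value lies much closer to the doubled Fisher value than the uncorrected one does, which is the point of the correction.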
Odds and odds ratios
• Odds, where p is the probability of an event:
$$o = \frac{p}{1 - p}$$
• Log odds / logit:
$$\ln(o) = \ln\left(\frac{p}{1 - p}\right)$$
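The two definitions, and the way back from odds to probability, fit in a few lines. The function names below are assumptions for this sketch.

```python
import math

def odds(p):
    """Odds of an event with probability p (0 <= p < 1)."""
    return p / (1 - p)

def logit(p):
    """Log odds of an event with probability p (0 < p < 1)."""
    return math.log(odds(p))

def prob_from_odds(o):
    """Inverse of odds(): recover the probability from the odds."""
    return o / (1 + o)

p = 0.2
print(odds(p), logit(p), prob_from_odds(odds(p)))
```

A probability of 0.5 corresponds to odds of 1 and a logit of 0; for small p the odds and the probability are nearly equal.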
Odds
• The probability and odds of cough in kids with a history of bronchitis:
p = 26/273 = 0.095
o = 26/247 = 0.105
• The probability and odds of cough in kids without a history of bronchitis:
p = 44/1046 = 0.042
o = 44/1002 = 0.044
Bronchitis No bronchitis Total
Cough 26 (a) 44 (b) 70
No Cough 247 (c) 1002 (d) 1249
Total 273 1046 1319
$$o = \frac{p}{1 - p}$$
Odds ratio
• The odds ratio: the ratio of the odds of cough in kids with a history of bronchitis to the odds in kids without.
Bronchitis No bronchitis Total
Cough 26; 0.105 (a) 44; 0.0439 (b) 70
No Cough 247; 9.50 (c) 1002; 22.8 (d) 1249
Total 273 1046 1319
$$or = \frac{a/c}{b/d} = \frac{ad}{bc}$$
$$or = \frac{a/b}{c/d} = \frac{ad}{bc}$$
$$or = \frac{26/247}{44/1002} = \frac{26 \times 1002}{247 \times 44} = 2.40$$
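A short sketch confirms that both ways of forming the ratio reduce to the same cross-product ad/bc. The variable names follow the cell labels a, b, c, d in the table; the rest is an assumption for the example.

```python
a, b = 26, 44      # cough: bronchitis, no bronchitis
c, d = 247, 1002   # no cough: bronchitis, no bronchitis

odds_cough_bronchitis = a / c      # 26/247
odds_cough_no_bronchitis = b / d   # 44/1002

or_by_columns = odds_cough_bronchitis / odds_cough_no_bronchitis
or_by_rows = (a / b) / (c / d)     # odds of bronchitis history, cough vs no cough
or_cross_product = (a * d) / (b * c)

print(f"{or_by_columns:.2f} {or_by_rows:.2f} {or_cross_product:.2f}")
```

This symmetry means the odds ratio is the same whichever of the two variables is treated as the outcome.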
Is the odds ratio different from 1?
Bronchitis No bronchitis Total
Cough 26 (a) 44 (b) 70
No Cough 247 (c) 1002 (d) 1249
Total 273 1046 1319
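• We could take ln of the odds ratio. Is ln(or) different from zero?
$$\ln(or) = \ln(2.40) = 0.874$$
$$SE(\ln(or)) = \sqrt{\frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d}} = \sqrt{\frac{1}{26} + \frac{1}{44} + \frac{1}{247} + \frac{1}{1002}} = 0.257$$
• 95% confidence interval (assuming normality):
$$0.874 - 1.96 \times 0.257 \ \text{to} \ 0.874 + 1.96 \times 0.257 = 0.37 \ \text{to} \ 1.38$$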
Confidence interval of the odds ratio
• ln(or) ± 1.96*SE(ln(or)) = 0.37 to 1.38
• Returning to the odds ratio itself:
• e^0.370 to e^1.379 = 1.45 to 3.97
• The interval does not contain 1, so the odds ratio differs significantly from 1 at the 5% level
Bronchitis No bronchitis Total
Cough 26 (a) 44 (b) 70
No Cough 247 (c) 1002 (d) 1249
Total 273 1046 1319
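The whole calculation, from the four cell counts to the confidence interval on the odds ratio scale, can be sketched as one function. The name `odds_ratio_ci` and the default z = 1.96 are assumptions for the example; the normal approximation is applied on the log scale, as on the slides.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio ad/bc with an approximate confidence interval built on the log scale."""
    odds_ratio = (a * d) / (b * c)
    log_or = math.log(odds_ratio)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)   # SE of ln(or)
    lower, upper = log_or - z * se, log_or + z * se
    return odds_ratio, math.exp(lower), math.exp(upper)

odds_ratio, lower, upper = odds_ratio_ci(26, 44, 247, 1002)
print(f"or = {odds_ratio:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

Because the interval is symmetric on the log scale, it is not symmetric around 2.40 on the odds ratio scale.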
Chi-square for goodness of fit
• df = 4-1-1 = 2