the chi-square test for association

28
The Chi-Square Test for Association Math 137 Fall‘13 L. Burger 1

Upload: lam

Post on 23-Feb-2016

30 views

Category:

Documents


1 download

DESCRIPTION

The Chi-Square Test for Association. Math 137 Fall‘13 L. Burger. -The Chi Square Test. A statistical method used to determine goodness of fit Goodness of fit refers to how close the observed data are to those predicted from a hypothesis Note: - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: The Chi-Square Test for Association

1

The Chi-Square Test for AssociationMath 137

Fall‘13L. Burger

Page 2: The Chi-Square Test for Association

-The Chi Square Test• A statistical method used to determine goodness

of fit– Goodness of fit refers to how close the observed data

are to those predicted from a hypothesis• Note:– The chi square test does not prove that a hypothesis is

correct• It evaluates to what extent the data and the hypothesis have

a good fit

Page 3: The Chi-Square Test for Association

3

-Determine The Hypothesis:Whether There is an Association or Not

• Ho : The two variables are independent (not matching up well!)

• Ha : The two variables are associated (variables matching good --- called a ‘good fit!’)

Page 4: The Chi-Square Test for Association

Understanding the Chi-Square Distribution

• You see that there is a range here: if the results were perfect you get a chi-square value of 0 (because obs = exp). This rarely happens: most experiments give a small chi-square value (the hump in the graph).

• Note that all the values are greater than 0: that's because we squared the (obs - exp) term: squaring always gives a non-negative number.

• Sometimes you get really wild results, with obs very different from exp: the long tail on the graph. Really odd things occasionally do happen by chance alone (for instance, you might win the lottery).

Page 5: The Chi-Square Test for Association

Critical Chi-Square• Critical values for chi-square are found on tables,

sorted by degrees of freedom and probability levels. Be sure to use p = 0.05.

• If your calculated chi-square value is greater than the critical value from the table, you “reject the null hypothesis”.

• If your chi-square value is less than the critical value, you “fail to reject” the null hypothesis (that is, you accept that your theory about the expected ratio is correct).

Page 6: The Chi-Square Test for Association

6

-Calculating Test Statistics• Contrasts observed frequencies in each cell of a

contingency table with expected frequencies.• The expected frequencies represent the number of

cases that would be found in each cell if the null hypothesis were true ( i.e. the nominal variables are unrelated).

• Expected frequency of two unrelated events is product of the row and column frequency divided by number of cases.

Fe= Fr Fc / N

Page 7: The Chi-Square Test for Association

7

-Calculating Test Statistics

e

eo

FFF 2

2 )(

Page 8: The Chi-Square Test for Association

8

4. Calculating Test Statistics

e

eo

FFF 2

2 )(

Observedfrequencies

Expe

cted

frequ

ency

Expected

frequency

Page 9: The Chi-Square Test for Association

9

-Determine Degrees of Freedom

df = (R-1)(C-1)

Num

ber of

levels in column

variable

Num

ber of levels in row

variable

Page 10: The Chi-Square Test for Association

10

-Compare computed test statistic against a tabled/critical value

• The computed value of the Pearson chi- square statistic is compared with the critical value to determine if the computed value is improbable

• The critical tabled values are based on sampling distributions of the Pearson chi-square statistic

• If calculated 2 is greater than 2 table value, reject Ho

Page 11: The Chi-Square Test for Association

Example 1: One-dimensional

• Suppose we want to know how people in a particular area will vote in general and go around asking them.

• How will we go about seeing what’s really going on?

Republican Democrat Other

20 30 10

Page 12: The Chi-Square Test for Association

-Determine Hypotheses

• Ho : There is no difference between what is observed and what is expected.

• : There is an association between the observed and expected frequencies.

• Solution: chi-square analysis to determine if our outcome is different from what would be expected if there was no preference

22 ( )O E

E

Page 13: The Chi-Square Test for Association

• Plug in to formula

Republican Democrat Other

Observed 20 30 10Expected 20 20 20

2 2 2(20 20) (30 20) (10 20)20 20 20

Page 14: The Chi-Square Test for Association

• Reject H0

• The district will probably vote democratic, there is association between what is observed and what is expected.

• However…

2

2.05

(2) 10

5.99

Page 15: The Chi-Square Test for Association

Conclusion of Example 1• Note that all we really can conclude is that our data

is different from the expected outcome given a situation– Although it would appear that the district will vote

democratic, really we can only conclude they were not responding by chance

– Regardless of the position of the frequencies we’d have come up with the same result

– In other words, it is a non-directional test regardless of the prediction

Page 16: The Chi-Square Test for Association

16

Example 2-Two Dimensional

• Suppose a researcher is interested in voting preferences on gun control issues.

• A questionnaire was developed and sent to a random sample of 90 voters.

• The researcher also collects information about the political party membership of the sample of 90 respondents.

Page 17: The Chi-Square Test for Association

17

Bivariate Frequency Table or Contingency Table

Favor Neutral Oppose f row

Democrat 10 10 30 50

Republican 15 15 10 40

f column 25 25 40 n = 90

Page 18: The Chi-Square Test for Association

18

Bivariate Frequency Table or Contingency Table

Favor Neutral Oppose f row

Democrat 10 10 30 50

Republican 15 15 10 40

f column 25 25 40 n = 90

Observe

d

frequ

encie

s

Page 19: The Chi-Square Test for Association

19

Bivariate Frequency Table or Contingency Table

Favor Neutral Oppose f row

Democrat 10 10 30 50

Republican 15 15 10 40

f column 25 25 40 n = 90

Row

frequency

Page 20: The Chi-Square Test for Association

20

Bivariate Frequency Table or Contingency Table

Favor Neutral Oppose f row

Democrat 10 10 30 50

Republican 15 15 10 40

f column 25 25 40 n = 90Column frequency

Page 21: The Chi-Square Test for Association

21

-Determine The Hypothesis

• Ho : There is no difference between D & R in their opinion on gun control issue.

• Ha : There is an association between responses to the gun control survey and the party membership in the population.

Page 22: The Chi-Square Test for Association

22

-Calculating Test Statistics

Favor Neutral Oppose f row

Democrat fo =10fe =13.9

fo =10fe =13.9

fo =30fe=22.2

50

Republican fo =15fe =11.1

fo =15fe =11.1

fo =10fe =17.8

40

f column 25 25 40 n = 90

Page 23: The Chi-Square Test for Association

23

-Calculating Test Statistics

Favor Neutral Oppose f row

Democrat fo =10fe =13.9

fo =10fe =13.9

fo =30fe=22.2

50

Republican fo =15fe =11.1

fo =15fe =11.1

fo =10fe =17.8

40

f column 25 25 40 n = 90

= 50*25/90

Page 24: The Chi-Square Test for Association

24

-Calculating Test Statistics

Favor Neutral Oppose f row

Democrat fo =10fe =13.9

fo =10fe =13.9

fo =30fe=22.2

50

Republican fo =15fe =11.1

fo =15fe =11.1

fo =10fe =17.8

40

f column 25 25 40 n = 90

= 40* 25/90

Page 25: The Chi-Square Test for Association

25

-Calculating Test Statistics

8.17)8.1710(

11.11)11.1115(

11.11)11.1115(

2.22)2.2230(

89.13)89.1310(

89.13)89.1310(

222

2222

= 11.03

Page 26: The Chi-Square Test for Association

26

-Determine Degrees of Freedom

df = (R-1)(C-1) =(2-1)(3-1) = 2

Page 27: The Chi-Square Test for Association

27

-Compare computed test statistic against a tabled/critical value

• α = 0.05• df = 2• Critical tabled value = 5.991• Test statistic, 11.03, exceeds critical value• Null hypothesis is rejected

Page 28: The Chi-Square Test for Association

28

-Conclusion of Example 2

-Democrats & Republicans differ significantly in their opinions on gun control issues

• Ho : There is no difference between D & R in their opinion on gun control issue.

• Ha : There is an association between responses to the gun control survey and the party membership in the population.