ib math hl - charlotte county public schools · categorical data chi square tests are used for when...
TRANSCRIPT
![Page 1: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/1.jpg)
Chapter 25
Comparing Counts
![Page 2: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/2.jpg)
Objectives
Chi-Square Model
Chi-Square Statistic
Knowing when and how to use the Chi-Square Tests:
Goodness of Fit
Test of Independence
Test of Homogeneity
Standardized Residual
![Page 3: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/3.jpg)
Categorical Data
Chi-square tests are used when we have counts for the categories of a categorical variable:
Goodness of Fit Test
Allows us to test whether a certain population distribution seems valid. This is a one-variable, one-sample test.
Test of Independence
Cross-categorizes one group on two variables to see if there is an association between the variables. This is a two-variable, one-sample test.
Test for Homogeneity
Compares the observed distributions for several groups to each other to see if there is a difference among the populations. This is a one-variable, many-samples test.
![Page 4: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/4.jpg)
Chi Square Model
Just like the Student's t-models, the chi-square models form a family indexed by degrees of freedom.
Unlike the Student's t-models, a chi-square distribution is not symmetric. It is skewed right.
A chi-square test is always a one-sided, right-tailed test.
![Page 5: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/5.jpg)
The Chi-Square (χ²) Distribution - Properties
It is a continuous distribution.
It is not symmetric.
It is skewed to the right.
The distribution depends on the degrees of freedom.
The value of a χ² random variable is always nonnegative.
There are infinitely many χ² distributions, since each is uniquely defined by its degrees of freedom.
![Page 6: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/6.jpg)
The Chi-Square (χ²) Distribution - Properties
For a small number of degrees of freedom, the χ² distribution is very skewed to the right.
As the number of degrees of freedom n increases, the χ² distribution becomes more and more symmetrical.
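The shrinking skew can be made concrete with a standard property of the chi-square family that the slides do not state: a χ² distribution with k degrees of freedom has skewness √(8/k). The Python sketch below tabulates that value for a few k. The function name is ours, chosen only for illustration.

```python
import math

def chi2_skewness(df):
    """Skewness of a chi-square distribution with `df` degrees of freedom."""
    if df <= 0:
        raise ValueError("degrees of freedom must be positive")
    return math.sqrt(8 / df)

# The skewness falls toward 0 (perfect symmetry) as df grows.
for df in (1, 4, 10, 50, 200):
    print(f"df = {df:3d}  skewness = {chi2_skewness(df):.3f}")
```

The skewness never reaches zero, so the distribution stays right-skewed for every finite df.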
![Page 7: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/7.jpg)
The Chi-Square (χ²) Distribution - Properties
Since we will be using the χ² distribution for the tests in this chapter, we will need to be able to find critical values associated with the distribution.
![Page 8: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/8.jpg)
Critical Value
• Explanation of the term – critical or rejection region: A
critical or rejection region is a range of test statistic
values for which the null hypothesis will be rejected.
• This range of values will indicate that there is a
significant or large enough difference between the
postulated parameter value and the corresponding
point estimate for the parameter.
![Page 9: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/9.jpg)
Critical Value
• Explanation of the term – non-critical or non-rejection region: A non-critical or non-rejection region is a range of test statistic values for which the null hypothesis will not be rejected.
• This range of values will indicate that there is not a significant or large enough difference between the postulated parameter value and the corresponding point estimate for the parameter.
![Page 10: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/10.jpg)
Critical Value
Diagram labels: Non-Critical Region (Non-Rejection Region); Critical Region (Rejection Region)
![Page 11: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/11.jpg)
The Chi-Square (χ²) Distribution - Properties
Notation: $\chi^2_{\alpha,\,df}$
Explanation of the notation: $\chi^2_{\alpha,\,df}$ is a χ² value with n degrees of freedom (df = n) such that an area of α (the significance level) lies to the right of that χ² value.
![Page 12: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/12.jpg)
The Chi-Square (χ²) Distribution - Properties
Diagram explaining the notation $\chi^2_{\alpha,\,df}$
![Page 13: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/13.jpg)
The Chi-Square (χ²) Distribution - Table
![Page 14: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/14.jpg)
The Chi-Square (χ²) Distribution - Table
Values of the χ² random variable for the appropriate degrees of freedom can be obtained from the tables in the formula booklet.
Example: What is the value of $\chi^2_{0.05,\,10}$?
![Page 15: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/15.jpg)
The Chi-Square (χ²) Distribution - Table
α = 0.05
df = 10
χ² critical value
![Page 16: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/16.jpg)
The Chi-Square (χ²) Distribution - Table
Solution: From the table in the formula booklet, $\chi^2_{0.05,\,10} = 18.307$.
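When no table is at hand, a critical value can be approximated numerically. The sketch below is one possible approach, not the method used in the slides. It computes the right-tail area from the series for the regularized incomplete gamma function, then searches by bisection for the value whose right-tail area equals α. Both function names are hypothetical, and only the Python standard library is assumed.

```python
import math

def chi2_right_tail(x, df):
    """Approximate P(X > x) for a chi-square variable with `df` degrees of freedom.

    Uses the power series for the regularized lower incomplete gamma function.
    Adequate for table-sized x; very large x (above roughly 1000) may overflow.
    """
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a            # n = 0 term of the series
    total = term
    n = 0
    while abs(term) > 1e-16 * abs(total):
        n += 1
        term *= z / (a + n)   # next term: multiply by z / (a + n)
        total += term
    lower = total * math.exp(a * math.log(z) - z - math.lgamma(a))
    return min(1.0, max(0.0, 1.0 - lower))

def chi2_critical(alpha, df, tol=1e-10):
    """Value c such that the area to the right of c is alpha (bisection search)."""
    lo, hi = 0.0, 1.0
    while chi2_right_tail(hi, df) > alpha:   # widen until the answer is bracketed
        hi *= 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if chi2_right_tail(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

print(f"{chi2_critical(0.05, 10):.3f}")  # 18.307, matching the table value
```

The same call with `alpha=0.10` and `df=20` can be used to check the "Your Turn" question.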
![Page 17: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/17.jpg)
The Chi-Square (χ²) Distribution - Table
Your Turn: What is the value of $\chi^2_{0.10,\,20}$?
![Page 18: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/18.jpg)
The Chi-Square (χ²) Distribution - Table
$\chi^2_{0.10,\,20} = 28.41$
![Page 19: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/19.jpg)
CHI-SQUARE (χ²) TEST
FOR GOODNESS OF FIT
![Page 20: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/20.jpg)
Goodness-of-Fit
A test of whether the distribution of counts in
one categorical variable matches the
distribution predicted by a model is called a
goodness-of-fit test.
As usual, there are assumptions and
conditions to consider…
![Page 21: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/21.jpg)
Assumptions and Conditions
Counted Data Condition: Check that the data
are counts for the categories of a categorical
variable.
Independence Assumption: The counts in
the cells should be independent of each
other.
Randomization Condition: The individuals who
have been counted and whose counts are
available for analysis should be a random
sample from some population.
![Page 22: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/22.jpg)
Assumptions and Conditions
Sample Size Assumption: We must have
enough data for the methods to work.
Expected Cell Frequency Condition: We
should expect to see at least 5 individuals in
each cell.
This is similar to the condition that np
and nq be at least 10 when we tested
proportions.
![Page 23: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/23.jpg)
Calculations
Since we want to examine how well the
observed data reflect what would be
expected, it is natural to look at the
differences between the observed and
expected counts (Obs – Exp).
![Page 24: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/24.jpg)
Calculations (cont.)
The test statistic, called the chi-square (or chi-squared) statistic, is found by summing, over all cells, the squared deviations between the observed and expected counts, each divided by its expected count:
$$\chi^2 = \sum_{\text{all cells}} \frac{(Obs - Exp)^2}{Exp}$$
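The formula translates directly into a few lines of Python. The sketch below is illustrative only. The function name is ours, and the die-roll counts are hypothetical data invented for the example, not taken from the slides.

```python
def chi_square_statistic(observed, expected):
    """Sum of (Obs - Exp)^2 / Exp over all cells."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if any(e <= 0 for e in expected):
        raise ValueError("expected counts must be positive")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical example: 60 rolls of a die, 10 expected per face if the die is fair.
observed = [8, 12, 10, 10, 11, 9]
expected = [10] * 6
print(f"{chi_square_statistic(observed, expected):.3f}")  # 1.000
```

Each cell contributes a nonnegative amount, which is why only large values of the statistic count as evidence against the model.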
![Page 25: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/25.jpg)
One-Sided or Two-Sided?
The chi-square statistic is used only for testing hypotheses, not for constructing confidence intervals.
If the observed counts don’t match the expected, the statistic will be large—it can’t be “too small.”
So the chi-square test is always one-sided.
If the calculated statistic value is large enough, we’ll reject the null hypothesis.
![Page 26: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/26.jpg)
One-Sided or Two-Sided?
The mechanics may work like a one-sided test, but the interpretation of a chi-square test is in some ways many-sided.
There are many ways the null hypothesis could be wrong.
There’s no direction to the rejection of the null model—all we know is that it doesn’t fit.
![Page 27: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/27.jpg)
Procedure
![Page 28: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/28.jpg)
Procedure (cont.)
![Page 29: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/29.jpg)
Expected Frequencies
If the expected frequencies are not all equal, each expected frequency is found by multiplying the sum of all observed frequencies (n) by the probability for the category (p):
E = n · p
![Page 30: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/30.jpg)
Expected Frequencies
The chi-square goodness of fit test is always a right-tailed test.
For the chi-square goodness-of-fit test, the expected frequencies should be at least 5.
When the expected frequency of a class or category is less than 5, this class or category can be combined with another class or category so that the expected frequency is at least 5.
![Page 31: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/31.jpg)
Goodness-of-fit Test
Test Statistic
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
Critical Values
1. Found in the table using k – 1 degrees of freedom, where k = number of categories
2. Goodness-of-fit hypothesis tests are always right-tailed.
![Page 32: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/32.jpg)
EXAMPLE
There are 4 TV sets that are located in the student center of a large university. At a particular time each day, four different soap operas (1, 2, 3, and 4) are viewed on these TV sets. The percentages of the audience captured by these shows during one semester were 25 percent, 30 percent, 25 percent, and 20 percent, respectively. During the first week of the following semester, 300 students are surveyed.
![Page 33: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/33.jpg)
EXAMPLE (Continued)
(a) If the viewing pattern has not changed, what number of students is expected to watch each soap opera?
Solution: Based on the information, the expected values will be: 0.25 × 300 = 75, 0.30 × 300 = 90, 0.25 × 300 = 75, and 0.20 × 300 = 60.
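The same arithmetic, E = n · p for each category, can be sketched in Python. The function name is an assumption made for this illustration. The shares are entered as percentages so that the products stay exact.

```python
def expected_counts(n, percentages):
    """Expected count per category, E = n * p, with p given as a percentage."""
    if abs(sum(percentages) - 100) > 1e-9:
        raise ValueError("percentages must add up to 100")
    return [n * pct / 100 for pct in percentages]

# Audience shares for soap operas 1 to 4, and 300 surveyed students.
print(expected_counts(300, [25, 30, 25, 20]))  # [75.0, 90.0, 75.0, 60.0]
```

The expected counts always add up to the sample size, here 300.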
![Page 34: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/34.jpg)
EXAMPLE (Continued)
(b) Suppose that the actual observed numbers of students viewing the soap operas are given in the following table. Test whether these numbers indicate a change at the 1 percent level of significance.
![Page 35: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/35.jpg)
EXAMPLE (Continued)
Solution: Given α = 0.01 and n = 4 categories, df = 4 – 1 = 3 and $\chi^2_{0.01,\,3} = 11.345$. The observed and expected frequencies are given below.
![Page 36: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/36.jpg)
EXAMPLE (Continued)
Solution (continued): The χ² test statistic is computed below.
![Page 37: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/37.jpg)
EXAMPLE (Continued)
Solution (continued):
P-value = .6828, P > 𝛼
![Page 38: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/38.jpg)
EXAMPLE (Continued)
Solution (continued):
Diagram showing
the rejection
region.
![Page 39: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/39.jpg)
The Chi-Square test for Goodness of Fit
![Page 40: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/40.jpg)
Your Turn
The Advanced Placement (AP) Statistics examination was first administered in May 1997. Students’ papers are graded on a scale of 1–5, with 5 being the highest score. Over 7,600 students took the exam in the first year, and the distribution of scores was as follows (not including exams that were scored late).
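| Score | 5 | 4 | 3 | 2 | 1 |
|---|---|---|---|---|---|
| Percent | 15.3 | 22.0 | 24.8 | 19.8 | 18.1 |

A distance learning class that took AP Statistics via satellite television had the following distribution of grades:

| Score | 5 | 4 | 3 | 2 | 1 |
|---|---|---|---|---|---|
| Frequency | 7 | 13 | 7 | 6 | 2 |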
![Page 41: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/41.jpg)
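| Score | Observed Counts | Expected % ($p_i$) | Expected Counts ($np_i$) | $(O - E)^2 / E$ |
|---|---|---|---|---|
| 5 | 7 | 15.3 | 5.355 | 0.50533 |
| 4 | 13 | 22.0 | 7.7 | 3.6481 |
| 3 | 7 | 24.8 | 8.68 | 0.32516 |
| 2 | 6 | 19.8 | 6.93 | 0.12481 |
| 1 | 2 | 18.1 | 6.335 | 2.9664 |
| Totals | 35 | 100% | 35 | 7.56976 |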
![Page 42: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/42.jpg)
Carry out an appropriate test to determine if
the distribution of scores for students enrolled
in the distance learning program is
significantly different from the distribution of
scores for all students who took the inaugural
exam.
![Page 43: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/43.jpg)
We must be willing to treat this class of students as an SRS from the population of all distance learning classes, so we will proceed with caution. All expected counts are 5 or more.
H0: The distribution of AP Statistics exam scores for distance learning students is the same as the distribution of scores for all students who took the May 1997 exam.
Ha: The distribution of AP Statistics exam scores for distance learning students is different from the distribution of scores for all students who took the May 1997 exam.
![Page 44: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/44.jpg)
We will use a significance level of 0.05. There are 5 categories, meaning there are 5 – 1 = 4 degrees of freedom.
$$\chi^2_4 = 7.56976$$
$$\text{P-value} = P(\chi^2_4 > 7.56976) = 0.1087$$
Since p > α, we do not have enough evidence to reject H0. We do not have enough evidence to suggest that the distribution of scores of traditional students is different from the distribution of scores of the distance learning students.
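The whole calculation for this example can be reproduced with a short Python sketch. It relies on a fact not stated in the slides: for exactly 4 degrees of freedom the right-tail area has the closed form e^(−x/2)(1 + x/2), so this p-value shortcut would not carry over to other df. The variable names are ours.

```python
import math

observed = [7, 13, 7, 6, 2]                    # class counts for scores 5, 4, 3, 2, 1
percent = [15.3, 22.0, 24.8, 19.8, 18.1]       # 1997 national distribution (%)

n = sum(observed)                              # 35 students
expected = [n * p / 100 for p in percent]      # E = n * p for each score
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1                         # 5 categories -> 4 degrees of freedom

# Closed-form right-tail area, valid only when df == 4.
p_value = math.exp(-chi2 / 2) * (1 + chi2 / 2)

print(f"chi-square = {chi2:.5f}, df = {df}, p-value = {p_value:.4f}")
print("reject H0" if p_value < 0.05 else "fail to reject H0")
```

The smallest expected count is 5.355, so the expected cell frequency condition is met, though only just.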
![Page 45: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/45.jpg)
χ² TEST OF INDEPENDENCE
![Page 46: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/46.jpg)
Independence
Contingency tables categorize counts on two (or more) variables so that we can see whether the distribution of counts on one variable is contingent on the other.
A test of whether the two categorical variables are independent examines the distribution of counts for one group of individuals classified according to both variables in a contingency table.
![Page 47: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/47.jpg)
Definition
Test of Independence
This method tests the null hypothesis that the row variable and column variable in a contingency table are not related. (The null hypothesis is the statement that the row and column variables are independent.)
![Page 48: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/48.jpg)
Assumptions and Conditions
The assumptions and conditions are the same as for the chi-square goodness-of-fit test:
Counted Data Condition: The data must be counts.
Randomization Condition and 10% Condition: As long as we don’t want to generalize, we don’t have to check these conditions.
Expected Cell Frequency Condition: The expected count in each cell must be at least 5.
![Page 49: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/49.jpg)
Test of Independence
Test Statistic
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
Critical Values
1. Found in the table using degrees of freedom = (r – 1)(c – 1), where r is the number of rows and c is the number of columns
2. Tests of Independence are always right-tailed.
![Page 50: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/50.jpg)
Tests of Independence
H0: The row variable is independent of the column variable.
H1: The row variable is dependent on (related to) the column variable.
This procedure cannot be used to establish a direct cause-and-effect link between the variables in question.
Dependence means only that there is a relationship between the two variables.
![Page 51: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/51.jpg)
Expected Frequency for Contingency Tables
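$$E = \underbrace{(\text{table total})}_{n} \cdot \underbrace{\frac{\text{row total}}{\text{table total}} \cdot \frac{\text{column total}}{\text{table total}}}_{p\ \text{(probability of a cell)}}$$
$$E = \frac{(\text{row total})(\text{column total})}{\text{table total}}$$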
![Page 52: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/52.jpg)
$$E = \frac{(\text{row total})(\text{column total})}{\text{table total}}$$
The table total is the total number of all observed frequencies in the table.
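Applied to every cell at once, the rule E = (row total)(column total)/(table total) can be sketched as follows. The function name and the small 2 × 2 table are hypothetical, used only to illustrate the computation.

```python
def expected_table(observed):
    """Expected counts under independence for a table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    table_total = sum(row_totals)
    return [[r * c / table_total for c in col_totals] for r in row_totals]

# Hypothetical 2 x 2 table: row totals 30 and 70, column totals 40 and 60.
print(expected_table([[10, 20], [30, 40]]))  # [[12.0, 18.0], [28.0, 42.0]]
```

The expected table keeps the same row and column totals as the observed table. Only the counts inside the cells change.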
![Page 53: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/53.jpg)
Observed and Expected Frequencies
| | Men | Women | Boys | Girls | Total |
|---|---|---|---|---|---|
| Survived | 332 | 318 | 29 | 27 | 706 |
| Died | 1360 | 104 | 35 | 18 | 1517 |
| Total | 1692 | 422 | 64 | 45 | 2223 |

We will use the mortality table from the Titanic to find expected frequencies. For the upper left hand cell, we find:
$$E = \frac{(706)(1692)}{2223} = 537.360$$
![Page 54: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/54.jpg)
Observed and Expected Frequencies

| | Men | Women | Boys | Girls | Total |
|---|---|---|---|---|---|
| Survived | 332 (537.360) | 318 | 29 | 27 | 706 |
| Died | 1360 | 104 | 35 | 18 | 1517 |
| Total | 1692 | 422 | 64 | 45 | 2223 |

Expected frequencies are shown in parentheses.
Find the expected frequency for the lower left hand cell, assuming independence between the row variable and the column variable.
$$E = \frac{(1517)(1692)}{2223} = 1154.640$$
![Page 55: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/55.jpg)
Observed and Expected Frequencies

| | Men | Women | Boys | Girls | Total |
|---|---|---|---|---|---|
| Survived | 332 (537.360) | 318 (134.022) | 29 (20.326) | 27 (14.291) | 706 |
| Died | 1360 (1154.64) | 104 (287.978) | 35 (43.674) | 18 (30.709) | 1517 |
| Total | 1692 | 422 | 64 | 45 | 2223 |

Expected frequencies are shown in parentheses.
To interpret this result for the lower left hand cell, we can say that although 1360 men actually died, we would have expected 1154.64 men to die if survival were independent of whether the person is a man, woman, boy, or girl.
![Page 56: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/56.jpg)
Example: Using a 0.05 significance level, test the claim
that when the Titanic sank, whether someone survived or
died is independent of whether that person is a man,
woman, boy, or girl.
H0: Whether a person survived is independent of whether
the person is a man, woman, boy, or girl.
H1: Surviving the Titanic and being a man, woman, boy,
or girl are dependent.
![Page 57: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/57.jpg)
$$\chi^2 = \frac{(332-537.36)^2}{537.36} + \frac{(318-134.022)^2}{134.022} + \frac{(29-20.326)^2}{20.326} + \frac{(27-14.291)^2}{14.291} + \frac{(1360-1154.64)^2}{1154.64} + \frac{(104-287.978)^2}{287.978} + \frac{(35-43.674)^2}{43.674} + \frac{(18-30.709)^2}{30.709}$$
$$\chi^2 = 78.481 + 252.555 + 3.702 + 11.302 + 36.525 + 117.536 + 1.723 + 5.260 = 507.084$$
![Page 58: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/58.jpg)
The number of degrees of freedom is (r – 1)(c – 1) = (2 – 1)(4 – 1) = 3.
Critical value: $\chi^2_{0.05,\,3} = 7.815$. Since 507.084 > 7.815, we reject the null hypothesis.
P-value: P = P(χ² > 507.084) ≈ 0, so P < α and we reject the null hypothesis.
Survival and passenger category (man, woman, boy, or girl) are dependent.
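The Titanic test can be checked end to end with a Python sketch using only the standard library. Two assumptions go beyond the slides. The p-value uses the closed form erfc(√(x/2)) + √(2x/π)·e^(−x/2), which holds only for 3 degrees of freedom. The expected counts are also kept unrounded, so the statistic comes out near 507.08 rather than exactly the 507.084 obtained from rounded expected counts.

```python
import math

observed = [
    [332, 318, 29, 27],     # survived: men, women, boys, girls
    [1360, 104, 35, 18],    # died:     men, women, boys, girls
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
table_total = sum(row_totals)

# Expected count under independence: (row total)(column total) / (table total).
expected = [[r * c / table_total for c in col_totals] for r in row_totals]

chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)

# Right-tail area for a chi-square variable; this closed form is valid only for df == 3.
p_value = math.erfc(math.sqrt(chi2 / 2)) + math.sqrt(2 * chi2 / math.pi) * math.exp(-chi2 / 2)

critical = 7.815            # table value for alpha = 0.05, df = 3
print(f"chi-square = {chi2:.2f}, df = {df}, p-value = {p_value:.2e}")
print("reject H0" if chi2 > critical else "fail to reject H0")
```

The p-value is not literally zero, only far too small to matter at any usual significance level.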
![Page 59: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/59.jpg)
Test Statistic χ² = 507.084
with α = 0.05 and (r – 1)(c – 1) = (2 – 1)(4 – 1) = 3 degrees of freedom
Critical Value χ² = 7.815 (from the table)
![Page 60: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/60.jpg)
Procedure
![Page 61: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/61.jpg)
Procedure (cont.)
![Page 62: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/62.jpg)
EXAMPLE
A survey was done by a car manufacturer concerning a particular make and model. A group of 500 potential customers was asked whether they purchased their current car because of its appearance, its performance rating, or its fixed price (no negotiating). The results, broken down by gender, are given on the next slide.
![Page 63: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/63.jpg)
EXAMPLE (Continued)
Question: Do females feel differently than males about the three different criteria used in choosing a car, or do they feel basically the same?
![Page 64: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/64.jpg)
Solution
χ2 Test for independence.
Thus the null hypothesis will be that the criterion used is independent of gender, while the alternative hypothesis will be that the criterion used is dependent on gender.
![Page 65: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/65.jpg)
Solution (continued)
The number of degrees of freedom is given by (number of rows – 1)(number of columns – 1).
df = (2 – 1)(3 – 1) = 2.
![Page 66: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/66.jpg)
Solution (continued)
Calculate the row and column totals. These row and column totals are called marginal totals.
![Page 67: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/67.jpg)
Solution (continued)
Computation of the expected values
The expected value for a cell is the row total times the column total divided by the table total.
![Page 68: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/68.jpg)
Solution (continued)
Let us use α = 0.01. So df = (2 – 1)(3 – 1) = 2 and the critical value is χ² (α = 0.01, df = 2) = 9.210.
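The table value 9.210 can be checked directly. With exactly 2 degrees of freedom, the right-tail area of the chi-square distribution is exp(−x/2), so the critical value is −2 ln α. The Python sketch below is an added illustration; the function name is hypothetical and the shortcut holds only for df = 2.

```python
import math

def chi_square_critical_df2(alpha):
    """Critical value for df = 2, where P(chi-square > x) = exp(-x/2)."""
    return -2 * math.log(alpha)

critical = chi_square_critical_df2(0.01)
print(round(critical, 3))
```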
![Page 69: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/69.jpg)
Solution (continued)
The χ² test statistic is computed in the same manner as was done for the goodness-of-fit test.
![Page 70: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/70.jpg)
Solution (continued)
![Page 71: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/71.jpg)
Solution (continued)
Diagram showing the rejection region.
![Page 72: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/72.jpg)
Test of Homogeneity
![Page 73: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/73.jpg)
Comparing Observed Distributions
A test comparing the distribution of counts for
two or more groups on the same categorical
variable is called a chi-square test of
homogeneity.
A test of homogeneity is actually the
generalization of the two-proportion z-test.
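To make the link with the two-proportion z-test concrete, here is a small Python sketch (an added illustration, not from the slides). For two groups and a yes/no response, the square of the pooled two-proportion z statistic equals the chi-square statistic. The function names are illustrative, and the counts are borrowed from the 1992 and 2008 columns of the teacher-prestige survey that appears later in this chapter.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z statistic for x1 of n1 versus x2 of n2."""
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (x1 / n1 - x2 / n2) / standard_error

def chi_square_2x2(x1, n1, x2, n2):
    """Chi-square statistic for the 2x2 table of successes and failures."""
    observed = [[x1, x2], [n1 - x1, n2 - x2]]
    row_totals = [sum(row) for row in observed]
    col_totals = [n1, n2]
    total = n1 + n2
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed[i][j] - expected) ** 2 / expected
    return statistic

z = two_proportion_z(418, 1020, 525, 1010)
chi2 = chi_square_2x2(418, 1020, 525, 1010)
print(round(z ** 2, 6), round(chi2, 6))
```

The two printed numbers agree, which is why the chi-square test with more than two groups can be read as an extension of the z-test.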
![Page 74: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/74.jpg)
Comparing Observed Distributions (cont.)
The statistic that we calculate for this test is identical to the chi-square statistic for independence.
In this test, however, we ask whether choices are the same among different groups (i.e., there is no hypothesized model).
The expected counts are found directly from the data, and the degrees of freedom, (r – 1)(c – 1), differ from those of the goodness-of-fit test.
![Page 75: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/75.jpg)
Assumptions and Conditions
The assumptions and conditions are the same as for the chi-square goodness-of-fit test:
Counted Data Condition: The data must be counts.
Randomization Condition and 10% Condition: As long as we don’t want to generalize, we don’t have to check these conditions.
Expected Cell Frequency Condition: The expected count in each cell must be at least 5.
![Page 76: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/76.jpg)
Test for Homogeneity
In a chi-square test for homogeneity of
proportions, we test whether different
populations have the same proportion of
individuals with some characteristic.
The procedures for performing a test of
homogeneity are identical to those for a test
of independence.
![Page 77: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/77.jpg)
Example:
The following question was asked of a random sample of individuals in 1992, 2002, and 2008: “Would you tell me if you feel being a teacher is an occupation of very great prestige?” The results of the survey are presented below:
| | 1992 | 2002 | 2008 |
|---|---|---|---|
| Yes | 418 | 479 | 525 |
| No | 602 | 541 | 485 |

Test the claim that the proportion of individuals that feel being a teacher is an occupation of very great prestige is the same for each year at the α = 0.01 level of significance.
![Page 78: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/78.jpg)
Solution
Step 1: The null hypothesis is a statement of “no difference” so the proportions for each year who feel that being a teacher is an occupation of very great prestige are equal. We state the hypotheses as follows:
H0: p₁₉₉₂ = p₂₀₀₂ = p₂₀₀₈
H1: At least one of the proportions is different from the others.
Step 2: The level of significance is α = 0.01.
![Page 79: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/79.jpg)
Solution
Step 3:
(a) The expected frequencies are found by
multiplying the appropriate row and column
totals and then dividing by the total sample
size. They are given in parentheses in the
table below, along with the observed
frequencies.
| | 1992 | 2002 | 2008 |
|---|---|---|---|
| Yes | 418 (475.554) | 479 (475.554) | 525 (470.892) |
| No | 602 (544.446) | 541 (544.446) | 485 (539.108) |
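The expected frequencies in the table can be recomputed from the marginal totals. The following Python sketch is an added illustration (the function name `expected_counts` is an assumption); it also checks the requirement that every expected frequency be at least 5.

```python
def expected_counts(observed):
    """Expected count for each cell: row total * column total / table total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]

# Columns: 1992, 2002, 2008
observed = [
    [418, 479, 525],  # Yes
    [602, 541, 485],  # No
]
expected = expected_counts(observed)
for row in expected:
    print([round(e, 3) for e in row])

# Expected Cell Frequency Condition: every expected count is at least 5
condition_met = all(e >= 5 for row in expected for e in row)
print(condition_met)
```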
![Page 80: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/80.jpg)
Solution
Step 3:
(b) Since none of the expected frequencies are
less than 5, the requirements are satisfied.
(c) The test statistic is
$$\chi_0^2 = \frac{(418-475.554)^2}{475.554} + \frac{(479-475.554)^2}{475.554} + \cdots + \frac{(485-539.108)^2}{539.108} = 24.74$$
![Page 81: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/81.jpg)
Solution: Classical Approach
Step 4: There are r = 2 rows and c = 3
columns, so we find the critical
value using (2 – 1)(3 – 1) = 2 degrees
of freedom.
The critical value is χ²₀.₀₁ = 9.210.
![Page 82: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/82.jpg)
Solution: Classical Approach
Step 5: Since the test statistic, χ₀² = 24.74,
is greater than the critical value
χ²₀.₀₁ = 9.210, we reject the null hypothesis.
![Page 83: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/83.jpg)
Solution: P-Value Approach
Step 4: There are r = 2 rows and c = 3
columns, so we find the P-value using
(2 – 1)(3 – 1) = 2 degrees of freedom.
The P-value is the area under the chi-
square distribution with 2 degrees of
freedom to the right of χ₀² = 24.74,
which is approximately 0.
![Page 84: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/84.jpg)
Solution: P-Value Approach
Step 5: Since the P-value is less than the
level of significance α = 0.01, we
reject the null hypothesis.
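To see how small "approximately 0" is here, the P-value can be computed directly. With exactly 2 degrees of freedom the right-tail area is exp(−x/2). This Python sketch is an added illustration and the shortcut applies only when df = 2.

```python
import math

chi2_stat = 24.74   # test statistic from Step 3
alpha = 0.01        # level of significance from Step 2

# For df = 2 the right-tail area has the closed form exp(-x/2)
p_value = math.exp(-chi2_stat / 2)
print(p_value, p_value < alpha)
```

The area is on the order of a few millionths, far below α = 0.01.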
![Page 85: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/85.jpg)
Solution
Step 6: There is sufficient evidence to
reject the null hypothesis at the α =
0.01 level of significance. We
conclude that the proportion of
individuals who believe that
teaching is a very prestigious career
is different for at least one of the
three years.
![Page 86: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/86.jpg)
Example: Should Dentists Advertise?
It may seem hard to believe, but until the
1970s most professional organizations
prohibited their members from advertising. In
1977, the U.S. Supreme Court ruled that
prohibiting doctors and lawyers from
advertising violated their free speech rights.
![Page 87: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/87.jpg)
Should Dentists Advertise?
The paper “Should Dentists Advertise?” (J. of
Advertising Research (June 1982): 33 – 38)
compared the attitudes of consumers and
dentists toward the advertising of dental
services. Separate samples of 101
consumers and 124 dentists were asked to
respond to the following statement: “I favor
the use of advertising by dentists to attract
new patients.”
![Page 88: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/88.jpg)
Example: Should Dentists Advertise?
Possible responses were: strongly agree,
agree, neutral, disagree, strongly disagree.
The authors were interested in determining
whether the two groups—dentists and
consumers—differed in their attitudes toward
advertising.
![Page 89: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/89.jpg)
Example: Should Dentists Advertise?
This is done with a chi-squared test of
homogeneity; that is, we are testing the claim
that different populations have the same
proportions across the categories of a second variable.
So how should we state the null and
alternative hypotheses for this test?
![Page 90: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/90.jpg)
Example: Should Dentists Advertise?
H0: The true category proportions for all
responses are the same for both populations
of consumers and dentists.
Ha: The true category proportions for all
responses are not the same for both
populations of consumers and dentists.
![Page 91: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/91.jpg)
Observed Data

| Group \ Response | Strongly Agree | Agree | Neutral | Disagree | Strongly Disagree | Total |
|---|---|---|---|---|---|---|
| Consumers | 34 | 49 | 9 | 4 | 5 | 101 |
| Dentists | 9 | 18 | 23 | 28 | 46 | 124 |
| Total | 43 | 67 | 32 | 32 | 51 | 225 |

• How do we determine the expected cell count under the assumption of homogeneity?
• That’s right, the expected cell counts are estimated from the sample data (assuming that H0 is true) by using …

$$\text{expected cell count} = \frac{(\text{row marginal total})(\text{column marginal total})}{\text{total sample size}}$$
![Page 92: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/92.jpg)
Expected Values
• So the calculation for the first cell (Consumers, Strongly Agree; row total 101, column total 43, table total 225) is …

$$\text{1st expected cell count} = \frac{(101)(43)}{225} = 19.302 \approx 19.30$$
![Page 93: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/93.jpg)
Expected Values
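Observed counts, with expected counts in parentheses:

| Group \ Response | Strongly Agree | Agree | Neutral | Disagree | Strongly Disagree | Total |
|---|---|---|---|---|---|---|
| Consumers | 34 (19.30) | 49 (30.08) | 9 (14.36) | 4 (14.36) | 5 (22.89) | 101 |
| Dentists | 9 (23.70) | 18 (36.92) | 23 (17.64) | 28 (17.64) | 46 (28.11) | 124 |
| Total | 43 | 67 | 32 | 32 | 51 | 225 |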
![Page 94: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/94.jpg)
Test Statistic
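Now we can calculate the χ² test statistic:

$$
\begin{aligned}
\chi^2 &= \sum \frac{(\text{Observed Count} - \text{Expected Count})^2}{\text{Expected Count}} \\
&= \frac{(34-19.30)^2}{19.30} + \frac{(49-30.08)^2}{30.08} + \cdots + \frac{(46-28.11)^2}{28.11} \\
&= 11.20 + 11.90 + 2.00 + \cdots + 11.39 = 84.47
\end{aligned}
$$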
![Page 95: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/95.jpg)
Sampling Distribution
The two-way table for this situation has 2
rows and 5 columns, so the appropriate
degrees of freedom is (2 – 1)(5 – 1) = 4.
Chi-squared critical value (α = 0.05, df = 4): χ²* = 9.49. Since χ² = 84.47 > χ²* = 9.49, reject the null hypothesis.
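The critical value 9.49 is the point that leaves an area of 0.05 in the right tail of the chi-square distribution with 4 degrees of freedom. As a rough check, the Python sketch below (an added illustration with hypothetical function names) finds that point by bisection, using the closed-form tail area that holds for exactly 4 degrees of freedom.

```python
import math

def chi_square_sf_df4(x):
    """Right-tail area for df = 4: exp(-x/2) * (1 + x/2)."""
    return math.exp(-x / 2) * (1 + x / 2)

def critical_value_df4(alpha, lo=0.0, hi=100.0, tol=1e-9):
    """Find x with right-tail area alpha by bisection."""
    # The tail area decreases as x grows, so move the bracket accordingly
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if chi_square_sf_df4(mid) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

critical = critical_value_df4(0.05)
print(round(critical, 2), chi_square_sf_df4(84.47))
```

The second printed number is the tail area beyond the test statistic, which is effectively zero.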
![Page 96: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/96.jpg)
P-value
P-value: P = P(χ² > 84.47) ≈ 0. Reject the null
hypothesis.
Conclusion: With a P-value ≈ 0, reject the
null hypothesis. The true category proportions
for all responses are not the same for both
populations of consumers and dentists.
![Page 97: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/97.jpg)
Homogeneity of Proportions
An advertising firm has decided to ask 92
customers at each of three local shopping
malls if they are willing to take part in a
market research survey. According to
previous studies, 38% of Americans refuse to
take part in such surveys. At α = 0.01, test the
claim that the proportions are equal.
![Page 98: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/98.jpg)
Homogeneity of Proportions
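Step 1
Ho: p1 = p2 = p3
Ha: At least one is different
Step 2
α = 0.01
Step 3
χ² test with (2 – 1)(3 – 1) = 2 degrees of freedom

| | Mall A | Mall B | Mall C | Total |
|---|---|---|---|---|
| Will participate | 52 | 45 | 36 | 133 |
| Will not participate | 40 | 47 | 56 | 143 |
| Total | 92 | 92 | 92 | 276 |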
![Page 99: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/99.jpg)
Homogeneity of Proportions
Step 4
Put into your calculator
Observed in matrix A
Expected in matrix B
Test statistic = 5.602
P-value = 0.06
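For readers without a graphing calculator, the same numbers can be obtained with a short Python sketch (an added illustration, not the calculator procedure itself). The list `observed` plays the role of matrix A and `expected` the role of matrix B. The P-value line uses the shortcut exp(−x/2), which is valid only because this table has 2 degrees of freedom.

```python
import math

# Matrix A: observed counts (columns: Mall A, Mall B, Mall C)
observed = [
    [52, 45, 36],  # will participate
    [40, 47, 56],  # will not participate
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Matrix B: expected counts = row total * column total / table total
expected = [[r * c / total for c in col_totals] for r in row_totals]

statistic = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)

# Right-tail area for df = 2 only
p_value = math.exp(-statistic / 2)
print(round(statistic, 3), df, round(p_value, 2))
```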
![Page 100: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/100.jpg)
Homogeneity of Proportions
Step 5
Do Not Reject Ho
Step 6
There is not sufficient evidence to suggest that
at least one proportion is different.
![Page 101: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/101.jpg)
Chi-Square and Causation
Chi-square tests are common, and tests for independence are especially widespread.
We need to remember that a small P-value is not proof of causation.
Since the chi-square test for independence treats the two variables symmetrically, we cannot differentiate the direction of any possible causation even if it existed.
And, there’s never any way to eliminate the possibility that a lurking variable is responsible for the lack of independence.
![Page 102: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/102.jpg)
Chi-Square and Causation (cont.)
In some ways, a failure of independence
between two categorical variables is less
impressive than a strong, consistent, linear
association between quantitative variables.
Two categorical variables can fail the test of
independence in many ways.
Examining the standardized residuals can help
you think about the underlying patterns.
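As an illustration of examining residuals, the Python sketch below computes standardized residuals for the consumers and dentists table from the earlier example. It assumes the usual definition, (observed − expected)/√expected, and the function name is hypothetical. Cells with large positive or negative values are the ones that contribute most to the chi-square statistic.

```python
import math

def standardized_residuals(observed):
    """(observed - expected) / sqrt(expected) for each cell of a two-way table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    residuals = []
    for row, r in zip(observed, row_totals):
        residual_row = []
        for o, c in zip(row, col_totals):
            e = r * c / total  # expected count for this cell
            residual_row.append((o - e) / math.sqrt(e))
        residuals.append(residual_row)
    return residuals

# Columns: strongly agree, agree, neutral, disagree, strongly disagree
survey = [
    [34, 49, 9, 4, 5],     # consumers
    [9, 18, 23, 28, 46],   # dentists
]
residuals = standardized_residuals(survey)
for row in residuals:
    print([round(value, 2) for value in row])
```

The signs show the pattern behind the rejection: consumers are over-represented on the agree side and dentists on the disagree side.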
![Page 103: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/103.jpg)
CHI-SQUARE INFERENCE
TEST FOR GOODNESS OF FIT
• Used to determine if a particular population distribution fits a specified form
HYPOTHESES:
H0: Actual population percents are equal to
hypothesized percentages
Ha: Actual population percents are different from
hypothesized percentages
![Page 104: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/104.jpg)
CHI-SQUARE INFERENCE
TEST FOR INDEPENDENCE
• Used to determine if two variables within a single population are independent
HYPOTHESES:
H0: There is no relationship between the two variables
in the population
Ha: There is a dependent relationship between the two
variables in the population
![Page 105: IB Math HL - Charlotte County Public Schools · Categorical Data Chi Square tests are used for when we have counts for the categories of a categorical variable: Goodness of Fit Test](https://reader033.vdocuments.site/reader033/viewer/2022042207/5eaa8798306ce6073623b88b/html5/thumbnails/105.jpg)
CHI-SQUARE INFERENCE
TEST FOR HOMOGENEITY
• Used to determine if two or more separate populations are similar with respect to a single variable
HYPOTHESES:
H0: There are no differences among proportions of
success in the populations
Ha: There are differences among proportions of
success in the populations