chi square - byu linguistics & english...



Page 1: Chi square - BYU Linguistics & English Languagelinguistics.byu.edu/faculty/deddingt/604/chisquare.pdf · Chi square Chi-square uses categorical data. It looks at how many things fall

Chi square

Chi-square uses categorical data. It looks at how many things fall into different categories and calculates whether the observed counts differ significantly from what random chance would produce.

Example:

Number of responses on a multiple choice test.

Jill loves the taste of coffee. If she were able to she would ___ all her food so it tasted

like coffee.

Response     Vowel in response choice                # of observed responses   # of expected responses
c[ʌ]ffify    Vowel not found in coffee or caffeine   70                        123
c[ɑ]ffify    Vowel found in coffee                   113                       123
c[æ]ffify    Vowel found in caffeine                 186                       123

Is 186, 113, 70 really different from what random choice would give (i.e. 123, 123, 123)?

Chi-square produces two numbers, a Χ2 value and a p value. The data we are considering yield a Χ2 of 55.92 and a p value below .0005. The small p value indicates that getting the 70, 113, 186 distribution by chance is extremely unlikely, so something other than chance must be responsible.
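For readers who want to check these numbers outside SPSS, the following is a minimal sketch in Python. It assumes the SciPy library is installed; SciPy is not part of the original handout. The function scipy.stats.chisquare compares observed counts against equal expected counts by default, which matches the 123, 123, 123 expectation above.

```python
from scipy.stats import chisquare

# Observed responses for c[^]ffify, c[a]ffify, c[ae]ffify (from the table above)
observed = [70, 113, 186]

# With no expected counts given, chisquare assumes equal counts
# in every category: 369 / 3 = 123 each.
chi2_stat, p_value = chisquare(observed)

print(f"Chi-square = {chi2_stat:.3f}")
print(f"p below .0005? {p_value < 0.0005}")
```

The statistic is the sum of (observed - expected) squared, divided by expected, over the three categories.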


To analyze the 1 x 3 coffee/caffeine data:

Specify the dependent variable:

Click on Data > Weight Cases. Click on Weight Cases by, then on the VowelCount

variable in the left-hand window. Click on the arrow to move it to the box under

Frequency Variable > OK (Figure 4.3).

Figure 4.3. Weight Cases dialog box.


Carry out the Chi-square:

Click on Analyze > Nonparametric Tests > Legacy Dialogs > Chi Square (Figure 4.4).

Figure 4.4. Finding the Chi-square menu.


Move Vowel to Test Variable List by clicking on Vowel then on the arrow between the

boxes > OK (Figure 4.5).

Figure 4.5. Chi-square dialog box.


SPSS produces the data in Table 4.2. (Disregard the error about the weight value. SPSS just

doesn't like the fact that the data columns in the spreadsheet don't have equal numbers of rows.)

Table 4.2. SPSS Chi-square output for the coffee/caffeine study.

Vowel
            Observed N   Expected N   Residual
c^ffify     70           123.0        -53.0
cahffify    113          123.0        -10.0
caeffify    186          123.0        63.0
Total       369

Test Statistics
              Vowel
Chi-Square    55.919
df            2
Asymp. Sig.   .000


Example problem:

In Linguistics courses the enrollments are 300 men and 75 women. Are men more likely to study linguistics than women? (Check in calculator.)

Problem: The university as a whole is 60% men and 40% women, so equal enrollments are not the right expectation.

What are the expected enrollments with 375 students? (Use a calculator that lets you put in expected values by hand.)
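The two versions of this problem can be sketched in Python as follows. This assumes SciPy is available, which the handout itself does not use. The first test takes equal expected counts; the second enters the expected counts by hand, using the 60%/40% university split applied to the 375 enrolled students.

```python
from scipy.stats import chisquare

observed = [300, 75]   # men, women enrolled in Linguistics courses
total = sum(observed)  # 375 students

# Version 1: expect equal numbers of men and women.
expected_equal = [total / 2, total / 2]

# Version 2: expect the university-wide split of 60% men, 40% women.
expected_univ = [total * 60 / 100, total * 40 / 100]

chi2_equal, p_equal = chisquare(observed, f_exp=expected_equal)
chi2_univ, p_univ = chisquare(observed, f_exp=expected_univ)

print("Expected (60/40):", expected_univ)
print(f"Equal expectation: chi-square = {chi2_equal:.2f}")
print(f"60/40 expectation: chi-square = {chi2_univ:.2f}")
```

Both versions have one degree of freedom, since there are two categories.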

What is the output of a chi-square test?

Χ2, when high, indicates that the counts in each category are unlikely to result from random distribution.

Degrees of freedom (df) are needed to interpret Χ2. For a two-dimensional table, the degrees of freedom are the number of rows minus one, multiplied by the number of columns minus one. So in a 3 by 2 table df = (3-1) x (2-1) = 2. For a single row of categories, such as the 1 x 3 coffee/caffeine data, df is the number of categories minus one.

How do you report these numbers? If Chi-square is 99.56, df=4, and p is smaller than 0.05 then:

Χ2 (4) = 99.56, p < 0.05.
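As a rough illustration, the degrees of freedom for a two-dimensional table are (rows - 1) multiplied by (columns - 1), and the reporting format can be produced with a small helper. In the sketch below, the function names table_df and report, the 3 x 2 counts, and the p value of 0.001 are all hypothetical, chosen only for demonstration. SciPy is assumed to be installed and is used only to confirm the df.

```python
import numpy as np
from scipy.stats import chi2_contingency

def table_df(n_rows, n_cols):
    """Degrees of freedom for a rows-by-columns table of counts."""
    return (n_rows - 1) * (n_cols - 1)

def report(chi2_stat, df, p_value, alpha=0.05):
    """Format a result as: X2 (df) = value, p < alpha."""
    relation = "<" if p_value < alpha else ">="
    return f"X2 ({df}) = {chi2_stat:.2f}, p {relation} {alpha}"

# Hypothetical 3 x 2 table of counts, used only to check the df rule.
hypothetical = np.array([[12, 30], [25, 22], [40, 15]])
chi2_stat, p_value, dof, expected = chi2_contingency(hypothetical)

print(table_df(3, 2), dof)      # both are 2
print(report(99.56, 4, 0.001))  # the p value here is made up
```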


The research question is whether snuck, the irregular past tense of sneak, is an Americanism. In other

words, what is the effect of the variety of English on the use of snuck versus sneaked? The data in Table

4.3 come from the British National Corpus (BNC) and the Corpus of Contemporary American English

(COCA).1

Table 4.3. Frequencies of sneaked and snuck in COCA and the BNC.

           COCA   BNC
sneaked    869    125
snuck      896    10

As you can see, there are quite a few more cases of both variants of the past tense of sneak in

COCA. This is not surprising since COCA is about four and a half times as large as the BNC.

However, there is no need to compensate for the size difference between the corpora since we are

comparing each of the two extant past tenses across corpora. If, on the other hand, we were asking if

snuck is used more in COCA compared to the BNC (896 and 10 instances), that would be put in a 1 x 2

table, where COCA would surely have more instances given its larger size.2 Considering both variants

across corpora is preferred.

In any event, the Chi-square on these data comes out significant (Χ2 (1) = 94.50, p < .0005), but what exactly is significant? To answer this, it is important to compare the observed scores with the expected scores that the program calculates. (The expected scores in a two-dimensional table like this are not calculated by dividing the sum of the counts in the cells by the number of cells. I forgo explaining exactly how the calculations are made in order to keep my promise of keeping the math to the minimum. Let the computer do it for you.) In Table 4.4, large boldface numbers indicate observed counts that are larger than expected counts (sneaked in the BNC, snuck in COCA). Small fonts, on the other hand, show counts that are smaller than expected (snuck in the BNC, sneaked in COCA). Therefore, sneaked does appear to be significantly more British and snuck more American.

1 Corpus.byu.edu

2 In this case, the data from each corpus would need to be equalized to compensate for the differences in corpus size. One way to achieve this would be to compare how often each word occurs per million in each corpus. Since Chi-square requires counts, never proportions or percentages, the per million numbers 1.82 and .1 (COCA and BNC) can't be used. However, multiplying each by 10 and rounding to whole numbers yields 18 and 1, which could be compared in a Chi-square.

Table 4.4. Observed and expected frequencies of sneaked and snuck in COCA and the BNC.

           Observed        Expected
           COCA    BNC     COCA     BNC
sneaked    869     125     923.4    70.6
snuck      896     10      841.6    64.4
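For the curious, the expected counts in Table 4.4 can be reproduced with the standard rule for a two-dimensional table: each cell's expected count is its row total times its column total, divided by the grand total. The sketch below uses NumPy, which is an assumption of this example and not something the handout relies on.

```python
import numpy as np

# Observed counts: rows are sneaked, snuck; columns are COCA, BNC.
observed = np.array([[869, 125],
                     [896, 10]])

row_totals = observed.sum(axis=1, keepdims=True)  # 994 and 906
col_totals = observed.sum(axis=0, keepdims=True)  # 1765 and 135
grand_total = observed.sum()                      # 1900

# Expected count per cell = row total * column total / grand total
expected = row_totals * col_totals / grand_total

print(np.round(expected, 1))
```

For example, the expected count for sneaked in COCA is 994 x 1765 / 1900, or about 923.4.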

Chi-square answers the question: Do the observed counts differ significantly from what would

be expected by chance? However, it does not give any sense of which cells in the table were most

influential nor what the degree of association between the variables is. That is, it doesn't answer the

question: How strongly related are the uses of sneaked and snuck to British or American English? This

is where Cramer's V is helpful. It measures strength of relationship on a 0 to 1 scale where 0 indicates

no relationship and numbers close to 1 an extremely strong one. In the case of sneaked and snuck the

Cramer's V is fairly weak at .223, so the relationship between the different past tense forms and the

variety of English is not very strong.
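Cramer's V can be computed by hand from the chi-square value: divide Χ2 by N times the smaller of (rows - 1) and (columns - 1), then take the square root. The following sketch assumes SciPy and NumPy are installed. It turns off the continuity correction so that the Pearson chi-square is used, and it arrives at the value SPSS reports in Table 4.7.

```python
import math
import numpy as np
from scipy.stats import chi2_contingency

# Rows: sneaked, snuck; columns: COCA, BNC.
observed = np.array([[869, 125],
                     [896, 10]])

# correction=False gives the plain Pearson chi-square.
chi2_stat, p_value, dof, expected = chi2_contingency(observed, correction=False)

n = observed.sum()           # 1900 cases in total
k = min(observed.shape) - 1  # smaller of (rows - 1) and (columns - 1)
cramers_v = math.sqrt(chi2_stat / (n * k))

print(f"Chi-square ({dof}) = {chi2_stat:.3f}")
print(f"Cramer's V = {cramers_v:.3f}")
```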

Determining which cells account for most of the significance of the chi-square is a matter of

looking at the standardized residual for each cell in Table 4.5. Observations with standardized residuals that

are 1.96 or higher or -1.96 or lower fit this bill at the .05 level, while the more stringent +/-2.58

corresponds with the .01 level of significance. As far as sneaked and snuck are concerned, it is the

difference between those two past tense forms in the BNC (6.5, -6.8) that is most responsible for the


significant chi-square. The standardized residuals from COCA don't reach significance.
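The standardized residuals that SPSS labels Std. Residual can be sketched as (observed - expected) divided by the square root of expected. The example below assumes SciPy and NumPy are available. It uses the helper scipy.stats.contingency.expected_freq to obtain the expected counts, then flags cells at or beyond the 1.96 cutoff.

```python
import numpy as np
from scipy.stats.contingency import expected_freq

# Rows: BNC, COCA; columns: sneaked, snuck (same layout as Table 4.5).
observed = np.array([[125, 10],
                     [869, 896]])

expected = expected_freq(observed)

# Standardized residual for each cell
std_resid = (observed - expected) / np.sqrt(expected)

# Cells whose residual reaches the .05 cutoff of +/-1.96
significant = np.abs(std_resid) >= 1.96

print(np.round(std_resid, 1))
print(significant)
```

Only the two BNC cells pass the cutoff, which is the pattern described in the text.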


Using SPSS to Calculate a 2 x 2 Chi-square

Specify the dependent variable:

Click on Data > Weight Cases. Click on Weight Cases by, then on the CorpusCount

variable in the left-hand window. Click on the arrow to move it to the box under

Frequency Variable > OK.

Carry out the Chi square:

Click on Analyze > Descriptive Statistics > Crosstabs. Move Corpus into Row(s) and Answer

into Column(s). Click on Cells, check Observed, Expected, Standardized > Continue

(Figure 4.6).

Figure 4.6. Crosstabs dialog box.


Click on Statistics, choose Chi-square and Phi and Cramer's V > Continue > OK (Figure

4.7).

Figure 4.7. Crosstabs Statistics dialog box.


SPSS generates the data in Tables 4.5, 4.6 and 4.7:

Table 4.5. SPSS table of observed and expected frequencies for the sneaked/snuck study.

Corpus * Answer Crosstabulation

                                  Answer
                                  sneaked   snuck    Total
Corpus   BNC    Count             125       10       135
                Expected Count    70.6      64.4     135.0
                Std. Residual     6.5       -6.8
         COCA   Count             869       896      1765
                Expected Count    923.4     841.6    1765.0
                Std. Residual     -1.8      1.9
Total           Count             994       906      1900
                Expected Count    994.0     906.0    1900.0

Table 4.6. SPSS Chi-square output for the sneaked/snuck study.

Chi-Square Tests

                               Value       df   Asymp. Sig.   Exact Sig.   Exact Sig.
                                                (2-sided)     (2-sided)    (1-sided)
Pearson Chi-Square             94.503(a)   1    .000
Continuity Correction(b)       92.773      1    .000
Likelihood Ratio               112.191     1    .000
Fisher's Exact Test                                           .000         .000
Linear-by-Linear Association   94.453      1    .000
N of Valid Cases               1900

a. 0 cells (0.0%) have expected count less than 5. The minimum expected count is 64.37.

b. Computed only for a 2x2 table
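Three rows of Table 4.6 can be approximated outside SPSS. The sketch below assumes SciPy and NumPy are installed. It calls scipy.stats.chi2_contingency three ways: without the continuity correction (Pearson), with it (Yates), and with lambda_="log-likelihood" for the likelihood ratio. Small differences in the last decimal place compared with SPSS are possible.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Rows: BNC, COCA; columns: sneaked, snuck.
observed = np.array([[125, 10],
                     [869, 896]])

# Pearson chi-square (no continuity correction)
pearson, p_pearson, dof, expected = chi2_contingency(observed, correction=False)

# Chi-square with Yates' continuity correction (applies to 2 x 2 tables)
yates, p_yates, _, _ = chi2_contingency(observed, correction=True)

# Likelihood ratio statistic
lik_ratio, p_lr, _, _ = chi2_contingency(
    observed, correction=False, lambda_="log-likelihood"
)

print(f"Pearson Chi-Square:    {pearson:.3f} (df = {dof})")
print(f"Continuity Correction: {yates:.3f}")
print(f"Likelihood Ratio:      {lik_ratio:.3f}")
print(f"Minimum expected count: {expected.min():.2f}")
```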


Table 4.7. SPSS Chi-square measures of association for the sneaked/snuck study.

Symmetric Measures

                                   Value   Approx. Sig.
Nominal by Nominal   Phi           .223    .000
                     Cramer's V    .223    .000
N of Valid Cases                   1900

Assumptions of Chi-square

Since Chi-square is not a parametric statistic, the data do not need to follow a normal distribution. However, the observations must be independent of each other. In an experiment this would mean that one person can only give one answer. In the sneaked/snuck study this means that each instance can only come from one corpus, not both. Studies that test people at two different points in time shouldn't be analyzed with Chi-square because they would violate independence. The numbers in a Chi-square analysis must also be whole number counts, never percentages or proportions, or you will have your hand slapped. Furthermore, the expected value in each cell must be at least five.

What if the expected value is less than five? Look back at the bottom of Table 4.6. For that particular Chi-square it indicates that 0 cells (0.0%) have expected count less than 5, so we are safe using Chi-square. However, if there were cells with fewer than five expected counts, we could use a FISHER'S EXACT TEST instead. The p value for Fisher's test appears in Table 4.6 as .000 (read as p < .0005). Note that SPSS gives Fisher's test neither a test value nor degrees of freedom, just a p value.
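A Fisher's exact test on the same 2 x 2 table can be sketched in Python with scipy.stats.fisher_exact, assuming SciPy is installed. As in SPSS, the result that matters is the p value. SciPy also returns an odds ratio, which the handout does not discuss.

```python
from scipy.stats import fisher_exact

# Rows: BNC, COCA; columns: sneaked, snuck.
table = [[125, 10],
         [869, 896]]

# Two-sided test, corresponding to the Exact Sig. (2-sided) column in SPSS
odds_ratio, p_value = fisher_exact(table, alternative="two-sided")

print(f"p below .0005? {p_value < 0.0005}")
```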


Hands on exercises for Chi square

Judeo Spanish Sibilant voicing

At an earlier stage in its history, Spanish distinguished between /s/ and /z/ in words such as

[kaza] house and [kasa] hunt. Later, [s] and [z] merged into [s] making house and hunt homophones:

[kasa]. However, in Judeo Spanish these two phones are thought to remain distinct. To test this

hypothesis Bradley and Delforge (2006) recorded a Judeo Spanish speaker saying 72 words that

historically had [s], and another 72 that historically had [z]. They analyzed the words and categorized

the pronunciations as either voiced, partially voiced, or voiceless. Their results appear in Table 4.8.

Table 4.8. Results of Bradley and Delforge's (2006) study on sibilant voicing in a Judeo Spanish

speaker.

                                      Historically voiceless /s/   Historically voiced /z/
# of voiceless realizations           49                           2
# of partially voiced realizations    6                            6
# of fully voiced realizations        17                           64

1 Enter the data in Table 4.8 into SPSS. These data result in a 2 x 3 Chi-square.

2 Follow the above instructions for performing a 2 x 2 analysis and perform a Chi-square in

SPSS.

3 Examine the observed counts of voiced, partially voiced, and voiceless pronunciations.

Compare them with the expected counts SPSS calculates. How do they help prove or disprove

Bradley and Delforge's hypothesis?

4 Are the results significant?

5 What does the Cramer's V statistic tell you?


6 What do the standardized residuals tell you?

7 How would you report the results of this Chi-square in standard format?

8 Does this analysis meet the assumptions of Chi-square?

/r/ to /R/ in Canadian French

Canadian French is undergoing a change from /r/ to /R/ (Sankoff and Blondeau 2007). Speakers

who retain [r] are considered traditionalist speakers, while speakers who use /R/ are early adopters.

Sankoff and Blondeau claim that “eight of ten early adopters are female; and seven of ten traditionalists are male, a

statistically significant difference” (571).

1 Enter these data into SPSS.

2 What is the research question the data intend to answer?

3 What statistical analysis is appropriate for these data? Why?

4 Are the results significant?

5 What do Cramer's V and the standardized residuals tell you?

6 How would you report the results of this analysis?