HUDM4122 Probability and Statistical Inference April 27, 2015

HW10

Problem 1

• What is the critical value of t (e.g. p = 0.05 for a two-tailed test), for N = 25?

• Correct answer: 2.06
– Let’s look at how to find this

• Common wrong answer: 1.71
– That’s a one-tailed test
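One way to find these critical values without a printed table is to invert the t distribution numerically. The sketch below is only an illustration, assuming df = N − 1 = 24; the helper names (`t_pdf`, `t_cdf`, `t_critical`) are made up for this example, and it uses Simpson's rule plus bisection rather than any particular statistics library.

```python
import math

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x), using Simpson's rule on [0, |x|] and symmetry around 0."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_critical(alpha, df, two_tailed=True):
    """Critical t found by bisection (search range 0..100 is an assumption)."""
    target = 1 - alpha / 2 if two_tailed else 1 - alpha
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

n = 25
print(round(t_critical(0.05, n - 1), 2))                    # two-tailed: 2.06
print(round(t_critical(0.05, n - 1, two_tailed=False), 2))  # one-tailed: 1.71
```

The only difference between the two answers is whether α = 0.05 is split across both tails or placed entirely in one.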

Problem 3

• Chico is interested in how high stuffed animals can jump. A recent article in Stuffed Animal Quarterly suggests that the average stuffed animal can jump 4 inches, and that the standard deviation is 4 inches as well. Chico asks his 9 favorite stuffed animals to jump. He finds that they jump an average of 5 inches.

• Are the stuffed animals jumping statistically significantly higher than the average printed in Stuffed Animal Quarterly? Enter the p value. (Conduct a two-tailed t-test)
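The first step toward the p value is the t statistic itself. The sketch below assumes the stated standard deviation of 4 inches is used as the sample standard deviation; `one_sample_t` is a hypothetical helper name. The p value would then be looked up in a t table (or computed in software) for the resulting df.

```python
import math

def one_sample_t(sample_mean, null_mean, sd, n):
    """Return (t, df) for a one-sample t-test from summary statistics."""
    standard_error = sd / math.sqrt(n)
    return (sample_mean - null_mean) / standard_error, n - 1

t_stat, df = one_sample_t(sample_mean=5, null_mean=4, sd=4, n=9)
print(f"t({df}) = {t_stat:.2f}")  # t(8) = 0.75
```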

Problem 4

• Chico is interested in how high stuffed animals can jump. A recent article in Stuffed Animal Quarterly suggests that the average stuffed animal can jump 4 inches, and that the standard deviation is 4 inches as well. Chico asks his 9 favorite stuffed animals to jump. He finds that they jump an average of 5 inches.

• What is the lower bound on the 95% Confidence Interval? (Use the t distribution.)
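The interval is the sample mean plus or minus the critical t times the standard error. A minimal sketch follows, assuming the two-tailed 95% critical value for df = 8 is read from a standard t table (about 2.306); `t_confidence_interval` is a hypothetical helper name.

```python
import math

T_CRIT_DF8 = 2.306  # two-tailed 95% critical t for df = 8, taken from a t table

def t_confidence_interval(mean, sd, n, t_crit):
    """Return (lower, upper) bounds: mean +/- t_crit * sd / sqrt(n)."""
    margin = t_crit * sd / math.sqrt(n)
    return mean - margin, mean + margin

lower, upper = t_confidence_interval(mean=5, sd=4, n=9, t_crit=T_CRIT_DF8)
print(f"{lower:.2f} to {upper:.2f}")
```

The same function applies to the other confidence interval problems once the critical value for the matching df is substituted.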

You Try It!

• Chico is interested in how high action figures can jump. A recent article in Action Figure Quarterly suggests that the average action figure can jump 2 inches, and that the standard deviation is 3 inches. Chico asks his 7 favorite action figures to jump. He finds that they jump an average of 4 inches.

• What is the 95% Confidence Interval? (Use the t distribution.)

Problem 5

8 students with a specific behavioral disorder participate in an intervention designed for their needs, and are observed afterwards. The clinical observation scale goes from 0 to 10, with any score below 3 considered evidence for appropriate behavior. The average clinical observation score in your sample is 1.2, with a standard deviation of 2 points. What is the upper bound on the 95% Confidence Interval? (Give two digits after the decimal)

Lots of negative answers; is this plausible?

Problem 7

• You’re comparing the difference between Bob's Discount Math Curriculum and SaxonMath

• Bob's: average grade = 58, standard deviation = 7, sample size = 15
• SaxonMath: average grade = 62, standard deviation = 10.5, sample size = 20

Compute a two-tailed t-test to find out whether the difference between curricula is statistically significant. Assume pooled variance.
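With pooled variance, the two sample variances are averaged (weighted by their degrees of freedom) before computing the standard error. The sketch below shows that calculation for the numbers above; `pooled_t` is a hypothetical helper name, and the p value would still come from a t table or software for the resulting df.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df) for a two-sample t-test assuming equal variances."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df

# Bob's vs. SaxonMath
t_stat, df = pooled_t(58, 7, 15, 62, 10.5, 20)
print(f"t({df}) = {t_stat:.2f}")
```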

Problem 9

• You’re comparing the difference between Bob's Discount Math Curriculum and JuteMath.

• Bob's: average grade = 58, standard deviation = 7, sample size = 15
• JuteMath: average grade = 62, standard deviation = 16, sample size = 20

Compute a two-tailed t-test to find out whether the difference between curricula is statistically significant. Assume unpooled variance. (Give two digits after the decimal, rounded)
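With unpooled variance, each group contributes its own variance to the standard error. The sketch below uses the Welch-Satterthwaite approximation for the degrees of freedom; that choice is an assumption here, since the course may use a different df convention. `welch_t` is a hypothetical helper name.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df) for a two-sample t-test without pooling variances."""
    v1, v2 = sd1**2 / n1, sd2**2 / n2
    se = math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation (assumed; other conventions exist)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return (mean1 - mean2) / se, df

# Bob's vs. JuteMath
t_stat, df = welch_t(58, 7, 15, 62, 16, 20)
print(f"t = {t_stat:.2f}, df = {df:.1f}")
```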

You Try It!

• You’re comparing the difference between Bob's Discount Math Curriculum and PictMath.

• Bob's: average grade = 54, standard deviation = 8, sample size = 12
• PictMath: average grade = 60, standard deviation = 12, sample size = 18

Compute a two-tailed t-test to find out whether the difference between curricula is statistically significant.

Problem 10

• You're comparing students' scores on the midterm and the final, to see if students did significantly worse on the final than the midterm. Compute a two-tailed paired t-test to answer this question. Give two digits after the decimal, rounded.

Midterm   Final
0.72      0.72
0.69      0.67
0.65      0.66
0.73      0.68
0.95      0.92
0.88      0.88
0.62      0.72
0.78      0.71

You Try It!

• You're comparing students' scores on the midterm and the final, to see if students did significantly worse on the final than the midterm. Compute a two-tailed paired t-test to answer this question. Give two digits after the decimal, rounded.

Midterm   Final
0.7       0.6
0.7       0.6
0.6       0.7
0.8       0.7
1.0       0.9
0.9       0.8
0.6       0.6
0.8       0.8

t(7) = 1.87, p = 0.10
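A paired t-test is a one-sample t-test on the per-student differences. The sketch below reproduces the t statistic for the You Try It data using only the standard library; `paired_t` is a hypothetical helper name, and the p value is left to a t table or software.

```python
import math
import statistics

midterm = [0.7, 0.7, 0.6, 0.8, 1.0, 0.9, 0.6, 0.8]
final = [0.6, 0.6, 0.7, 0.7, 0.9, 0.8, 0.6, 0.8]

def paired_t(before, after):
    """Return (t, df) for a paired t-test on before - after differences."""
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)  # sample SD of the differences
    return statistics.mean(diffs) / se, n - 1

t_stat, df = paired_t(midterm, final)
print(f"t({df}) = {t_stat:.2f}")  # t(7) = 1.87
```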

Questions? Comments?

Chi-squared (χ²) distribution

So far…

• We have largely talked about
– Comparing quantitative variables
– Is a mean different than 0 (or another criterion value)
  • Does a curriculum lead to learning?
– Are means different for two samples?
  • Does curriculum A lead to more learning than curriculum B?
– Are means different for two variables from the same sample?
  • Do individual learners do better on the pre-test than the post-test?

We have also

• Talked about comparing proportions

But what if…

• We want to compare two groups, in terms of a categorical variable?

Example

• One group of students uses Singapore Math
• Another group of students uses Bob’s Discount Math Curriculum

• The prevalence of different affective states is measured using BROMP field observations

• We compare this using a two-way table

We find

Affective State   Singapore Math   BDMC
BORED             2                5
FRUSTRATED        9                14
ENGAGED           20               12
CONFUSED          6                9
DELIGHTED         13               10

We want to know

• Is affect significantly different between Singapore Math and BDMC?

• $H_0$: There is no difference in the proportions of each affective state, between the two groups

• $H_a$: There is some difference in the proportions of each affective state, between the two groups

How can we test this?

• We compare the actual counts in the table

• To the counts we would expect to see in each cell, if $H_0$ were true

We can compute this

• $E_{i,j} = \frac{(\text{row } i \text{ total}) \times (\text{column } j \text{ total})}{n}$
• Or as the book writes it

• $E_{i,j} = \frac{r_i \times c_j}{n}$

Why we treat this as expected value

• If there really was no difference between groups

• Then the overall percentage of cases where the student is bored will be the same between groups

Why we treat this as expected value

• So we can take the percentage of cases where the student is bored: $\frac{r_i}{n}$ (row total over all cases)

• Multiplied by the percentage of cases that are in the group overall: $\frac{c_j}{n}$ (column total over all cases)

• Multiplied by the total number of cases $n$

• And that’s the number of cases we would expect the group to be bored

$E_{i,j} = n \times \frac{r_i}{n} \times \frac{c_j}{n} = \frac{r_i \times c_j}{n}$

Example

Affective State   Singapore Math   BDMC   Row Total
BORED             2                5      7
FRUSTRATED        9                14     23
ENGAGED           20               12     32
CONFUSED          6                9      15
DELIGHTED         13               10     23
Column Total      50               50     100

Actual

Affective State   Singapore Math   BDMC   Row Total
BORED             2                5      7
FRUSTRATED        9                14     23
ENGAGED           20               12     32
CONFUSED          6                9      15
DELIGHTED         13               10     23
Column Total      50               50     100

Expected

BORED-SM Expected = $\frac{7 \times 50}{100}$ = 3.5

BORED-BDMC Expected = $\frac{7 \times 50}{100}$ = 3.5

FRUSTRATED Expected = $\frac{23 \times 50}{100}$ = 11.5

ENGAGED? CONFUSED? DELIGHTED? You Try It.

Affective State   Singapore Math   BDMC   Row Total
BORED             3.5              3.5    7
FRUSTRATED        11.5             11.5   23
ENGAGED           16               16     32
CONFUSED          7.5              7.5    15
DELIGHTED         11.5             11.5   23
Column Total      50               50     100

Now we can compare the two tables
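The expected table can also be generated directly from the observed counts, using row total times column total divided by n for every cell. The following is a minimal sketch; the function name `expected_counts` and the dictionary layout are choices made for this illustration.

```python
observed = {
    "BORED": (2, 5),
    "FRUSTRATED": (9, 14),
    "ENGAGED": (20, 12),
    "CONFUSED": (6, 9),
    "DELIGHTED": (13, 10),
}

def expected_counts(table):
    """Expected count for each cell: row total * column total / n."""
    rows = list(table.values())
    col_totals = [sum(col) for col in zip(*rows)]
    n = sum(col_totals)
    return {
        label: tuple(sum(row) * c / n for c in col_totals)
        for label, row in table.items()
    }

expected = expected_counts(observed)
for state, counts in expected.items():
    print(state, counts)
```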

Comparing the observed and expected counts

• $\chi^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$

$\chi^2 = \frac{(2 - 3.5)^2}{3.5} + \frac{(5 - 3.5)^2}{3.5} + \frac{(9 - 11.5)^2}{11.5} + \dots$

$\chi^2 = \frac{2.25}{3.5} + \frac{2.25}{3.5} + \frac{6.25}{11.5} + \dots$

$\chi^2 = 5.36$
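The full sum has ten terms, one per cell. As a check on the arithmetic, the sketch below adds them all up from the actual and expected tables; `chi_squared` is a hypothetical helper name.

```python
observed = [[2, 5], [9, 14], [20, 12], [6, 9], [13, 10]]
expected = [[3.5, 3.5], [11.5, 11.5], [16, 16], [7.5, 7.5], [11.5, 11.5]]

def chi_squared(obs, exp):
    """Sum of (observed - expected)^2 / expected over every cell."""
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(obs, exp)
        for o, e in zip(obs_row, exp_row)
    )

chi2 = chi_squared(observed, expected)
print(round(chi2, 2))  # 5.36
```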

How is χ² distributed?

• It can’t be a Z or t distribution…

• Because all the values are squared, and greater than 0

For v degrees of freedom

Image from David Sabo’s webpage, http://commons.bcit.ca/math/faculty/david_sabo/apples/math2441/section8/onevariance/chisqtable/chisqtable.htm

When df >= 30

Distribution is approximated by Z

Image from David Sabo’s webpage, http://commons.bcit.ca/math/faculty/david_sabo/apples/math2441/section8/onevariance/chisqtable/chisqtable.htm

Almost always used one-tail

Image from Philip Ender’s webpage, http://www.philender.com/courses/intro/notes3/chi.html

Messy to calculate two-tailed: asymmetric

df

• Not calculated as a function of n!

• Instead, calculated in terms of number of rows and columns

• $df = (r - 1)(c - 1)$

df: for this case

• r = 5
• c = 2

• df = (5-1)(2-1)
• df = 4

Affective State   Singapore Math   BDMC   Row Total
BORED             3.5              3.5    7
FRUSTRATED        11.5             11.5   23
ENGAGED           16               16     32
CONFUSED          7.5              7.5    15
DELIGHTED         11.5             11.5   23
Column Total      50               50     100

Use χ² table

• Or =CHIDIST in Excel

• In this case, =CHIDIST(5.36, 4)

• Which gives p = 0.25

• So BDMC and Singapore Math are not significantly different in terms of affect

• Written χ²(df = 4, N = 100) = 5.36, p = 0.25
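Outside Excel, the same right-tail probability can be computed by hand when df is even, because the chi-squared tail then has a closed form: $e^{-x/2} \sum_{i=0}^{df/2 - 1} \frac{(x/2)^i}{i!}$. The sketch below uses that formula; `chi2_sf_even_df` is a hypothetical helper name, and the restriction to even df is a limitation of this illustration, not of the test.

```python
import math

def chi2_sf_even_df(x, df):
    """P(chi-squared with df degrees of freedom >= x), for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("this closed form only covers positive even df")
    half = x / 2
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i   # builds (x/2)^i / i! incrementally
        total += term
    return math.exp(-half) * total

rows, cols = 5, 2
df = (rows - 1) * (cols - 1)
p = chi2_sf_even_df(5.36, df)
print(df, round(p, 2))  # 4 0.25
```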

Comments? Questions?

Note

• Columns do not have to have the same total value

You Try It!

Preferred Movie          Teenage Females   Teenage Males   Babies
Racecar Explosions       17                44              3
Horse Princess Diaries   41                13              1
Shiny Things             0                 2               71
Conan the Librarian      4                 12              2

χ²(df = 6, N = 210) = 224, p < 0.001
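When the column totals differ, each cell's expected count has to be computed from its own row and column totals. The sketch below runs the whole procedure on the movie table; `chi_squared_test` is a hypothetical helper name. Note that the cell counts in the table sum to 210.

```python
def chi_squared_test(table):
    """Return (chi2, df, n) for a two-way table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for row, r_total in zip(table, row_totals):
        for observed, c_total in zip(row, col_totals):
            expected = r_total * c_total / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df, n

movies = [
    [17, 44, 3],   # Racecar Explosions
    [41, 13, 1],   # Horse Princess Diaries
    [0, 2, 71],    # Shiny Things
    [4, 12, 2],    # Conan the Librarian
]
chi2, df, n = chi_squared_test(movies)
print(round(chi2), df, n)  # 224 6 210
```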

How do we know which category is different?

• After you do the overall χ² test

• You can then look at individual categories versus all other categories, and do χ² again

• But, warning – you have to do a post-hoc adjustment of α

• Out of scope for this class; see Chapter 5, Video 1 of Big Data and Education, http://www.columbia.edu/~rsb2162/bigdataeducation.html

When you can use χ²

• Not usable for very small amounts of data

• Magic number
– All expected cell counts should be > 5
– (Not everyone uses the same magic number here)

• If a cell count is under 5
– You can try combining columns or rows

Example: What is your favorite sport?

Sport         America   Ireland   Total
Baseball      25        3         28
Football      42        2         44
Soccer        19        43        62
Rugby         6         29        35
Hurling       0         31        31
Tiddlywinks   3         0         3
Calvinball    3         5         8
Total         98        113       211

Some numbers too small!

$\frac{98 \times 3}{211} = 1.39$
$\frac{98 \times 8}{211} = 3.72$
$\frac{113 \times 3}{211} = 1.61$
$\frac{113 \times 8}{211} = 4.28$

Solution: Combine Two Rows

Sport                            America   Ireland   Total
Baseball                         25        3         28
Football                         42        2         44
Soccer                           19        43        62
Rugby                            6         29        35
Hurling                          0         31        31
Sports Ryan Has Never Heard of   6         5         11
Total                            98        113       211

$\frac{98 \times 11}{211} = 5.11$
$\frac{113 \times 11}{211} = 5.89$
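The check and the fix can both be automated. The sketch below flags rows with any expected count not above 5 and then merges them into one row; the function names and the cutoff of 5 (the "magic number" from the earlier slide) are assumptions for this illustration.

```python
sports = {
    "Baseball": (25, 3),
    "Football": (42, 2),
    "Soccer": (19, 43),
    "Rugby": (6, 29),
    "Hurling": (0, 31),
    "Tiddlywinks": (3, 0),
    "Calvinball": (3, 5),
}

def small_expected_rows(table, minimum=5):
    """Return labels of rows with any expected count not above `minimum`."""
    col_totals = [sum(col) for col in zip(*table.values())]
    n = sum(col_totals)
    return [
        label
        for label, row in table.items()
        if any(sum(row) * c / n <= minimum for c in col_totals)
    ]

def combine_rows(table, labels, new_label):
    """Merge the named rows into one row by adding their counts."""
    merged = {k: v for k, v in table.items() if k not in labels}
    merged[new_label] = tuple(
        sum(cells) for cells in zip(*(table[k] for k in labels))
    )
    return merged

too_small = small_expected_rows(sports)
combined = combine_rows(sports, too_small, "Sports Ryan Has Never Heard of")
print(too_small)                      # rows flagged before combining
print(small_expected_rows(combined))  # rows still flagged afterwards
```

Combining rows keeps the column totals and n unchanged, so only the merged row's expected counts need to be rechecked.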

Questions? Comments?

Many other uses of χ²

• Comparing tables to tables
– Looking at changes over time

• Testing whether data is consistent with a specific distribution

• Computing the confidence interval of a standard deviation
– This is discussed in the book in chapter 10.6
– It’s rare, and a bit conceptually messy
– It’s rare in part because it provides asymmetric confidence intervals

Final questions for the day?

Review sessions

• Please fill out doodle link by Friday
• I will set up review sessions Friday

• We will not be able to use class time for a review session
– If anyone can’t make either review session, we can set up a separate meeting

Upcoming Classes

• 4/29 F test
– HW 11 due

• 5/4 Test assumptions

• 5/6 ANOVA
– HW 12 due

• 5/11 FINAL EXAM