
Slide 1 © 2005 Thomson/South-Western

Slides Prepared by JOHN S. LOUCKS

St. Edward’s University

Slide 2

Chapter 12
Tests of Goodness of Fit and Independence

Goodness of Fit Test: A Multinomial Population

Goodness of Fit Test: Poisson and Normal Distributions

Test of Independence

Slide 3

Hypothesis (Goodness of Fit) Test for Proportions of a Multinomial Population

1. Set up the null and alternative hypotheses.

2. Select a random sample and record the observed frequency, f_i, for each of the k categories.

3. Assuming H0 is true, compute the expected frequency, e_i, in each category by multiplying the category probability by the sample size.


Slide 4

Hypothesis (Goodness of Fit) Test for Proportions of a Multinomial Population

4. Compute the value of the test statistic.

$$\chi^2 = \sum_{i=1}^{k} \frac{(f_i - e_i)^2}{e_i}$$

where:

f_i = observed frequency for category i
e_i = expected frequency for category i
k = number of categories

Note: The test statistic has a chi-square distribution with k - 1 df provided that the expected frequencies are 5 or more for all categories.

Slide 5

Hypothesis (Goodness of Fit) Test for Proportions of a Multinomial Population

5. Rejection rule:

p-value approach: Reject H0 if p-value < α

Critical value approach: Reject H0 if $\chi^2 \geq \chi^2_\alpha$

where α is the significance level and there are k - 1 degrees of freedom

Slide 6

Multinomial Distribution Goodness of Fit Test

Example: Finger Lakes Homes (A)

Finger Lakes Homes manufactures four models of prefabricated homes: a two-story colonial, a log cabin, a split-level, and an A-frame. To help in production planning, management would like to determine if previous customer purchases indicate that there is a preference in the style selected.


Slide 7

Multinomial Distribution Goodness of Fit Test

Example: Finger Lakes Homes (A)

The number of homes sold of each model for 100 sales over the past two years is shown below.

Model     Colonial   Log   Split-Level   A-Frame
# Sold       30       20       35           15

Slide 8

Multinomial Distribution Goodness of Fit Test

Hypotheses

H0: pC = pL = pS = pA = .25
Ha: The population proportions are not pC = .25, pL = .25, pS = .25, and pA = .25

where:

pC = population proportion that purchase a colonial
pL = population proportion that purchase a log cabin
pS = population proportion that purchase a split-level
pA = population proportion that purchase an A-frame

Slide 9

Multinomial Distribution Goodness of Fit Test

Rejection Rule

With α = .05 and k - 1 = 4 - 1 = 3 degrees of freedom:

Reject H0 if p-value < .05 or χ² > 7.815.

[Figure: chi-square distribution with the critical value 7.815 separating the "Do Not Reject H0" region from the "Reject H0" region in the upper tail.]


Slide 10

Multinomial Distribution Goodness of Fit Test

Expected Frequencies

e1 = .25(100) = 25    e2 = .25(100) = 25
e3 = .25(100) = 25    e4 = .25(100) = 25

Test Statistic

$$\chi^2 = \frac{(30-25)^2}{25} + \frac{(20-25)^2}{25} + \frac{(35-25)^2}{25} + \frac{(15-25)^2}{25} = 1 + 1 + 4 + 4 = 10$$
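The arithmetic on this slide can be reproduced with a few lines of Python. This is a minimal sketch; the helper name `chi_square_statistic` is an illustrative choice and does not come from the slides.

```python
def chi_square_statistic(observed, expected):
    """Sum of (f_i - e_i)^2 / e_i over all categories."""
    return sum((f - e) ** 2 / e for f, e in zip(observed, expected))

# Finger Lakes Homes (A): colonial, log cabin, split-level, A-frame
observed = [30, 20, 35, 15]
n = sum(observed)                       # 100 sales
expected = [0.25 * n] * len(observed)   # H0: each style has probability .25

chi2 = chi_square_statistic(observed, expected)
df = len(observed) - 1                  # k - 1 degrees of freedom
print(chi2, df)
```

The same helper works for any set of hypothesized proportions, as long as every expected frequency is 5 or more.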

Slide 11

Multinomial Distribution Goodness of Fit Test

Conclusion Using the p-Value Approach

Area in Upper Tail     .10     .05     .025    .01     .005
χ² Value (df = 3)     6.251   7.815   9.348  11.345  12.838

Because χ² = 10 is between 9.348 and 11.345, the area in the upper tail of the distribution is between .025 and .01.

The p-value < α. We can reject the null hypothesis.
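The table only brackets the p-value. For the special case of 3 degrees of freedom the upper-tail area has a closed form, erfc(√(x/2)) + √(2x/π)·e^(−x/2), so it can be computed directly. The sketch below assumes exactly 3 d.f. and is not a general chi-square routine; the function name is hypothetical.

```python
import math

def chi2_upper_tail_df3(x):
    """Upper-tail area P(chi-square > x), valid only for 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

p_value = chi2_upper_tail_df3(10)
print(round(p_value, 4))   # falls between .01 and .025, as the table indicates
```

For other degrees of freedom a statistics library or a chi-square table would be needed instead.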

Slide 12

Multinomial Distribution Goodness of Fit Test

Conclusion Using the Critical Value Approach

χ² = 10 > 7.815

We reject, at the .05 level of significance, the assumption that there is no home style preference.


Slide 13

Test of Independence: Contingency Tables

1. Set up the null and alternative hypotheses.

2. Select a random sample and record the observed frequency, f_ij, for each cell of the contingency table.

3. Compute the expected frequency, e_ij, for each cell.

$$e_{ij} = \frac{(\text{Row } i \text{ Total})(\text{Column } j \text{ Total})}{\text{Sample Size}}$$

Slide 14

Test of Independence: Contingency Tables

4. Compute the test statistic.

$$\chi^2 = \sum_i \sum_j \frac{(f_{ij} - e_{ij})^2}{e_{ij}}$$

5. Determine the rejection rule.

Reject H0 if p-value < α or $\chi^2 \geq \chi^2_\alpha$

where α is the significance level and, with n rows and m columns, there are (n - 1)(m - 1) degrees of freedom.

Slide 15

Contingency Table (Independence) Test

Example: Finger Lakes Homes (B)

Each home sold by Finger Lakes Homes can be classified according to price and to style. Finger Lakes’ manager would like to determine if the price of the home and the style of the home are independent variables.


Slide 16

Contingency Table (Independence) Test

Example: Finger Lakes Homes (B)

The number of homes sold for each model and price for the past two years is shown below. For convenience, the price of the home is listed as either $99,000 or less or more than $99,000.

Price        Colonial   Log   Split-Level   A-Frame
> $99,000       12       14       16           3
≤ $99,000       18        6       19          12

Slide 17

Contingency Table (Independence) Test

Hypotheses

H0: Price of the home is independent of the style of the home that is purchased

Ha: Price of the home is not independent of the style of the home that is purchased

Slide 18

Contingency Table (Independence) Test

Expected Frequencies

Observed frequencies with the row and column totals used to compute the expected frequencies:

Price     Colonial   Log   Split-Level   A-Frame   Total
> $99K       12       14       16           3        45
≤ $99K       18        6       19          12        55
Total        30       20       35          15       100
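The expected frequency for each cell is (row total)(column total)/(sample size). The sketch below applies that rule to the table above; the dictionary layout and key names are illustrative assumptions, and the row labels follow the earlier data slide (the row with total 55 is the $99,000-or-less row).

```python
# Observed counts; columns are colonial, log, split-level, A-frame.
observed = {
    "> $99K": [12, 14, 16, 3],
    "<= $99K": [18, 6, 19, 12],
}

row_totals = {price: sum(row) for price, row in observed.items()}
col_totals = [sum(col) for col in zip(*observed.values())]
n = sum(row_totals.values())

# e_ij = (row i total)(column j total) / sample size
expected = {
    price: [row_totals[price] * c / n for c in col_totals]
    for price in observed
}

for price, row in expected.items():
    print(price, row)
```

Every expected frequency comes out above 5, so the chi-square approximation is reasonable for this table.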


Slide 19

Contingency Table (Independence) Test

Rejection Rule

With α = .05 and (2 - 1)(4 - 1) = 3 d.f., $\chi^2_{.05} = 7.815$

Reject H0 if p-value < .05 or χ² > 7.815

Test Statistic

$$\chi^2 = \frac{(18-16.5)^2}{16.5} + \frac{(6-11)^2}{11} + \cdots + \frac{(3-6.75)^2}{6.75} = .1364 + 2.2727 + \cdots + 2.0833 = 9.149$$
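The full sum over all eight cells, which the slide abbreviates with ellipses, can be checked as follows. This is a sketch using plain nested lists; the variable names are illustrative.

```python
observed = [
    [18, 6, 19, 12],   # $99,000 or less
    [12, 14, 16, 3],   # more than $99,000
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, f in enumerate(row):
        e = row_totals[i] * col_totals[j] / n   # expected frequency e_ij
        chi2 += (f - e) ** 2 / e

df = (len(observed) - 1) * (len(col_totals) - 1)
print(round(chi2, 3), df)
```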

Slide 20

Contingency Table (Independence) Test

Conclusion Using the p-Value Approach

Area in Upper Tail     .10     .05     .025    .01     .005
χ² Value (df = 3)     6.251   7.815   9.348  11.345  12.838

Because χ² = 9.149 is between 7.815 and 9.348, the area in the upper tail of the distribution is between .05 and .025.

The p-value < α. We can reject the null hypothesis.

Slide 21

Contingency Table (Independence) Test

Conclusion Using the Critical Value Approach

χ² = 9.149 > 7.815

We reject, at the .05 level of significance, the assumption that the price of the home is independent of the style of home that is purchased.


Slide 22

Goodness of Fit Test: Poisson Distribution

1. Set up the null and alternative hypotheses.
   H0: Population has a Poisson probability distribution
   Ha: Population does not have a Poisson distribution

2. Select a random sample and
   a. Record the observed frequency f_i for each value of the Poisson random variable.
   b. Compute the mean number of occurrences µ.

3. Compute the expected frequency of occurrences e_i for each value of the Poisson random variable.

Slide 23

Goodness of Fit Test: Poisson Distribution

4. Compute the value of the test statistic.

$$\chi^2 = \sum_{i=1}^{k} \frac{(f_i - e_i)^2}{e_i}$$

where:

f_i = observed frequency for category i
e_i = expected frequency for category i
k = number of categories

Slide 24

Goodness of Fit Test: Poisson Distribution

5. Rejection rule:

p-value approach: Reject H0 if p-value < α

Critical value approach: Reject H0 if $\chi^2 \geq \chi^2_\alpha$

where α is the significance level and there are k - 2 degrees of freedom


Slide 25

Goodness of Fit Test: Poisson Distribution

Example: Troy Parking Garage

In studying the need for an additional entrance to a city parking garage, a consultant has recommended an analysis approach that is applicable only in situations where the number of cars entering during a specified time period follows a Poisson distribution.

Slide 26

Goodness of Fit Test: Poisson Distribution

Example: Troy Parking Garage

A random sample of 100 one-minute time intervals resulted in the customer arrivals listed below. A statistical test must be conducted to see if the assumption of a Poisson distribution is reasonable.

# Arrivals    0   1   2   3   4   5   6   7   8   9  10  11  12
Frequency     0   1   4  10  14  20  12  12   9   8   6   3   1

Slide 27

Goodness of Fit Test: Poisson Distribution

Hypotheses

H0: Number of cars entering the garage during a one-minute interval is Poisson distributed

Ha: Number of cars entering the garage during a one-minute interval is not Poisson distributed


Slide 28

Goodness of Fit Test: Poisson Distribution

Estimate of Poisson Probability Function

Total Arrivals = 0(0) + 1(1) + 2(4) + . . . + 12(1) = 600
Total Time Periods = 100
Estimate of µ = 600/100 = 6

Hence,

$$f(x) = \frac{6^x e^{-6}}{x!}$$
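The estimate of µ is a frequency-weighted mean of the arrival counts. A minimal sketch of that calculation, with illustrative variable names:

```python
arrivals = list(range(13))                                   # 0, 1, ..., 12 cars per minute
frequency = [0, 1, 4, 10, 14, 20, 12, 12, 9, 8, 6, 3, 1]     # number of intervals with that count

total_periods = sum(frequency)
total_arrivals = sum(x * f for x, f in zip(arrivals, frequency))
mu_hat = total_arrivals / total_periods

print(total_arrivals, total_periods, mu_hat)
```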

Slide 29

Goodness of Fit Test: Poisson Distribution

Expected Frequencies

  x      f(x)     nf(x)
  0     .0025      .25
  1     .0149     1.49
  2     .0446     4.46
  3     .0892     8.92
  4     .1339    13.39
  5     .1606    16.06
  6     .1606    16.06
  7     .1377    13.77
  8     .1033    10.33
  9     .0688     6.88
 10     .0413     4.13
 11     .0225     2.25
 12+    .0201     2.01
Total  1.0000   100.00
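The f(x) and nf(x) columns follow from the Poisson probability function with µ = 6 and n = 100. The sketch below regenerates them; the helper name `poisson_pmf` is an illustrative choice, and the last category is taken as the remaining probability ("12 or more").

```python
import math

def poisson_pmf(x, mu):
    """Poisson probability f(x) = mu^x e^(-mu) / x!."""
    return mu ** x * math.exp(-mu) / math.factorial(x)

mu, n = 6, 100
probs = [poisson_pmf(x, mu) for x in range(12)]   # x = 0, ..., 11
probs.append(1 - sum(probs))                      # x = 12 or more
expected = [n * p for p in probs]

for x, (p, e) in enumerate(zip(probs, expected)):
    label = f"{x}+" if x == 12 else str(x)
    print(f"{label:>3}  {p:.4f}  {e:6.2f}")
```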

Slide 30

Goodness of Fit Test: Poisson Distribution

Observed and Expected Frequencies

     i          f_i      e_i    f_i - e_i
0 or 1 or 2      5      6.20     -1.20
     3          10      8.92      1.08
     4          14     13.39      0.61
     5          20     16.06      3.94
     6          12     16.06     -4.06
     7          12     13.77     -1.77
     8           9     10.33     -1.33
     9           8      6.88      1.12
10 or more      10      8.39      1.61


Slide 31

Goodness of Fit Test: Poisson Distribution

Test Statistic

$$\chi^2 = \frac{(-1.20)^2}{6.20} + \frac{(1.08)^2}{8.92} + \cdots + \frac{(1.61)^2}{8.39} = 3.268$$

Rejection Rule

With α = .05 and k - p - 1 = 9 - 1 - 1 = 7 d.f. (where k = number of categories and p = number of population parameters estimated), $\chi^2_{.05} = 14.067$

Reject H0 if p-value < .05 or χ² > 14.067.
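The grouping of low and high arrival counts (so that every expected frequency is at least 5) and the resulting statistic can be sketched as below. The code uses unrounded Poisson probabilities, which is an assumption about how the slide's 3.268 was obtained; summing the rounded table entries by hand gives a slightly different third decimal.

```python
import math

def poisson_pmf(x, mu):
    return mu ** x * math.exp(-mu) / math.factorial(x)

mu, n = 6, 100
frequency = [0, 1, 4, 10, 14, 20, 12, 12, 9, 8, 6, 3, 1]   # arrivals 0..12

# Categories: "0 or 1 or 2", 3, 4, ..., 9, "10 or more"
observed = [sum(frequency[:3])] + frequency[3:10] + [sum(frequency[10:])]
probs = [sum(poisson_pmf(x, mu) for x in range(3))]
probs += [poisson_pmf(x, mu) for x in range(3, 10)]
probs.append(1 - sum(probs))                # everything from 10 upward
expected = [n * p for p in probs]

chi2 = sum((f - e) ** 2 / e for f, e in zip(observed, expected))
df = len(observed) - 1 - 1                  # k - p - 1 with p = 1 estimated parameter (mu)
print(round(chi2, 3), df)
```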

Slide 32

Goodness of Fit Test: Poisson Distribution

Conclusion Using the p-Value Approach

Area in Upper Tail     .90     .10     .05     .025    .01
χ² Value (df = 7)     2.833  12.017  14.067  16.013  18.475

Because χ² = 3.268 is between 2.833 and 12.017 in the Chi-Square Distribution Table, the area in the upper tail of the distribution is between .90 and .10.

The p-value > α. We cannot reject the null hypothesis. There is no reason to doubt the assumption of a Poisson distribution.

Slide 33

Goodness of Fit Test: Normal Distribution

1. Set up the null and alternative hypotheses.

2. Select a random sample and
   a. Compute the mean and standard deviation.
   b. Define intervals of values so that the expected frequency is at least 5 for each interval.
   c. For each interval record the observed frequencies.

3. Compute the expected frequency, e_i, for each interval.


Slide 34

Goodness of Fit Test: Normal Distribution

4. Compute the value of the test statistic.

$$\chi^2 = \sum_{i=1}^{k} \frac{(f_i - e_i)^2}{e_i}$$

5. Reject H0 if $\chi^2 \geq \chi^2_\alpha$ (where α is the significance level and there are k - 3 degrees of freedom).

Slide 35

Normal Distribution Goodness of Fit Test

Example: IQ Computers

IQ Computers (one better than HP?) manufactures and sells a general purpose microcomputer. As part of a study to evaluate sales personnel, management wants to determine, at a .05 significance level, if the annual sales volume (number of units sold by a salesperson) follows a normal probability distribution.

Slide 36

Normal Distribution Goodness of Fit Test

Example: IQ Computers

A simple random sample of 30 of the salespeople was taken and their numbers of units sold are below.

33  43  44  45  52  52  56  58   63   64
64  65  66  68  70  72  73  73   74   75
83  84  85  86  91  92  94  98  102  105

(mean = 71, standard deviation = 18.54)


Slide 37

Normal Distribution Goodness of Fit Test

Hypotheses

H0: The population of number of units sold has a normal distribution with mean 71 and standard deviation 18.54.

Ha: The population of number of units sold does not have a normal distribution with mean 71 and standard deviation 18.54.

Slide 38

Normal Distribution Goodness of Fit Test

Interval Definition

To satisfy the requirement of an expected frequency of at least 5 in each interval, we will divide the normal distribution into 30/5 = 6 equal probability intervals.

Slide 39

Normal Distribution Goodness of Fit Test

Interval Definition

[Figure: normal curve centered at 71, divided into six intervals, each with area 1.00/6 = .1667. Interval boundaries: 53.02, 63.03 = 71 − .43(18.54), 71, 78.97, and 88.98 = 71 + .97(18.54).]
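The interval boundaries are the points that cut the hypothesized normal distribution into six equal-probability pieces. The sketch below computes them with Python's `statistics.NormalDist`. Because the slide rounds the z values to .43 and .97, the computed boundaries differ from the slide's in the second decimal place.

```python
from statistics import NormalDist

mean, sd, k = 71, 18.54, 6            # hypothesized distribution and number of intervals
dist = NormalDist(mean, sd)

# Cut points at cumulative probabilities 1/6, 2/6, ..., 5/6
boundaries = [dist.inv_cdf(i / k) for i in range(1, k)]
print([round(b, 2) for b in boundaries])
```

Either set of boundaries places the 30 sample values in the same intervals, so the rounding does not affect the test.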


Slide 40

Normal Distribution Goodness of Fit Test

Observed and Expected Frequencies

       i            f_i    e_i   f_i - e_i
Less than 53.02      6      5       1
53.02 to 63.03       3      5      -2
63.03 to 71.00       6      5       1
71.00 to 78.97       5      5       0
78.97 to 88.98       4      5      -1
More than 88.98      6      5       1
Total               30     30

Slide 41

Normal Distribution Goodness of Fit Test

Test Statistic

$$\chi^2 = \frac{(1)^2}{5} + \frac{(-2)^2}{5} + \frac{(1)^2}{5} + \frac{(0)^2}{5} + \frac{(-1)^2}{5} + \frac{(1)^2}{5} = 1.600$$

Rejection Rule

With α = .05 and k - p - 1 = 6 - 2 - 1 = 3 d.f. (where k = number of categories and p = number of population parameters estimated), $\chi^2_{.05} = 7.815$

Reject H0 if p-value < .05 or χ² > 7.815.
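The observed counts and the test statistic can be rebuilt from the raw sample and the slide's interval boundaries. This is a sketch; using `bisect_right` to assign each value to an interval is an implementation choice, not something the slides prescribe.

```python
from bisect import bisect_right

sales = [33, 43, 44, 45, 52, 52, 56, 58, 63, 64,
         64, 65, 66, 68, 70, 72, 73, 73, 74, 75,
         83, 84, 85, 86, 91, 92, 94, 98, 102, 105]
boundaries = [53.02, 63.03, 71.00, 78.97, 88.98]   # cut points from the slides

# Count how many observations fall in each of the six intervals
observed = [0] * (len(boundaries) + 1)
for units in sales:
    observed[bisect_right(boundaries, units)] += 1

# Equal-probability intervals give the same expected frequency everywhere
expected = [len(sales) / len(observed)] * len(observed)

chi2 = sum((f - e) ** 2 / e for f, e in zip(observed, expected))
df = len(observed) - 2 - 1     # k - p - 1 with p = 2 estimated parameters (mean, sd)
print(observed, round(chi2, 3), df)
```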

Slide 42

Normal Distribution Goodness of Fit Test

Conclusion Using the p-Value Approach

Area in Upper Tail     .90     .10     .05     .025    .01
χ² Value (df = 3)     .584   6.251   7.815   9.348  11.345

Because χ² = 1.600 is between .584 and 6.251 in the Chi-Square Distribution Table, the area in the upper tail of the distribution is between .90 and .10.

The p-value > α. We cannot reject the null hypothesis. There is little evidence to support rejecting the assumption that the population is normally distributed with µ = 71 and σ = 18.54.


Slide 43

End of Chapter 12