chi square goodness of fit

Post on 20-Jun-2015

Category:

Education

What is a Chi-Square Test of Goodness of Fit?

Questions of goodness of fit have become increasingly important in modern statistics.

Questions of goodness of fit juxtapose complex observed patterns against hypothesized or previously observed patterns to test overall and specific differences among them.

A goodness-of-fit question compares an observed pattern with a hypothesized pattern and looks at the difference between them.

If the difference is small then the FIT IS GOOD. For example:

Observed        Hypothesized    Difference
51% Females     50% Females     1%

If the difference is BIG then the FIT IS NOT GOOD. For example:

Observed        Hypothesized    Difference
50% Females     22% Females     28%
Here is an example: We want to know whether a sample we have selected matches the national percentage of a certain ethnic group.

Observed: 2% of the sample is made up of members of this ethnic group.

Hypothesized: 10% of the population is made up of this ethnic group.

Difference: 8%

You will use certain statistical methods to determine whether the difference between the observed and the hypothesized values is significant or not.

Here is an example:

Problem: The chair of a statistics department suspects that some of her faculty are more popular with students than others.

There are three sections of introductory stats that are taught at the same time in the morning by Professors Cauforek, Kerr, and Rector. 66 students are planning on enrolling in one of the three classes.

What would you expect the number of enrollees to be in each class if popularity were not an issue?

Professor Cauforek    Professor Kerr    Professor Rector
22                    22                22

These are our expected values: 66 students split evenly across 3 classes gives 22 per class.
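The even split above can be checked with a few lines of Python. This is only a sketch: the function name `expected_counts` and the variable names are illustrative and do not come from the slides.

```python
def expected_counts(total, proportions):
    """Expected count per category = total * hypothesized proportion."""
    return [total * p for p in proportions]

professors = ["Cauforek", "Kerr", "Rector"]

# "Popularity is not an issue" is read here as equal shares for each class.
equal_share = [1 / len(professors)] * len(professors)

expected = expected_counts(66, equal_share)
for name, count in zip(professors, expected):
    print(f"Professor {name}: {count:.0f} expected enrollees")
```

The same helper would also cover unequal hypothesized proportions, such as the 10% national percentage in the earlier example.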

Now let’s see what was observed. The number who enrolled in each class was:

Professor Cauforek    Professor Kerr    Professor Rector
31                    25                10

We will test the degree to which the observed data fit the expected enrollments.

            Professor Cauforek    Professor Kerr    Professor Rector
Observed    31                    25                10
Expected    22                    22                22

Here is the formula:

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

Where:

$\chi^2$ = Chi Square

$\Sigma$ = sum of

$O$ = observed score (here 31, 25, and 10 for Professors Cauforek, Kerr, and Rector)

$E$ = expected score (here 22, 22, and 22 for Professors Cauforek, Kerr, and Rector)
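As a sketch of how the formula translates into code, the function below sums (O − E)² / E over the categories. The name `chi_square_statistic` and the input checks are assumptions added for illustration, not part of the slides.

```python
def chi_square_statistic(observed, expected):
    """Return the sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if any(e <= 0 for e in expected):
        raise ValueError("expected counts must be positive")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# When every observed count equals its expected count, each term is zero,
# so a perfect fit gives a statistic of 0.
perfect_fit = chi_square_statistic([22, 22, 22], [22, 22, 22])
print(perfect_fit)  # 0.0
```

Because every term is a squared difference divided by a positive expected count, the statistic can never be negative.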

Here is the null-hypothesis:

There is no significant difference between the expected and the observed number of students enrolled in three stats professors’ classes.

Now we will compute the $\chi^2$ value and compare it with the critical $\chi^2$ value.

• If the $\chi^2$ value exceeds the critical value, then we will reject the null-hypothesis.

• If the $\chi^2$ value DOES NOT exceed the critical value, then we will fail to reject the null-hypothesis.

Let’s compute the $\chi^2$ value.

            Professor Cauforek    Professor Kerr    Professor Rector
Expected    22                    22                22
Observed    31                    25                10

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

OR, written out with one term per professor:

$$\chi^2 = \frac{(O - E)^2}{E} + \frac{(O - E)^2}{E} + \frac{(O - E)^2}{E}$$

Let’s input each professor’s data into the equation.

$$\chi^2 = \frac{(31 - 22)^2}{22} + \frac{(25 - 22)^2}{22} + \frac{(10 - 22)^2}{22}$$

Now for the calculation:

$$\chi^2 = \frac{(9)^2}{22} + \frac{(3)^2}{22} + \frac{(-12)^2}{22}$$

$$\chi^2 = \frac{81}{22} + \frac{9}{22} + \frac{144}{22}$$

Convert the fractions into decimals:

$$\chi^2 = 3.7 + 0.4 + 6.5$$

Sum the terms:

$$\chi^2 = 10.6$$
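The hand calculation rounds each term to one decimal place before summing. A short Python sketch (variable names are illustrative) shows the unrounded terms next to the rounded ones, to confirm that rounding does not change the result at one decimal place.

```python
observed = {"Cauforek": 31, "Kerr": 25, "Rector": 10}
expected_per_class = 22

# One (O - E)^2 / E term per professor.
terms = {
    name: (count - expected_per_class) ** 2 / expected_per_class
    for name, count in observed.items()
}
for name, term in terms.items():
    print(f"Professor {name}: {term:.3f}")

# Summing the unrounded terms.
chi_square = sum(terms.values())
print(f"Unrounded total: {chi_square:.3f}")

# Summing terms rounded to one decimal, as in the hand calculation.
rounded_total = sum(round(term, 1) for term in terms.values())
print(f"Total of rounded terms: {rounded_total:.1f}")
```

The unrounded total is 234/22, roughly 10.64, which agrees with the 10.6 obtained by hand.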

As a contrasting example, note what the $\chi^2$ value would be if the observed and expected values were more similar:

            Professor Cauforek    Professor Kerr    Professor Rector
Expected    22                    22                22
Observed    24                    22                20

$$\chi^2 = \frac{(O - E)^2}{E} + \frac{(O - E)^2}{E} + \frac{(O - E)^2}{E}$$

$$\chi^2 = \frac{(24 - 22)^2}{22} + \frac{(22 - 22)^2}{22} + \frac{(20 - 22)^2}{22}$$

$$\chi^2 = \frac{(2)^2}{22} + \frac{(0)^2}{22} + \frac{(-2)^2}{22}$$

$$\chi^2 = \frac{4}{22} + \frac{0}{22} + \frac{4}{22}$$

$$\chi^2 = 0.2 + 0.0 + 0.2$$

$$\chi^2 = 0.4$$

So the moral of the story is that the closer the expected and observed values are to one another, the smaller the Chi-square value and the greater the goodness of fit (as seen below).

            Professor Cauforek    Professor Kerr    Professor Rector
Expected    22                    22                22
Observed    24                    22                20

$$\chi^2 = 0.4$$

On the other hand, the farther the expected and observed values are from one another, the larger the Chi-square value and the poorer the goodness of fit (as seen below).

            Professor Cauforek    Professor Kerr    Professor Rector
Expected    22                    22                22
Observed    31                    25                10

$$\chi^2 = 10.6$$
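To see this relationship more directly, the sketch below moves students from Professor Rector's class to Professor Cauforek's class a few at a time and recomputes the statistic. The scenario and the names `chi_square_statistic` and `moved` are hypothetical. Only the case `moved = 2` (24, 22, 20) appears in the slides.

```python
def chi_square_statistic(observed, expected):
    """Return the sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

expected = [22, 22, 22]
results = []

# Hypothetical scenario: shift `moved` students from Rector to Cauforek.
for moved in range(0, 13, 2):
    observed = [22 + moved, 22, 22 - moved]
    value = chi_square_statistic(observed, expected)
    results.append((moved, value))
    print(f"moved={moved:2d}  observed={observed}  chi-square={value:.2f}")
```

Each step away from the expected counts increases the statistic, and the increase accelerates because the differences are squared.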

Now we determine whether a $\chi^2$ of 10.6 exceeds the critical $\chi^2$.

To find the critical $\chi^2$ we first must determine the degrees of freedom as well as set the probability level.

The probability or alpha level is the probability of a Type I error we are willing to live with (i.e., the probability of rejecting the null hypothesis when it is actually true). Generally this value is 0.05, which is like saying we are willing to wrongly reject a true null hypothesis 5 out of 100 times.

Degrees of freedom are calculated by taking the number of groups and subtracting 1. (Three groups minus 1 = 2)

We now have all of the information we need to determine the critical $\chi^2$. We go to the Chi-Square Distribution Table and locate the degrees of freedom (df = 2). And then we locate the probability or alpha level (0.050):

df    0.100    0.050    0.025
1     2.71     3.84     5.02
2     4.61     5.99     7.38
3     6.25     7.82     9.35
4     7.78     9.49     11.14
5     9.24     11.07    12.83
6     10.64    12.59    14.45
7     12.02    14.07    16.01
8     13.36    15.51    17.54
9     14.68    16.92    19.02
…     …        …        …

Where these two values intersect in the table we find the critical $\chi^2$: 5.99.
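For the special case of 2 degrees of freedom, the table row can be reproduced without a statistics library. With df = 2, the probability of a chi-square value of x or larger is exp(−x/2), so the critical value is −2 ln(alpha). The sketch below relies on that shortcut. The function name `critical_value_df2` is illustrative, and the formula does not apply to other degrees of freedom.

```python
import math

def critical_value_df2(alpha):
    """Critical chi-square value for 2 degrees of freedom ONLY.

    With df = 2, P(chi-square >= x) = exp(-x / 2), so solving
    exp(-x / 2) = alpha gives x = -2 * ln(alpha).
    """
    if not 0 < alpha < 1:
        raise ValueError("alpha must be strictly between 0 and 1")
    return -2 * math.log(alpha)

# Reproduce the df = 2 row of the table.
for alpha in (0.100, 0.050, 0.025):
    print(f"alpha={alpha:.3f}  critical={critical_value_df2(alpha):.2f}")
```

For other degrees of freedom, a printed table or a statistics package would be needed.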

Since the chi-square goodness of fit value (10.6) exceeds the critical $\chi^2$ (5.99), we will reject the null hypothesis:

There is no significant difference between the expected and the observed number of students enrolled in three stats professors’ classes.

There actually is a significant difference.
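The decision can be sketched in Python as a comparison of the computed statistic with the critical value. The p-value line is an addition that is not in the slides: it uses the fact that, for df = 2 only, the probability of a chi-square value of x or larger is exp(−x/2). All variable names are illustrative.

```python
import math

# Statistic for the observed enrollments (31, 25, 10) against 22 expected each.
chi_square = (31 - 22) ** 2 / 22 + (25 - 22) ** 2 / 22 + (10 - 22) ** 2 / 22

critical = 5.99  # from the table: df = 2, alpha = 0.050
alpha = 0.05

# Decision rule: reject the null hypothesis if the statistic exceeds the critical value.
reject_null = chi_square > critical

# Assumption: valid for df = 2 only, where P(chi-square >= x) = exp(-x / 2).
p_value = math.exp(-chi_square / 2)

print(f"chi-square = {chi_square:.2f}, critical = {critical}")
print(f"p-value (df = 2) = {p_value:.4f}")
print("Reject the null hypothesis" if reject_null else "Fail to reject the null hypothesis")
```

Comparing the p-value with alpha leads to the same decision as comparing the statistic with the critical value.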

In summary, questions of goodness of fit juxtapose observed patterns against hypothesized patterns to test overall and specific differences among them.

Observed    Hypothesized    Difference
