Testing the Differences between Means
Statistics for Political Science, Levin and Fox, Chapter Seven
What is hypothesis testing?
Hypothesis testing is the process of evaluating sample data collected about a particular population to see how likely the sample results are, given our hypothesis about the population.
If the sample results are plausible under the hypothesis about the population, we retain the hypothesis and attribute any departure from our expected results to pure chance based on sampling error.
If the sample results are unlikely (fewer than 5 chances in 100), we reject the hypothesis.
The Null Hypothesis
Null Hypothesis: The hypothesis that two samples have been drawn from equivalent populations. Any observed difference between samples is a chance occurrence resulting from sampling error alone. The difference in sample means does not imply a difference in population means.
To conclude that sampling error is responsible for obtaining a difference between sample means is to retain the null hypothesis:
µ1 = µ2
Where µ1 = mean of the first population
µ2 = mean of the second population
To Retain: Retaining the null hypothesis does not imply that we have proven the population means are equal, but rather that we lack sufficient evidence to say otherwise (that is, to say that there is a difference between the populations).
The Research Hypothesis for Means Difference
Research Hypothesis:
If we reject the null hypothesis, then we automatically accept the research hypothesis that a true population difference does exist. The difference between sample means is too large to be accounted for by sampling error.
The research hypothesis for mean differences is symbolized by (the population means are not equal):
µ1 ≠ µ2
Null and Research Hypothesis
Hypothesis: Example
Men are more permissive than women with regard to disciplining children.
Null Hypothesis: The null hypothesis holds that there is no difference between men and women (as populations) when it comes to disciplining children. Any observed difference is the result of sampling error (rather than an actual difference).
Research Hypothesis: The research hypothesis holds that there IS a difference between men and women (as populations) when it comes to disciplining children.
Sampling Distribution of Differences between Means
Sampling Distribution of Differences between Means: Recall from our long-distance phone calling example that if a researcher were to take multiple samples, he or she could get a sampling distribution of means (rather than raw scores).
Paired Samples: What if the researcher, while gathering samples, studies or compares two samples at a time?
Example: Child Rearing: Comparing Males and Females
To test the difference, a researcher constructs a scale of permissiveness from 1 (strict: not very permissive) to 100 (very permissive), then studies two random samples of 30 men and 30 women.
Results:
Women: (sample mean) = 58.0 (more permissive)
Men: (sample mean) = 54.0 (less permissive)
Difference Between Means: (58.0 – 54.0) = +4.0
Is the difference the result of chance alone/sampling error (Null hypothesis)? Or is there a difference between men and women (as populations) (Research Hypothesis)?
Example: Child Rearing: Comparing Males and Females
What if the researcher continued to take samples, and took 70 additional pairs of samples, each containing 30 women and 30 men? Each pair would give us, as the first pair did, a difference between the means.
Sampling Distribution of Differences between Means: Just as we did first with raw scores and then with sample means, once we have a distribution of mean differences we can construct a sampling distribution of differences between means.
[Figure: two populations (µ = ?) from which 70 pairs of samples are drawn, each pair containing 30 women and 30 men. The first three pairs are shown.]
Pair    Women (sample mean)    Men (sample mean)    Difference
1       57                     54                   +3
2       55                     56                   -1
3       59                     57                   +2
… 70
Note: You always subtract the second sample mean (men) from the first sample mean (women).
The Purpose and Function of a Sampling Distribution of Differences between Means
Child Rearing: Males and Females
Here is what it looks like as a frequency distribution.
Mean Difference    f
+3                 1
+2                 5
+1                 7
0                  13
-1                 8
-2                 4
-3                 1
N = 35
Testing Hypotheses with the Distribution of Differences between Means
Sampling Distribution of Differences between Means:
1) It assumes that all sample pairs differ only by virtue of sampling error and not as a function of true population differences.
2) The mean of the differences between means equals zero (the positive and negative differences tend to cancel each other out).
3) It approximates the normal curve (most of the mean differences fall near zero, which is expected since any difference between means is a product of sampling error).
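The three properties can be checked with a small simulation. The sketch below is illustrative only: the population mean of 55 and standard deviation of 10 are assumed values (the slides leave the population parameters unknown), and both samples in each pair are deliberately drawn from the same population so that every difference is pure sampling error.

```python
import random
import statistics


def simulate_mean_differences(n_pairs=70, n_per_sample=30,
                              pop_mean=55.0, pop_sd=10.0, seed=1):
    """Return the difference between sample means for each pair of samples.

    Both samples in a pair come from ONE population (assumed mean and SD),
    so any difference between their means is sampling error alone.
    """
    rng = random.Random(seed)
    differences = []
    for _ in range(n_pairs):
        women = [rng.gauss(pop_mean, pop_sd) for _ in range(n_per_sample)]
        men = [rng.gauss(pop_mean, pop_sd) for _ in range(n_per_sample)]
        # First sample mean minus second sample mean, as in the slides.
        differences.append(statistics.mean(women) - statistics.mean(men))
    return differences


diffs = simulate_mean_differences()
print("pairs:", len(diffs))
print("mean of the differences:", round(statistics.mean(diffs), 2))
near_zero = sum(abs(d) <= 3 for d in diffs) / len(diffs)
print("share of differences within 3 points of zero:", round(near_zero, 2))
```

With these assumptions the mean of the differences should land close to zero and most differences should fall near zero, which is the pattern the list describes.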
Probability and Sampling Distribution of Differences between Means: Since the sampling distribution of differences between means approximates the normal curve, we can use the properties of the normal curve to make statements of probability about mean differences, specifically whether a mean difference is likely to be a result of chance (sampling error) or of true population differences.
[Figure: normal curve of mean differences centered on zero. Closer to zero, more likely to be sampling error (null hypothesis); further from zero, less likely to be sampling error (research hypothesis).]
Probability and Sampling Distribution of Differences between Means:
If the obtained difference between means lies so far from a difference of zero that it has only a small probability of occurrence in the sampling distribution of differences between means, we reject the null hypothesis.
If our sample mean difference falls so close to zero that its probability of occurrence is large, we must retain the null hypothesis and treat the obtained difference as a sampling error.
Example: Child Rearing: Comparing Males and Females
What if the researcher examines one pair (as opposed to 70 pairs) containing 30 men and 30 women? (Subtract the second mean from the first.)
Results:
Women: (sample mean) = 45.0
Men: (sample mean) = 40.0
Difference Between Means: (45.0 – 40.0) = +5.0
How far does + 5.0 fall from the mean of zero?
So, to determine how far our obtained difference between means lies from the mean difference of zero, we must translate our obtained difference into units of standard deviation.
Step 1: Recall this formula for standardizing a raw score in units of standard deviation:
$$Z = \frac{X - \mu}{\sigma}$$
Where $X$ = raw score
$\mu$ = mean of the distribution of raw scores
$\sigma$ = standard deviation of the distribution of raw scores
Child Rearing: Comparing Males and Females
Step 2a: Use this formula as a step in translating the sample means in a distribution of sample means into units of standard deviation.
$$Z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}}$$
Where $\bar{X}$ = sample mean
$\mu$ = population mean (mean of means)
$\sigma_{\bar{X}}$ = standard error of the mean (standard deviation of the distribution of means)
Step 2b: Translate our sample mean difference into units of standard deviation.
$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - 0}{\sigma_{\bar{X}_1 - \bar{X}_2}}$$
Where $\bar{X}_1$ = mean of the first sample
$\bar{X}_2$ = mean of the second sample
0 = zero, the value of the mean of the sampling distribution of differences between means (we assume that $\mu_1 - \mu_2 = 0$)
$\sigma_{\bar{X}_1 - \bar{X}_2}$ = standard error of the difference between means (standard deviation of the distribution of differences between means)
We can reduce this equation to the following:
$$Z = \frac{\bar{X}_1 - \bar{X}_2}{\sigma_{\bar{X}_1 - \bar{X}_2}}$$
Result (assuming $\sigma_{\bar{X}_1 - \bar{X}_2}$ equals 2):
$$Z = \frac{45 - 40}{2} = +2.5$$
Thus, a difference of 5 between the means of the two samples (women and men) falls 2.5 standard deviations from a mean of zero.
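The same arithmetic can be written as a short function. This is a minimal sketch: the function name and argument names are invented for illustration, and the standard error of the difference is simply passed in (the slides assume a value of 2 rather than computing it).

```python
def z_for_mean_difference(mean_1, mean_2, se_difference):
    """Convert a difference between two sample means into a z score.

    The mean of the sampling distribution of differences is taken to be 0,
    which is what the null hypothesis assumes.
    """
    if se_difference <= 0:
        raise ValueError("the standard error of the difference must be positive")
    return ((mean_1 - mean_2) - 0.0) / se_difference


# Women's sample mean 45, men's sample mean 40, assumed standard error 2.
z = z_for_mean_difference(45.0, 40.0, 2.0)
print(z)  # 2.5
```

Swapping the order of the two means only flips the sign of z, which is why the slides fix the order (second mean subtracted from the first).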
[Figure: normal curve of mean differences centered on 0, with the obtained difference of 5 marked at z = 2.50. The area between the mean and z = 2.50 is 49.38% (P = .4938) on each side; the area beyond z = 2.50 is .62% (P = .006) in one tail and 1.24% (P = .012) in both tails combined.]
What is the probability that a difference of 5 between sample means could be caused by sampling error?
The probability of getting a difference of 5 or more in either direction (above or below the mean) because of sampling error is roughly P = .01 (1 in 100). For a difference of 5 or more in the positive direction only, P = .006 (.6 in 100).
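The tail areas read from the normal curve table can also be computed directly. The sketch below uses only the Python standard library; the function name is made up for illustration.

```python
import math


def tail_probabilities(z):
    """Return (one-tailed, two-tailed) normal-curve probabilities for a z score."""
    # Area in one tail beyond |z| under the standard normal curve.
    one_tailed = 0.5 * math.erfc(abs(z) / math.sqrt(2))
    # Area in both tails combined (differences in either direction).
    return one_tailed, 2 * one_tailed


one_tail, two_tail = tail_probabilities(2.5)
print("one tail:", round(one_tail, 4))
print("both tails:", round(two_tail, 4))
```

For z = 2.5 the results should agree with the .62% and 1.24% areas given for the normal curve.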
The α (alpha) value is the level of probability at which the null hypothesis can be rejected with confidence and the research hypothesis accepted with confidence.
We decide to reject the null hypothesis if the probability is very small. This is symbolized as
P ≤ .05 (P is less than or equal to .05)
Levels of Significance
Is a mean difference of 5, which has a P = .01 chance of resulting from sampling error, statistically significant? That is, does it result from a population difference?
Levels of Significance: We need to establish a level of significance to determine whether or not our obtained sample difference is statistically significant.
Things to Know about Levels of Significance:
A small probability is symbolized by
– P ≤ .05
Alpha is generally set at
– α = .05 level of significance (corresponding to a 95% confidence level)
This means that we are willing to reject the null hypothesis if an obtained sample difference occurs by chance less than 5 times out of 100.
Thus, a mean difference of 5 between men and women with regard to their approach to child-rearing is statistically significant; we attribute it to a difference between the populations rather than to sampling error.
The z scores that mark off a level of significance are called critical values.
With α = .05, the z score ±1.96 is a critical value. If we obtain a z score beyond ±1.96 (z > 1.96 or z < -1.96), it is statistically significant.
Critical or rejection regions are those areas beyond the z score to the tail of the normal curve and scores within these areas lead us to reject the null hypothesis.
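As a sketch of where a critical value comes from, the two-tailed critical z for a chosen α can be looked up with the standard library's `statistics.NormalDist` (Python 3.8 or later). The function name `critical_z` is hypothetical.

```python
from statistics import NormalDist


def critical_z(alpha):
    """Two-tailed critical z score: alpha / 2 of the curve lies in each tail."""
    return NormalDist().inv_cdf(1 - alpha / 2)


for alpha in (0.05, 0.01):
    print("alpha =", alpha, "critical z = ±" + str(round(critical_z(alpha), 2)))
```

An obtained z score falls in the rejection region when its absolute value exceeds the value returned for the chosen α.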
[Figure: Critical Values: Z Score. Normal curve centered on 0 with critical values at z = -1.96 and z = +1.96. The middle 95% of the curve (47.5% on each side of the mean) is statistically insignificant: retain the null hypothesis. The 2.50% in each tail beyond ±1.96 is statistically significant: reject the null hypothesis.]
Significance Levels
Significance levels can be set up for any degree of probability.
NOTE: Levels of significance do not give us an absolute statement as to the correctness of the null hypothesis. Whether we retain or reject the null hypothesis, there is always some chance that the decision is wrong.
Type I Errors
Type I Error: Rejecting the null hypothesis when it should have been retained
For example, if we reject the null hypothesis at the .05 level of significance and conclude that there are gender differences in child-rearing attitudes, then there are 5 chances out of 100 that we are wrong. That is, P = .05 that we committed a Type I error and that gender actually has no effect.
The more stringent our level of significance (the farther out in the tail it lies), the less likely we are to make a Type I error.
The probability of a Type I error is represented by α or alpha.
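A simulation can illustrate what α means in practice. In this sketch both samples always come from the same population, so the null hypothesis is true and every rejection is a Type I error. The population mean of 50, the standard deviation of 10 (treated as known), and the function name are all assumptions made for the example.

```python
import math
import random
import statistics


def type_i_error_rate(critical_value, trials=4000, n=30,
                      pop_mean=50.0, pop_sd=10.0, seed=7):
    """Share of trials in which a TRUE null hypothesis is rejected."""
    rng = random.Random(seed)
    # Standard error of the difference, using the assumed known population SD.
    se_diff = math.sqrt(pop_sd ** 2 / n + pop_sd ** 2 / n)
    rejections = 0
    for _ in range(trials):
        first = statistics.fmean([rng.gauss(pop_mean, pop_sd) for _ in range(n)])
        second = statistics.fmean([rng.gauss(pop_mean, pop_sd) for _ in range(n)])
        z = (first - second) / se_diff
        if abs(z) > critical_value:
            rejections += 1  # rejected although the populations are identical
    return rejections / trials


rate_05 = type_i_error_rate(1.96)  # alpha = .05
rate_01 = type_i_error_rate(2.58)  # alpha = .01
print("Type I error rate at alpha = .05:", rate_05)
print("Type I error rate at alpha = .01:", rate_01)
```

Under these assumptions the rejection rate should sit near the chosen α, and the more stringent critical value should produce fewer mistaken rejections.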
Type II Errors
Type II Error: Accepting the null hypothesis when it should have been rejected
The farther out in the tail of the curve that our critical value falls, the greater the risk of a Type II error.
The research hypothesis may still be correct, despite the decision to reject it and retain the null hypothesis.
One method for reducing the risk of a Type II error is to increase the size of the sample so that the true population difference is more likely to be represented.
The probability of a Type II error is β or beta.
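Both claims above (a critical value farther out in the tail raises β, and a larger sample lowers it) can be illustrated with a short calculation. This is a sketch under stated assumptions: a true population difference of 5, a known and equal population standard deviation of 10, and equal sample sizes. None of these values come from the slides, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def type_ii_error_probability(true_difference, pop_sd, n_per_sample, alpha):
    """Probability (beta) of retaining the null when a true difference exists."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Standard error of the difference between two sample means.
    se_diff = math.sqrt(2 * pop_sd ** 2 / n_per_sample)
    # How many standard errors the true difference lies from zero.
    shift = true_difference / se_diff
    # The null is retained whenever the obtained z lands between the critical values.
    return std_normal.cdf(z_crit - shift) - std_normal.cdf(-z_crit - shift)


beta_n30_a05 = type_ii_error_probability(5.0, 10.0, 30, 0.05)
beta_n30_a01 = type_ii_error_probability(5.0, 10.0, 30, 0.01)
beta_n100_a05 = type_ii_error_probability(5.0, 10.0, 100, 0.05)
print("n = 30,  alpha = .05:", round(beta_n30_a05, 3))
print("n = 30,  alpha = .01:", round(beta_n30_a01, 3))
print("n = 100, alpha = .05:", round(beta_n100_a05, 3))
```

Comparing the first two lines shows the effect of a more stringent α; comparing the first and third shows the effect of a larger sample.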
[Figure: Error Types: Type I (reject the null when we should have retained it). Normal curve with critical values at z = ±1.96 and 2.50% in each tail. Example: 95% confidence level, α = .05, z = 1.96. The larger the significance level, and thus the percentage in the tails, the more likely we are to mistakenly reject the null.]
[Figure: Error Types: Type I (reject the null when we should have retained it). Normal curve with critical values at z = ±2.58 and .5% in each tail. Example: 99% confidence level, α = .01, z = 2.58. The smaller the significance level, and thus the percentage in the tails, the less likely we are to mistakenly reject the null.]
[Figure: Error Types: Type II (accept the null when we should have rejected it). Normal curve with critical values at z = ±2.58 and .5% in each tail. Example: 99% confidence level, α = .01, z = 2.58. The smaller the significance level, and thus the percentage in the tails, the more likely we are to mistakenly accept the null.]
Some notes on Type I and Type II Errors
The probabilities of Type I and Type II errors are inversely related. The larger the level of significance, the larger the chance of a Type I error and the smaller the chance of a Type II error.
We predetermine our level of significance for a hypothesis test depending on which error is more damaging or costly.
If it would be far worse to reject a true null hypothesis (that is, to claim statistical significance, or different populations, where there is none: a Type I error) than to retain a false null hypothesis (to suggest there is no population difference where there is one: a Type II error), we should use a smaller level of significance: α = .01
The Difference between P and α
P is the exact probability of obtaining the sample data (a difference at least as large as the one observed) if the null hypothesis is true.
Alpha is the threshold below which that probability is considered so small that we decide to reject the null hypothesis.
We reject the null hypothesis if the P value is less than or equal to the alpha value.
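The comparison of P with α can be sketched as a small decision function. The names `two_tailed_p` and `decide` are invented for this example, and the test is two-tailed because the research hypothesis in these slides is non-directional (µ1 ≠ µ2).

```python
import math


def two_tailed_p(z):
    """Probability of a z score at least this far from zero in either direction."""
    return math.erfc(abs(z) / math.sqrt(2))


def decide(z, alpha):
    """Compare the obtained P with the preset alpha and return the decision."""
    p = two_tailed_p(z)
    return "reject null" if p <= alpha else "retain null"


# The obtained z of 2.5 from the child-rearing example.
print(decide(2.5, 0.05))  # reject null
print(decide(2.5, 0.01))  # retain null
```

The same obtained P (about .012) leads to different decisions at the two levels, which is why α has to be chosen before looking at the data.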
[Figure: The Difference between P and α. Normal curve centered on 0 with critical values at z = ±1.96 (α = .05: 95.0% in the middle, 2.50% or P = .025 in each tail) and the obtained mean difference of 5 at z = 2.50 (49.38% or P = .4938 between the mean and z; .62% or P = .006 beyond it), which falls in the statistically significant region.]
Example: a mean difference of 5 has a P of .006 × 2 = roughly .01 (1 in 100), whereas α = .05 cuts off the null hypothesis at .025 × 2 = .05 (5 chances in 100). Any mean difference with a probability below 5 chances in 100 supports the research hypothesis.