Chapter 9: The analysis of variance for simple experiments (single factor, unrelated groups designs).
Overview of experimental research
• Groups start off the same on every measure.
• During the experiment, groups are TREATED DIFFERENTLY.
• Responses thought to be affected by the different treatments are then measured.
• If the group means become different from each other, the differences may have been caused, in part, by the different ways the groups were treated.
• Determining whether the differences between group means result simply from sampling fluctuation or are (probably) due in part to the treatment differences is the job of the statistical analysis.
Let’s take that one point at a time. At the beginning of an experiment:
• Participants are randomly selected from a population. Then they are randomly assigned to treatment groups.
• Thus, at the beginning of the study, each treatment group is a random (sub)sample from a specific population.
Groups start off much the same in every possible way
• Since each treatment group is a random sample from the population, each group’s mean and variance will be similar to that of the population.
• That is, each group’s mean will be a best estimate of mu, the population mean.
• And the spread of scores around each group’s mean will yield a best estimate of sigma2 and sigma.
So: At the beginning of an experiment the treatment groups differ only because of random sampling fluctuation. When there are different people in each group, the random sampling fluctuation is caused by (1) random individual differences and (2) random measurement problems.
Sampling fluctuation is the product of the inherent variability of the data.
That is what is indexed by sigma2, the average squared distance of scores
from the population mean, mu.
To summarize:
• Since the group means and variances of random samples will be similar to those of the population, they will be similar to each other.
• This is true for any and all things you can measure.
• The only differences among the groups at the beginning of the study on any and all measures will be the mostly minor differences associated with random sampling fluctuation caused by the fact that there are different people in each group and that there are always random measurement problems (ID + MP).
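A quick way to see this is to simulate it. The sketch below (plain Python, with hypothetical numbers) draws three "treatment groups" at random from a single population before any treatment is applied; their means differ from each other only by sampling fluctuation.

```python
import random

random.seed(1)  # reproducible example

# A hypothetical population of 10,000 scores (mu = 100, sigma = 15).
population = [random.gauss(100, 15) for _ in range(10_000)]

# Randomly assign 30 participants to each of three groups.
groups = [random.sample(population, 30) for _ in range(3)]
means = [sum(g) / len(g) for g in groups]

# Each group mean is a good estimate of mu; the small differences
# among the means are pure sampling fluctuation (ID + MP).
print([round(m, 1) for m in means])
```

Run it a few times with different seeds: the group means jitter around mu, but no group is systematically different from another, because nothing has been done to them yet.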
The ultimate question
• If we then treat the groups differently, will the treatments make the groups more different from each other at the end of the experiment than if only sampling fluctuation created their differences?
In the simplest experiments (Ch 9)
• In the simplest experiments, the groups are exposed to treatments that vary on a single dimension.
• The dimension on which treatments of the groups vary is called the independent variable.
• We call the specific ways the groups are treated the “levels of the independent variable.”
The independent variable
• An independent variable can be any preplanned difference in the way groups are treated. Which kind of difference you choose relates to the experimental hypothesis, H1.
• For example, if you think you have a new medication for bipolar disorder, you would compare the effect of various doses of the new drug to placebo in a random sample of bipolar patients. Thus, the groups would differ in terms of the dose of drug.
• Proper experimental design would ensure that the difference in dose received is the only way the groups are systematically treated differently from each other.
Why is it called the “independent variable”?
Remember, we call the different treatments the “levels” of the independent variable.
Who gets which level is random. It is determined solely by the group to which a participant is randomly assigned.
• So, any difference in the way a person is treated during the experiment is unrelated to or “independent of” the infinite number of pre-existing differences that precluded causal statements in correlational research.
The dependent variable
• Relevant responses (called dependent variables) are then measured to see whether the independent variable caused differences among the treatment conditions beyond those expected given ordinary sampling fluctuation.
• That is, we want to see whether responses are related to (dependent on) the different levels of the independent variable to which the treatment groups were exposed.
Differences after the experiment among group means on the dependent variable may well be simple sampling fluctuation!
• The groups will always differ somewhat from each other on anything you measure due to sampling fluctuation.
• With 3 groups, one will score highest, one lowest, and one in the middle just by chance. With four groups, one will score highest, one lowest, with two in the middle, one higher than the other. Etc.
• So the simple fact that the groups differ somewhat is not enough to determine that the independent variable, the different ways the groups were treated, caused the differences.
• We have to determine whether the groups are more different than they should be if only sampling fluctuation is at work.
H0 & H1: If one is wrong, the other must be right.
• Either the independent variable would cause differences in responses (the dependent variable) in the population as a whole or it would not.
• H0: The different conditions embodied by the independent variable would have NO EFFECT if administered to the whole population.
• H1: The different conditions embodied by the independent variable would produce different responses if administered to the whole population.
The population can be expected to respond to the different levels of the IV similarly to the samples
• Remember, random samples are representative of the population from which they are drawn.
• If the different levels of the independent variable cause the groups to differ (more than they would from simple sampling fluctuation), the same thing should be true for the rest of the population.
For example:
• Say a new psychotherapy causes a random sample of anxious patients to become less anxious in comparison to treatment groups given more conventional approaches or pill placebo.
• Then, we would expect all anxious patients to respond better to the new treatment than to the ones to which it was compared.
However:
• As in the case of correlation, we don’t want to toss out treatments that we know work because the new treatment happens to do better in an experiment.
• We would want to be sure that the difference after treatment is not just a chance finding based on random sampling fluctuation.
The Null Hypothesis
• The null hypothesis (H0) states that the only reason that the treatment group means are different is sampling fluctuation. It says that the independent variable causes no systematic differences among the groups.
• A corollary: Try the experiment again and a different group will score highest, another lowest. If that is so, you should not generalize from which group in your study scored highest or lowest to the population from which the samples were drawn.
• Your best prediction remains that everyone will score at the mean on the dependent variable, that treatment condition will not predict response.
People often respond to random sampling fluctuation as if something was causing a difference.
• People take all kinds of food supplements because they believe the supplements (e.g., echinacea) will make colds go away more quickly.
• If you tried it and it worked, wouldn’t you tell your friends? Wouldn’t you try it again with your next cold?
• Having recovered quickly after taking something provides the evidence. After all, it’s what happened to you!
But did the food supplement really make a difference?
• To this point, NO food supplement has been shown to shorten colds when carefully tested.
• The mistake lay in taking random variation in the duration of a cold as evidence that the echinacea (or whatever) did something beneficial.
• That’s OK if it is just your pocketbook that is affected. But what if you were an FDA scientist? Wouldn’t people expect better evidence of efficacy before they gave the food supplement company an enormous amount of their money?
We call rejecting a true null hypothesis a “Type 1 Error.”
• The first rule in science is “Do not increase error.”
• Scientists don’t like to say something will make a difference when it isn’t true.
• So, before we toss away proven treatments or say that something will cause illness or health, we want to be fairly sure that we are not just responding to sampling fluctuation.
The scientist’s answer: test the null hypothesis
• So, as we did with correlation and regression, we assume that everything is equal, all treatments have the same effect, unless we can prove otherwise.
• The null hypothesis says that the treatments do not systematically differ; one is as good as another.
• As usual, we test the null hypothesis by asking it to make a prediction and then establishing a range of results for the test statistic consistent with that prediction.
• As usual, that range is a 95% CI for the test statistic.
The test statistic: F and t tests
• In Chapter 8, you learned to use Pearson’s r as a test statistic.
• When it fell outside a 95% confidence interval consistent with the null hypothesis, we rejected the null.
• In experimental research, we generally use the F and t statistics to test the null.
• When there are only two groups, t is used as the test statistic.
• When there are three or more groups, Fisher’s ratio (called the F statistic) is used as the test statistic.
Nonsignificant results
• Each actual t or F will either fall inside or outside the CI.95 that is consistent with the null hypothesis.
• Results inside the range consistent with the null are called nonsignificant. Results outside the 95% CI are called significant. One or the other must occur in each statistical analysis.
• If you get nonsignificant results, you have failed to reject the null and you may not extrapolate from the differences among your experimental (treatment) groups to the population.
You must go back to saying that your best prediction is that everyone will be equal and the differences among the treatments don’t matter.
If t or F falls outside the CI.95, you have statistically significant findings.
• If your results are statistically significant, then the results are not consistent with the notion that the between group differences are solely the product of sampling fluctuation.
• Since that is what the null says, you must declare the null false and reject it.
• If the experiment is well run, the differences in the way you treated the groups will be the only systematic difference among the groups.
• Getting statistically significant findings is important.
• If you get them, you must say, as a scientist, that the responses of the different treatment groups should be mirrored by the population as a whole were it exposed to the same conditions.
• Scientists tend to be cautious about making such statements, bracketing them with “more research is necessary” type phrases.
• But they still have to say it.
The Experimental Hypothesis (H1)
• Unlike the null, H1 is different in each experiment.
• The experimental hypothesis tells us the way(s) we must treat the groups differently and what to measure.
• Therefore, the experimental hypothesis tells us (in broad terms) how to design the experiment.
• For example, if we hypothesize that embarrassed people remember sad things better, we need to embarrass different groups to different degrees (not at all to a lot) and measure their memories for sad and happy events.
The Experimental Hypothesis
• The experimental hypothesis (H1) states that between group differences on the dependent variable are caused by the independent variable as well as by sampling fluctuation.
• If F or t is significant and the null is shown to be false, and the only systematic difference among the groups is how they were treated (the differing levels of the IV), then H1 must be right.
• In that case, we must extrapolate our findings to the rest of the population, assuming that they would respond as did our different treatment groups.
The F test
• In order to statistically test the null hypothesis, we are going to ask it to make a prediction about the relationship between two estimates of sigma2.
• In an F test, we compare these two different ways of calculating mean squares to estimate the population variance.
• To estimate sigma2 you always divide a sum of squares by its degrees of freedom.
• Remember, random sampling fluctuation is indexed by sigma2, the population variance.
Our two estimates of sigma2
One way to estimate sigma2 is to find the difference between each score and its group mean, square and sum those differences. This yields a sum of squares within group (SSW). To estimate sigma2 you divide SSW by degrees of freedom within group (dfW=n-k). This estimate of sigma2 is called the mean square within groups, MSW. You have been calculating it since Chapter 5.
The other way to estimate sigma2 is to square and sum the differences between each participant’s group mean and the overall mean. This yields a sum of squares between group and grand means (SSB). To estimate sigma2 you divide SSB by degrees of freedom between groups (dfB=k-1). This is called the mean square between groups, MSB. It is new.
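The two estimates can be written out directly. Below is a minimal sketch in Python of both computations, using the three-group stress/drinking data analyzed later in the chapter (group scores: 8, 10, 12; 12, 12, 15; 13, 17, 18):

```python
# The stress example's three groups (ounces consumed).
groups = [
    [8, 10, 12],   # no stress       (mean 10)
    [12, 12, 15],  # moderate stress (mean 13)
    [13, 17, 18],  # high stress     (mean 16)
]

n = sum(len(g) for g in groups)  # total participants (9)
k = len(groups)                  # number of groups (3)
grand_mean = sum(x for g in groups for x in g) / n
group_means = [sum(g) / len(g) for g in groups]

# SSW: squared distances of scores from their own group mean.
ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
# SSB: squared distances of each participant's group mean from the grand mean.
ss_between = sum(len(g) * (m - grand_mean) ** 2
                 for g, m in zip(groups, group_means))

ms_within = ss_within / (n - k)    # MSW, dfW = n - k = 6
ms_between = ss_between / (k - 1)  # MSB, dfB = k - 1 = 2

print(ss_within, ss_between)  # → 28.0 54.0
print(round(ms_between, 2), round(ms_within, 2))  # → 27.0 4.67
```

The ratio of the two mean squares, 27/(28/6) ≈ 5.79, is the F ratio computed in the worked example (reported there as 5.78 because MSW is rounded to 4.67 before dividing).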
What is indexed by sigma2 and its best estimate:MSW
• Sigma2 indexes random sampling fluctuation. It comprises individual differences and random measurement problems (ID + MP).
• MSW: Since everyone in a specific group is treated the same way, differences between participants’ scores and their own group mean, the basis of MSW, can only reflect ID + MP.
• Thus, MSW is always a good estimate of sigma2, the population variance, as both index ID + MP.
What is indexed by the mean square between groups (MSB)
• Since we treat the groups differently, the distance between each group’s mean and the overall mean can reflect the effects of the independent variable (as well as the effects of random individual differences and random measurement problems).
• Thus MSB = ID + MP + (?)IV
• If the independent variable pushes the group means apart, MSB will overestimate sigma2 and be larger than MSW.
Testing the Null Hypothesis (H0)
• H0 says that the IV has no effect.
• If H0 is true, groups differ from each other and from the overall mean only because of sampling fluctuation based on random individual differences and measurement problems (ID + MP).
• These are the same things that make scores differ from their own group means.
• So, according to H0, MSB and MSW are two ways of measuring the same thing (ID + MP) and are both good estimates of sigma2.
• Two measurements of the same thing should be about equal to each other and a ratio between them should be about equal to 1.00.
In simple experiments (Ch. 9), the ratio between MSB and MSW is the Fisher or F ratio.
In simple experiments, F=MSB/MSW.
H0 says F should be about 1.00.
The Experimental Hypothesis (H1)
• The experimental hypothesis says that the groups’ means will be made different from each other (pushed apart) by the IV, the independent variable (as well as by random individual differences and measurement problems).
• If the means are pushed apart, MSB will increase, reflecting the effects of the independent variable (as well as of the random factors). MSW will not.
• So MSB will be larger than MSW
• Therefore, H1 suggests that an F ratio comparing MSB to MSW should be larger than 1.00.
As usual, we set up 95% confidence intervals around the prediction of the null.
• In Ch. 9, the ratio MSB/MSW is called the F ratio.
• If the F ratio is about 1.00, the prediction of the null is correct.
• It is rare for the F ratio to be exactly 1.00.
• At some point, the ratio gets too different from 1.00 to be consistent with the null. We are only interested in the case where the ratio is greater than 1.00, which means that the means are further apart than the null suggests.
• The F table tells us when the difference among the means is too large to be explained as sampling fluctuation alone.
Analyzing the results of an experiment
An experiment
• Population: Male, self-selected, “social drinkers”
• Number of participants (9) and groups (3)
• Design: Single factor, unrelated groups
• Independent variable: Stress
  – Level 1: No stress
  – Level 2: Moderate stress
  – Level 3: High stress
• Dependent variable: Ounces consumed
• H0: Stress does not affect alcohol consumption.
• H1: Stress will cause increased alcohol consumption.
Computing MSW and MSB
Scores (ounces consumed) and group means:

Group 1 (no stress):        8  10  12   MX = 10
Group 2 (moderate stress): 12  12  15   MX = 13
Group 3 (high stress):     13  17  18   MX = 16

Grand mean: M = 13

Within-group deviations (X - MX):
Group 1: -2, 0, 2;  Group 2: -1, -1, 2;  Group 3: -3, 1, 2
Squared: 4, 0, 4;  1, 1, 4;  9, 1, 4

SSW = 28    dfW = n - k = 6    MSW = 28/6 = 4.67

Between-group deviations (MX - M): -3 for each score in Group 1, 0 for each in Group 2, +3 for each in Group 3.
Squared and summed: SSB = 9(3) + 0(3) + 9(3) = 54    dfB = k - 1 = 2    MSB = 54/2 = 27.00
CPE 9.2.1 - ANOVA summary table

Source                          SS   df     MS     F    p
Between groups (stress level)   54    2  27.00  5.78    ?
Within groups (error)           28    6   4.67
Divide MSB by MSW to calculate F. Then we need to look at the F table to determine significance.
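If SciPy is available, the same F can be obtained in one call with `scipy.stats.f_oneway` (a cross-check sketch, not part of the chapter's hand calculation):

```python
from scipy.stats import f_oneway

# The stress example: ounces consumed per group.
no_stress = [8, 10, 12]
moderate_stress = [12, 12, 15]
high_stress = [13, 17, 18]

f_stat, p_value = f_oneway(no_stress, moderate_stress, high_stress)
print(round(f_stat, 2))  # F ≈ 5.79 (the table's 5.78 rounds MSW to 4.67 first)
print(p_value < 0.05)    # True: significant at the .05 level
```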
Ratio of mean squares = F ratio

F = MSB / MSW

MSB: mean squares between groups, possibly affected by the independent variable.
MSW: mean squares within groups, not affected by the independent variable.

If the independent variable causes differences between the group means, then MSB will be larger than MSW. If the effect is large enough and/or there are enough degrees of freedom, the result may be a statistically significant F ratio.
The F Test
• The null predicts that we will find an F ratio close to 1.00, not an unusually large F ratio.
• The F table tells us whether the F ratio is significant.
• p<.05 means that we have found an F ratio large enough to occur in 5 or fewer samples in 100 when the null is true. If we find a larger F ratio than the null predicts, we have shown H0 to predict badly and we reject it.
• Results are statistically significant when you equal or exceed the critical value of F at p <.05.
Critical values in the F table
• The critical values in the F table depend on how good MSB and MSW are as estimates of sigma2.
• The better the estimates, the closer to 1.00 the null must predict that their ratio will fall.
• What makes estimates better??? DEGREES OF FREEDOM. Each degree of freedom corrects the sample statistic back towards its population parameter.
• Thus, the more degrees of freedom for MSW and MSB, the closer the critical value of F will be to 1.00.
Using the F table
So, to use the F table, you must specify the degrees of freedom (df) for the numerator and denominator of the F ratio.
In both Ch 9 and Ch 10 the denominator is MSW. As you know, dfW = n-k.
In Ch 9, the numerator is MSB and dfB=k-1.
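Instead of a printed table, the critical values can also be computed from the F distribution itself. A sketch using `scipy.stats.f`: the .95 and .99 quantiles with dfB in the numerator and dfW in the denominator are the .05 and .01 critical values.

```python
from scipy.stats import f

df_between, df_within = 2, 6  # k - 1 = 2 and n - k = 6 for the stress example

crit_05 = f.ppf(0.95, df_between, df_within)  # alpha = .05 critical value
crit_01 = f.ppf(0.99, df_between, df_within)  # alpha = .01 critical value
print(round(crit_05, 2), round(crit_01, 2))   # → 5.14 10.92, as in the table
```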
Critical values of F (for each df in the denominator, the top row is alpha = .05 and the bottom row is alpha = .01):

df in          Degrees of freedom in numerator
denominator      1      2      3      4      5      6      7      8
  3          10.13   9.55   9.28   9.12   9.01   8.94   8.88   8.84
             34.12  30.82  29.46  28.71  28.24  27.91  27.67  27.49
  4           7.71   6.94   6.59   6.39   6.26   6.16   6.09   6.04
             21.20  18.00  16.69  15.98  15.52  15.21  14.98  14.80
  5           6.61   5.79   5.41   5.19   5.05   4.95   4.88   4.82
             16.26  13.27  12.06  11.39  10.97  10.67  10.45  10.27
  6           5.99   5.14   4.76   4.53   4.39   4.28   4.21   4.15
             13.74  10.92   9.78   9.15   8.75   8.47   8.26   8.10
  7           5.59   4.74   4.35   4.12   3.97   3.87   3.79   3.73
             12.25   9.55   8.45   7.85   7.46   7.19   7.00   6.84
  8           5.32   4.46   4.07   3.84   3.69   3.58   3.50   3.44
             11.26   8.65   7.59   7.01   6.63   6.37   6.19   6.03
  9           5.12   4.26   3.86   3.63   3.48   3.37   3.29   3.23
             10.56   8.02   6.99   6.42   6.06   5.80   5.62   5.47
 36           4.41   3.26   2.86   2.63   2.48   2.36   2.28   2.21
              7.39   5.25   4.38   3.89   3.58   3.35   3.18   3.04
 40           4.08   3.23   2.84   2.61   2.45   2.34   2.26   2.19
              7.08   4.98   4.13   3.65   3.34   3.12   2.95   2.82
 60           4.00   3.15   2.76   2.52   2.37   2.25   2.17   2.10
              7.08   4.98   4.13   3.65   3.34   3.12   2.95   2.82
100           3.94   3.09   2.70   2.46   2.30   2.19   2.10   2.03
              6.90   4.82   3.98   3.51   3.20   2.99   2.82   2.69
400           3.86   3.02   2.62   2.39   2.23   2.12   2.03   1.96
              6.70   4.66   3.83   3.36   3.06   2.85   2.69   2.55
  ∞           3.84   2.99   2.60   2.37   2.21   2.09   2.01   1.94
              6.64   4.60   3.78   3.32   3.02   2.80   2.64   2.51
The degrees of freedom in the numerator are related to the number of different treatment groups. They relate to the mean square between groups: dfB = k - 1.
The degrees of freedom in the denominator are related to the number of subjects. They relate to the mean square within groups: dfW = n - k.
The critical values in the top rows are alpha = .05.
The critical values in the bottom rows are for bragging rights (p < .01).
In an experiment with 3 treatment groups, we have 2 df between groups (k - 1). If we have 9 subjects and 3 groups, we have 6 df within groups (n - k). The critical value of F at the .05 alpha level with (2, 6) df is 5.14: since F is the ratio of MSB to MSW, the variance estimate between groups must be 5.14 times larger than the variance estimate within groups to reach significance.
If we find an F ratio of 5.14 or larger, we reject the null hypothesis and declare that there is a treatment effect, significant at the .05 alpha level.
ANOVA summary table

Source                          SS   df     MS     F    p
Between groups (stress level)   54    2  27.00  5.78    ?
Within groups (error)           28    6   4.67
From the F table, the critical values with (2, 6) df are 5.14 at the .05 alpha level and 10.92 at the .01 level. 5.78 is as large as or larger than the critical value at the .05 alpha level (5.14), so it is statistically significant. It does not equal or exceed the critical value for .01.
F (2,6)=5.78, p<.05
Remember the pattern of critical values in the F table
If the null is true, as df increase, each mean square becomes a better estimate of sigma2 and the null must predict an F ratio closer and closer to 1.00. Whether an F ratio is significant depends on dfW and dfB as well as on the size of the ratio.
Notice the usual effect of consistency
The more dfB and dfW, ...
the better our estimates of sigma2, ...
the closer F should be to 1.00 when the null is true.
H0 says that F should be about 1.00.
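This pattern is easy to verify numerically. A sketch with `scipy.stats.f`: as the df in both the numerator and denominator grow, the .05 critical value of F shrinks toward 1.00.

```python
from scipy.stats import f

# .05 critical value with equal (and growing) df in the numerator
# and denominator; with (6, 6) df it matches the table's 4.28.
for df in (6, 60, 600):
    print(df, round(f.ppf(0.95, df, df), 2))
```

Each extra degree of freedom makes the two mean squares better estimates of sigma2, so the null can afford to predict a ratio closer and closer to 1.00.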
Now you do one
An experiment
• Population: Female, moderately depressed outpatients
• Number of participants (10) and groups (2)
• Design: Single factor, unrelated groups
• Independent variable: Dose of new drug, Feelbetter
  – Level 1: Placebo (n=5)
  – Level 2: Moderate dose of the drug (n=5)
• Dependent variable: HAM-D scores
• H0: Feelbetter does no more good than placebo.
• H1: The average response in the treatment groups will differ more than they would unless Feelbetter effectively helps depression.
Computing MSW and MSB
Scores (HAM-D) and group means:

Group 1 (placebo):    18  21  22  30  24   MX = 23
Group 2 (Feelbetter): 11  15  13  17  19   MX = 15

Grand mean: M = 19.00

Within-group deviations (X - MX):
Group 1: -5, -2, -1, 7, 1;  Group 2: -4, 0, -2, 2, 4
Squared: 25, 4, 1, 49, 1;  16, 0, 4, 4, 16

SSW = 120    dfW = n - k = 8    MSW = 120/8 = 15.00

Between-group deviations (MX - M): +4 for each score in Group 1, -4 for each score in Group 2.
Squared and summed: SSB = 16(5) + 16(5) = 160    dfB = k - 1 = 1    MSB = 160/1 = 160.00
ANOVA summary table

Source                       SS   df      MS      F    p
Between groups (drug dose)  160    1  160.00  10.67    ?
Within groups (error)       120    8   15.00
Divide MSB by MSW to calculate F. Then we need to look at the F table to determine significance.
From the F table, the critical values with (1, 8) df are 5.32 at alpha = .05 and 11.26 at alpha = .01.
ANOVA summary table

Source                       SS   df      MS      F     p
Between groups (drug dose)  160    1  160.00  10.67  <.05
Within groups (error)       120    8   15.00
F(1,8)=10.67, p<.05.
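As a cross-check, the same result from `scipy.stats.f_oneway` (a sketch; data from the Feelbetter example):

```python
from scipy.stats import f_oneway

placebo = [18, 21, 22, 30, 24]     # mean 23
feelbetter = [11, 15, 13, 17, 19]  # mean 15

f_stat, p_value = f_oneway(placebo, feelbetter)
print(round(f_stat, 2))  # → 10.67
print(p_value < 0.05)    # → True: significant at the .05 level
```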
Now do it as a t test. (t for 2, F for more)
Example 2 – t test summary table
Source                       SS   df      s      F    p
Between groups (drug dose)  160    1  12.65  10.67    ?
Within groups (error)       120    8   3.87
We need to look at the t table to determine significance.
df 1 2 3 4 5 6 7 8
.05 12.706 4.303 3.182 2.776 2.571 2.447 2.365 2.306
.01 63.657 9.925 5.841 4.604 4.032 3.707 3.499 3.355
df      9     10     11     12     13     14     15     16
.05  2.262  2.228  2.201  2.179  2.160  2.145  2.131  2.120
.01  3.250  3.169  3.106  3.055  3.012  2.997  2.947  2.921

df     17     18     19     20     21     22     23     24
.05  2.110  2.101  2.093  2.086  2.080  2.074  2.069  2.064
.01  2.898  2.878  2.861  2.845  2.831  2.819  2.807  2.797

df     25     26     27     28     29     30     40     60
.05  2.060  2.056  2.052  2.048  2.045  2.042  2.021  2.000
.01  2.787  2.779  2.771  2.763  2.756  2.750  2.704  2.660

df    100    200    500   1000   2000  10000
.05  1.984  1.972  1.965  1.962  1.961  1.960
.01  2.626  2.601  2.586  2.581  2.578  2.576
Example 2 – t test summary table

Source                       SS   df      s      F     p
Between groups (drug dose)  160    1  12.65  10.67  <.05
Within groups (error)       120    8   3.87
t = the square root of F. F and t always give you the same level of significance.
t(8)=3.266, p<.05.
The t Test: t for 2, F for More
• The t test is a special case of the F ratio.
• If there are only two levels (groups) of the independent variable, then
t = sB / sW = sqrt(MSB / MSW) = sqrt(F)
t table and F table: when there are only two groups, the df in the numerator is always 1, since dfB = k - 1 = 2 - 1 = 1.
Relationship between t and F tables
• Because there is always 1 df between groups, the t table is organized only by degrees of freedom within group (dfW).
• By the way, the values in the t table are the square root of the values in the first column of the F table.
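The t–F relationship can be verified directly with SciPy (a sketch using the Feelbetter data): the two-group t statistic squared equals the one-way ANOVA F statistic, and both tests give the same p value.

```python
from scipy.stats import f_oneway, ttest_ind

placebo = [18, 21, 22, 30, 24]
feelbetter = [11, 15, 13, 17, 19]

f_stat, f_p = f_oneway(placebo, feelbetter)
t_stat, t_p = ttest_ind(placebo, feelbetter)

print(round(t_stat, 3))       # → 3.266, as reported above
print(round(t_stat ** 2, 2))  # → 10.67, i.e., t squared equals F
```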
t table from Chapter 6 (shown above). Let’s look at these values.
t table from Chapter 6:

df      3      4      5
.05  3.182  2.776  2.571
.01  5.841  4.604  4.032

First column of the F table (1 df in the numerator):

df    .05    .01
3   10.13  34.12
4    7.71  21.20
5    6.61  16.26

With 1 df in the numerator, t = sqrt(F) and F = t squared. For example, sqrt(10.13) = 3.182 and sqrt(34.12) = 5.841.
In fact …
• It all fits together!
• The F table is related to the t table; The t table approaches the z table.
• Degrees of freedom!
• Alpha levels!
• Significance!
YOU NEVER TEST THE EXPERIMENTAL HYPOTHESIS STATISTICALLY.
• You can only examine the data in light of the experimental hypothesis after rejecting the null.
• Good research design makes the experimental hypothesis the only reasonable alternative to the null.
• Accepting the experimental hypothesis is based on good research design and logic, not statistical tests.
We have to go back to H0
• Until we can prove otherwise, we must assume the new drug is no better than placebo.
• If that is the case, all participants can be expected to score about the same, right at the mean of the population.
Are the groups too different from each other for only sampling fluctuation to be at work?
• The simple fact that the groups differ somewhat is not enough to determine that the independent variable, the different ways the groups were treated, caused any part of the differences between the groups.
• We have to determine whether the groups are more different than they should be if only sampling fluctuation is at work!