STAT3010: Lecture 5
Notation and Examples (Section 9.2, Page 413)
To decide whether to reject or not reject the null hypothesis, we organize the test using the ANOVA table. Here are the formulas that make up the ANOVA table:
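Analysis of Variance Table

Source of Variation   Sums of Squares (SS)                                      Degrees of Freedom (df)   Mean Squares (MS)               F
Between               $SS_b = \sum_j n_j (\bar{X}_{.j} - \bar{X}_{..})^2$       $k-1$                     $s_b^2 = MS_b = SS_b/(k-1)$     $F = MS_b/MS_w$
Within                $SS_w = \sum_j \sum_i (X_{ij} - \bar{X}_{.j})^2$          $N-k$                     $s_w^2 = MS_w = SS_w/(N-k)$
Total                 $SS_{total} = \sum_j \sum_i (X_{ij} - \bar{X}_{..})^2$    $N-1$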
Example 9.3: Testing Difference in Mean Time to Pain Relief Among 3 Treatments
An investigator wishes to compare the average time to relief of headache pain under three distinct medications, call them Drugs A, B and C. Fifteen patients who suffer from chronic headaches are randomly selected for the investigation, and five subjects are randomly assigned to each treatment. The following data reflect times to relief (in minutes) after taking the assigned drug:

Drug A   Drug B   Drug C
  30       25       15
  35       20       20
  40       30       25
  25       25       20
  35       30       20
Summary Statistics by Treatment
$\bar{x}_1 = 33$     $\bar{x}_2 = 26$     $\bar{x}_3 = 20$
$s_1^2 = 32.5$   $s_2^2 = 17.5$   $s_3^2 = 12.5$
$s_1 = 5.70$     $s_2 = 4.18$     $s_3 = 3.54$
To test whether the true mean times to relief under the three different drugs are equal, we use a five-step procedure:
1. Set up the hypothesis.
2. Select the appropriate test statistic.
3. Compute the test statistic.
Analysis of Variance Table

Source of Variation   Sums of Squares (SS)   Degrees of Freedom (df)   Mean Squares (MS)   F
Between
Within
Total
4. Decision Rule.
5. Conclusion.
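The five steps can be worked by hand for Example 9.3 roughly as follows. This is a sketch, not the lecture's own worked solution: the significance level $\alpha = 0.05$ is an assumption (the notes do not state one), and $\mu_1, \mu_2, \mu_3$ are taken to denote the true mean times to relief for Drugs A, B and C. The sample means and within-group sums of squares are computed directly from the data table above.

```latex
\begin{aligned}
H_0 &: \mu_1 = \mu_2 = \mu_3 \qquad H_1: \text{the means are not all equal} \\
\bar{X}_{.1} &= 33, \quad \bar{X}_{.2} = 26, \quad \bar{X}_{.3} = 20, \quad \bar{X}_{..} = 395/15 \approx 26.33 \\
SS_b &= 5\left[(33 - 26.33)^2 + (26 - 26.33)^2 + (20 - 26.33)^2\right] \approx 423.33 \\
SS_w &= 130 + 70 + 50 = 250 \\
MS_b &= 423.33 / 2 \approx 211.67, \qquad MS_w = 250 / 12 \approx 20.83 \\
F &= MS_b / MS_w \approx 10.16 \\
\text{Decision rule} &: \text{reject } H_0 \text{ if } F \geq F_{0.05;\,2,\,12} = 3.89
\end{aligned}
```

Since $10.16 \geq 3.89$, this reading leads to rejecting $H_0$: under the assumed $\alpha = 0.05$ there is sufficient evidence that the mean times to relief are not all equal.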
This ANOVA procedure involves several calculations (as do many statistical procedures). The calculations are generally performed using statistical software, so we’ll use SAS to evaluate this same example.
SAS CODE:
options ps=62 ls=80;
/* Read the treatment label (trt) and the time to relief in minutes (time) */
data headache;
  input trt $ time;
  cards;
A 30
A 35
A 40
A 25
A 35
B 25
B 20
B 30
B 25
B 30
C 15
C 20
C 25
C 20
C 20
;
run;
/* List the data as read */
proc print;
run;
/* One-way ANOVA of time by treatment */
proc anova;
  class trt;
  model time=trt;
run;
SAS OUTPUT:
The SAS System

Obs    trt    time
  1     A      30
  2     A      35
  3     A      40
  4     A      25
  5     A      35
  6     B      25
  7     B      20
  8     B      30
  9     B      25
 10     B      30
 11     C      15
 12     C      20
 13     C      25
 14     C      20
 15     C      20

The ANOVA Procedure

Class Level Information
Class    Levels    Values
trt           3    A B C

Number of Observations Read    15
Number of Observations Used    15

The ANOVA Procedure

Dependent Variable: time

                             Sum of
Source             DF       Squares    Mean Square    F Value    Pr > F
Model               2   423.3333333    211.6666667      10.16    0.0026
Error              12   250.0000000     20.8333333
Corrected Total    14   673.3333333

R-Square    Coeff Var    Root MSE    time Mean
0.628713     17.33299    4.564355     26.33333

Source    DF      Anova SS    Mean Square    F Value    Pr > F
trt        2   423.3333333    211.6666667      10.16    0.0026
Note: SAS has two procedures for analysis of variance applications. The first is the ANOVA procedure, which is used when the sample sizes are equal, and the second is the GLM (general linear models) procedure, which can be used when the sample sizes are unequal or equal. Since the sample sizes are equal in Example 9.3, we used the ANOVA procedure.
Example 9.5: Testing Difference in Mean Weight Gain Among 4 Different Diets
A study is developed to examine the effects of vitamin and milk supplements on infant weight gain. Four diet plans are considered: Diet A involves a regular diet plus the vitamin supplement, Diet B involves a regular diet plus the special milk formula, Diet C is our control diet (no restrictions), and Diet D involves a regular diet plus the vitamin and the special milk formula. Twenty infants are selected for the investigation and each is randomized to one of the four competing diet programs. The following table displays weight gains, measured in pounds, after 1 month on the assigned diet:
Diet A   Diet B   Diet C   Diet D
 2.0      1.6      1.5      2.1
 1.5      1.9      2.0      2.4
 2.4      2.1      1.8      1.9
 1.9      1.1      1.3      1.8
 2.6      1.7      1.2      2.2
1.) Set up the hypothesis.
2.) Use SAS to compute the ANOVA Table; make a decision and conclusion based on your output.
SAS CODE:
options ps=62 ls=80;
data infants;
  input diet $ gain;
  cards;
A 2.0
A 1.5
A 2.4
A 1.9
A 2.6
B 1.6
B 1.9
B 2.1
B 1.1
B 1.7
C 1.5
C 2.0
C 1.8
C 1.3
C 1.2
D 2.1
D 2.4
D 1.9
D 1.8
D 2.2
;
run;
proc print;
run;
proc glm;
  class diet;
  model gain=diet;
run;
SAS OUTPUT:
The SAS System
Obs    diet    gain
  1     A      2.0
  2     A      1.5
  3     A      2.4
  4     A      1.9
  5     A      2.6
  6     B      1.6
  7     B      1.9
  8     B      2.1
  9     B      1.1
 10     B      1.7
 11     C      1.5
 12     C      2.0
 13     C      1.8
 14     C      1.3
 15     C      1.2
 16     D      2.1
 17     D      2.4
 18     D      1.9
 19     D      1.8
 20     D      2.2

The SAS System
The GLM Procedure

Class Level Information
Class    Levels    Values
diet          4    A B C D

Number of Observations Read    20
Number of Observations Used    20

The SAS System
The GLM Procedure

Dependent Variable: gain

                             Sum of
Source             DF       Squares    Mean Square    F Value    Pr > F
Model               3    1.09400000     0.36466667       2.92    0.0659
Error              16    1.99600000     0.12475000
Corrected Total    19    3.09000000

R-Square    Coeff Var    Root MSE    gain Mean
0.354045     19.09187    0.353200     1.850000

Source    DF     Type I SS    Mean Square    F Value    Pr > F
diet       3    1.09400000     0.36466667       2.92    0.0659

Source    DF   Type III SS    Mean Square    F Value    Pr > F
diet       3    1.09400000     0.36466667       2.92    0.0659
Decision:
Conclusion:
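One way the decision and conclusion might be filled in from the GLM output is sketched below. The significance level $\alpha = 0.05$ is an assumption, since the notes do not specify one, and $\mu_1, \ldots, \mu_4$ are taken to denote the true mean weight gains under Diets A through D.

```latex
\begin{aligned}
H_0 &: \mu_1 = \mu_2 = \mu_3 = \mu_4 \qquad H_1: \text{the means are not all equal} \\
F &= \frac{MS_b}{MS_w} = \frac{0.36467}{0.12475} \approx 2.92, \qquad p = 0.0659 \\
\text{Decision} &: p = 0.0659 > \alpha = 0.05 \;\Rightarrow\; \text{do not reject } H_0 \\
\text{Equivalently} &: F \approx 2.92 < F_{0.05;\,3,\,16} = 3.24
\end{aligned}
```

Under that assumed $\alpha$, the conclusion would read: there is insufficient evidence to say that the mean weight gains differ among the four diets.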
Note: We always make conclusions based on the alternative hypothesis. Whether we reject or do not reject the null, we will always conclude on the alternative with “sufficient” or “insufficient” evidence to say that the means are not equal.
Fixed Versus Random Effects Models (Section 9.3, Page 424)
There are two types of analysis of variance applications: fixed effects models and random effects models.
Fixed Effects Models:
Random Effects Models:
Note: We will use only fixed effects models in the upcoming sections; the formulas presented apply only to fixed effects models.
Evaluating Treatment Effects (Section 9.4, Page 424)
This section applies only when the decision is “reject $H_0$”. If an ANOVA has been performed and a significant difference in means has been established, we then want to determine how much of the variation in the data is due to the treatments.
We use the following statistic to find the ratio of variation due to the treatments ($SS_b$) to the total variation:
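The ratio described above is commonly written as $\eta^2$; that symbol is an assumption here, since the formula itself is left blank in the notes. Evaluated with the sums of squares from Example 9.3, it would look like this:

```latex
\eta^2 = \frac{SS_b}{SS_{total}} = \frac{423.33}{673.33} \approx 0.629
```

Read this way, roughly 63% of the variation in time to relief is attributable to the drug. The value agrees with the R-Square entry (0.628713) in the SAS output for Example 9.3.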
Multiple Comparisons Procedures (Section 9.5, Page 425)
Now that we know when to reject or not reject the null hypothesis, let’s consider some new comparisons. Let’s say we decide to reject the null hypothesis, and conclude that not all means are equal. What if we wanted to know, specifically, which means aren’t equal? For example, in Example 9.3, we wanted to test whether the mean times to relief of three different headache medications differed:
And we came up with the decision to reject the null hypothesis. So, we are saying that there is a significant difference between at least two of the headache medications. Suppose we are particularly interested in comparing only the first two medications:
Or the first and third:
Tests of this type are called pairwise comparisons, since they involve pairs of treatment means.
It is, however, possible to construct more complicated comparisons. For example, compare the mean time to relief for patients assigned to either Drug A or B to the mean time to relief for patients assigned to Drug C.
Both pairwise (two-at-a-time) and more complicated comparisons are generally called contrasts.
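Written out in symbols, the comparisons described above could take the following form. This is a sketch under the assumption that $\mu_1$, $\mu_2$ and $\mu_3$ denote the true mean times to relief for Drugs A, B and C; the notes leave these hypotheses blank.

```latex
\begin{aligned}
\text{A versus B:} \quad & H_0: \mu_1 = \mu_2 \quad \text{vs.} \quad H_1: \mu_1 \neq \mu_2 \\
\text{A versus C:} \quad & H_0: \mu_1 = \mu_3 \quad \text{vs.} \quad H_1: \mu_1 \neq \mu_3 \\
\text{A or B versus C:} \quad & H_0: \tfrac{1}{2}(\mu_1 + \mu_2) = \mu_3 \quad \text{vs.} \quad H_1: \tfrac{1}{2}(\mu_1 + \mu_2) \neq \mu_3
\end{aligned}
```

The first two are pairwise comparisons; the third averages two treatment means and so is a more complicated contrast.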
There are a number of statistical procedures for handling these applications, which are called multiple comparison procedures (MCPs). For pairwise (two-at-a-time) comparisons, we will be looking at two popular multiple comparison procedures, the Scheffe and Tukey procedures. Next class, we’ll look at a different method for more complicated contrasts.
Remember: These MCPs are only used when we’ve come up with the decision to “reject $H_0$” in our ANOVA and a conclusion that the treatment means are significantly different.
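In SAS, both procedures can be requested from the same ANOVA step. The following is a minimal sketch for the Example 9.3 data, not output reproduced from the lecture: the data set name headache and the variables trt and time follow the earlier SAS code, and alpha=0.05 is an assumed significance level (it is also the SAS default for the MEANS statement).

```sas
/* Example 9.3 data: time to relief (minutes) by drug */
data headache;
  input trt $ time;
  cards;
A 30
A 35
A 40
A 25
A 35
B 25
B 20
B 30
B 25
B 30
C 15
C 20
C 25
C 20
C 20
;
run;

/* One-way ANOVA followed by Scheffe and Tukey pairwise comparisons */
/* alpha=0.05 is an assumption; the lecture does not state a level  */
proc anova data=headache;
  class trt;
  model time=trt;
  means trt / scheffe tukey alpha=0.05;
run;
quit;
```

The MEANS statement is also available in PROC GLM, which would be the choice when the sample sizes are unequal.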
The Scheffe Procedure
The Scheffe procedure is a multiple comparison procedure that controls the familywise error rate. This means that P(Type I error) is controlled (and equal to $\alpha$) over the family of all comparisons.
Recall: Type I error?
Note: The Scheffe procedure is most commonly used when more than a few contrasts are of interest; however, it has lower statistical power than competing procedures.
Outline of the Scheffe Procedure:
1. Set up the hypotheses:
2. Compute the test statistic:
3. Decision Rule:
4. Conclusion. (We should all know how to write a conclusion by now!)
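For reference, the outline above is often filled in as follows for a pairwise comparison of treatments $i$ and $j$. This is the usual textbook form of the Scheffe test, offered as a sketch; the notation in the lecture's own blanks may differ. Here $s_w^2 = MS_w$ is taken from the ANOVA table, $k$ is the number of treatments and $N$ the total sample size.

```latex
\begin{aligned}
\text{1. Hypotheses:} \quad & H_0: \mu_i = \mu_j \quad \text{vs.} \quad H_1: \mu_i \neq \mu_j \\
\text{2. Test statistic:} \quad & F = \frac{(\bar{X}_{.i} - \bar{X}_{.j})^2}{s_w^2 \left( \dfrac{1}{n_i} + \dfrac{1}{n_j} \right)} \\
\text{3. Decision rule:} \quad & \text{reject } H_0 \text{ if } F \geq (k-1)\, F_{\alpha;\, k-1,\, N-k}
\end{aligned}
```

Multiplying the ordinary critical value by $(k-1)$ is what makes the procedure conservative: the same threshold protects every contrast in the family, not only the pairwise ones.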
Okay, let’s do an example:
Example 9.7: Recall Example 9.3;
We compared the mean time to relief of headache pain under 3 competing medications and had the following hypothesis:
Analysis of Variance Table

Source of Variation   Sums of Squares (SS)   Degrees of Freedom (df)   Mean Squares (MS)   F
Between               423.329                2                         211.66              10.1598
Within                250                    12                        20.833
Total                 673.329                14
Since we don’t know which of the three treatment means differ, we now wish to compare the medications two at a time (i.e., pairwise comparisons).
Summary Statistics by Treatment
$n_1 = 5$          $n_2 = 5$          $n_3 = 5$
$\bar{x}_1 = 33$   $\bar{x}_2 = 26$   $\bar{x}_3 = 20$
$s_1 = 5.7$        $s_2 = 4.2$        $s_3 = 3.5$
Drug A versus Drug B:
1. Hypothesis:
2. Test Statistic:
3. Decision:
4. Conclusion:
Drug A versus Drug C:
1. Hypothesis:
2. Test Statistic:
3. Decision:
4. Conclusion:
Drug B versus Drug C:
1. Hypothesis:
2. Test Statistic:
3. Decision:
4. Conclusion:
Therefore, it is shown through the Scheffe comparison procedure that $\mu_1 \neq \mu_3$.
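The three comparisons can be checked numerically. The sketch below assumes $\alpha = 0.05$ (not stated in the notes) and uses the usual pairwise Scheffe statistic $F = (\bar{X}_{.i} - \bar{X}_{.j})^2 / \left[ s_w^2 (1/n_i + 1/n_j) \right]$ with $s_w^2 = 20.833$ from the ANOVA table and $n_i = n_j = 5$, so the denominator is $20.833 \times 0.4 \approx 8.333$ in every case.

```latex
\begin{aligned}
\text{Critical value:} \quad & (k-1)\, F_{0.05;\,2,\,12} = 2 \times 3.89 = 7.78 \\
\text{A versus B:} \quad & F = \frac{(33 - 26)^2}{8.333} = \frac{49}{8.333} \approx 5.88 < 7.78 \;\Rightarrow\; \text{do not reject } H_0 \\
\text{A versus C:} \quad & F = \frac{(33 - 20)^2}{8.333} = \frac{169}{8.333} \approx 20.28 \geq 7.78 \;\Rightarrow\; \text{reject } H_0 \\
\text{B versus C:} \quad & F = \frac{(26 - 20)^2}{8.333} = \frac{36}{8.333} \approx 4.32 < 7.78 \;\Rightarrow\; \text{do not reject } H_0
\end{aligned}
```

Only the Drug A versus Drug C comparison exceeds the critical value, which is consistent with the stated result that $\mu_1 \neq \mu_3$.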