TRANSCRIPT
Analysis of Variance
Chapter 14
Introduction

Analysis of variance helps compare two or more populations of quantitative data. Specifically, we are interested in the relationships among the population means (are they equal or not). The procedure works by analyzing the sample variance.
14.1 One-Way Analysis of Variance

The analysis of variance is a procedure that tests to determine whether differences exist among two or more population means.
To do this, the technique analyzes the sample variances.
One-Way Analysis of Variance: Example 1

- An apple juice manufacturer is planning to develop a new product: a liquid concentrate.
- The marketing manager has to decide how to market the new product.
- Three strategies are considered:
  - Emphasize convenience of using the product.
  - Emphasize the quality of the product.
  - Emphasize the product's low price.
One-Way Analysis of Variance: Example 1 - continued

- An experiment was conducted as follows:
  - In three cities an advertisement campaign was launched.
  - In each city only one of the three characteristics (convenience, quality, and price) was emphasized.
  - The weekly sales were recorded for twenty weeks following the beginning of the campaigns.
One-Way Analysis of Variance: The Data

Weekly sales (see file Xm1.xls):

Convnce  Quality  Price
529      804      672
658      630      531
793      774      443
514      717      596
663      679      602
719      604      502
711      620      659
606      697      689
461      706      675
529      615      512
498      492      691
663      719      733
604      787      698
495      699      776
485      572      561
557      523      572
353      584      469
557      634      581
542      580      679
614      624      532
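The deck analyzes these data with Excel later on; the same one-way ANOVA can be sketched in Python (assuming SciPy is available). The three lists are transcribed from the table above:

```python
# One-way ANOVA on the weekly-sales data (sketch using scipy.stats.f_oneway)
from scipy.stats import f_oneway

convenience = [529, 658, 793, 514, 663, 719, 711, 606, 461, 529,
               498, 663, 604, 495, 485, 557, 353, 557, 542, 614]
quality     = [804, 630, 774, 717, 679, 604, 620, 697, 706, 615,
               492, 719, 787, 699, 572, 523, 584, 634, 580, 624]
price       = [672, 531, 443, 596, 602, 502, 659, 689, 675, 512,
               691, 733, 698, 776, 561, 572, 469, 581, 679, 532]

f_stat, p_value = f_oneway(convenience, quality, price)
print(f"F = {f_stat:.2f}, p-value = {p_value:.4f}")  # F = 3.23, p-value = 0.0468
```

The F statistic and p-value match the Excel "Anova: Single Factor" output shown later in the chapter.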
One-Way Analysis of Variance: Solution

- The data are quantitative.
- Our problem objective is to compare sales in three cities.
- We hypothesize on the relationships among the three mean weekly sales:
Defining the Hypotheses

- Solution

H0: μ1 = μ2 = μ3
H1: At least two means differ

To build the statistic needed to test the hypotheses, use the following notation:
Notation

Independent samples are drawn from k populations (treatments):

Sample 1: x11, x12, ..., x(n1,1); sample size n1; sample mean x̄1
Sample 2: x21, x22, ..., x(n2,2); sample size n2; sample mean x̄2
...
Sample k: xk1, xk2, ..., x(nk,k); sample size nk; sample mean x̄k

Here x11 is the first observation of the first sample, x22 is the second observation of the second sample, and so on.
X is the "response variable"; its values are called "responses".
Terminology

In the context of this problem...
- Response variable - weekly sales
- Responses - actual sale values
- Experimental unit - weeks in the three cities when we record sales figures
- Factor - the criterion by which we classify the populations (the treatments). In this problem the factor is the marketing strategy.
- Factor levels - the population (treatment) names. In this problem the factor levels are the marketing strategies.
The Rationale of the Test Statistic

Two types of variability are employed when testing for the equality of the population means.

Graphical demonstration: employing two types of variability.
[Figure: two panels of dot plots for Treatments 1-3. In both panels the sample means are x̄1 = 10, x̄2 = 15, x̄3 = 20; the panels differ only in the spread of the observations around those means.]

A small variability within the samples makes it easier to draw a conclusion about the population means.
The sample means are the same as before, but the larger within-sample variability makes it harder to draw a conclusion about the population means.
The Rationale Behind the Test Statistic - I

If the null hypothesis is true, we would expect all the sample means to be close to one another (and, as a result, to the grand mean).
If the alternative hypothesis is true, at least some of the sample means would reside away from one another.
Thus, we measure variability among sample means.
Variability Among Sample Means

The variability among the sample means is measured as the sum of squared distances between each mean and the grand mean.
This sum is called the Sum of Squares for Treatments - SST. In our example, treatments are represented by the different advertising strategies.
Sum of Squares for Treatments (SST)

SST = Σ_{j=1}^{k} n_j (x̄_j - x̄)²

where there are k treatments, n_j is the size of sample j, x̄_j is the mean of sample j, and x̄ is the grand mean.

Note: When the sample means are close to one another, their distance from the grand mean is small, leading to a small SST. Thus, a large SST indicates large variation among sample means, which supports H1.
Sum of Squares for Treatments (SST)

Solution - continued. Calculate SST.

The sample means are x̄1 = 577.55, x̄2 = 653.00, x̄3 = 608.65.

The grand mean is calculated by

x̄ = (n1·x̄1 + n2·x̄2 + ... + nk·x̄k) / (n1 + n2 + ... + nk) = 613.07

SST = Σ_{j=1}^{3} n_j (x̄_j - x̄)²
    = 20(577.55 - 613.07)² + 20(653.00 - 613.07)² + 20(608.65 - 613.07)²
    = 57,512.23
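The grand-mean and SST arithmetic above can be sketched directly from the sample sizes and sample means:

```python
# SST computed from the sample means (values taken from the example)
n = [20, 20, 20]                  # sample sizes
xbar = [577.55, 653.00, 608.65]   # sample means

# Grand mean: weighted average of the sample means
grand = sum(nj * xj for nj, xj in zip(n, xbar)) / sum(n)

# SST = sum over treatments of n_j * (xbar_j - grand)^2
sst = sum(nj * (xj - grand) ** 2 for nj, xj in zip(n, xbar))
print(f"grand mean = {grand:.2f}, SST = {sst:.2f}")  # 613.07, 57512.23
```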
Sum of Squares for Treatments (SST)

Is SST = 57,512.23 large enough to favor H1? See next.
The Rationale Behind the Test Statistic - II

Large variability within the samples weakens the "ability" of the sample means to represent their corresponding population means. Therefore, even though sample means may markedly differ from one another, a large SST must be judged relative to the "within samples variability".
Within-Samples Variability

The variability within samples is measured by adding all the squared distances between observations and their sample means.
This sum is called the Sum of Squares for Error - SSE. In our example this is the sum of all squared differences between sales in city j and the sample mean of city j (over all three cities).
Sum of Squares for Errors (SSE)

Solution - continued. Calculate SSE.

SSE = Σ_{j=1}^{k} Σ_{i=1}^{n_j} (x_ij - x̄_j)²

The sample variances are s1² = 10,775.00, s2² = 7,238.11, s3² = 8,670.24, so

SSE = (n1 - 1)s1² + (n2 - 1)s2² + (n3 - 1)s3²
    = (20 - 1)(10,775.00) + (20 - 1)(7,238.11) + (20 - 1)(8,670.24)
    ≈ 506,983.50
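SSE can be checked numerically in two equivalent ways: from the raw deviations, and from the sample variances as above (data transcribed from the table earlier in the chapter):

```python
# SSE as the pooled within-sample sum of squares (sketch)
import numpy as np

groups = [
    np.array([529, 658, 793, 514, 663, 719, 711, 606, 461, 529,
              498, 663, 604, 495, 485, 557, 353, 557, 542, 614]),  # convenience
    np.array([804, 630, 774, 717, 679, 604, 620, 697, 706, 615,
              492, 719, 787, 699, 572, 523, 584, 634, 580, 624]),  # quality
    np.array([672, 531, 443, 596, 602, 502, 659, 689, 675, 512,
              691, 733, 698, 776, 561, 572, 469, 581, 679, 532]),  # price
]

# Direct definition: squared deviations from each group mean
sse = sum(((g - g.mean()) ** 2).sum() for g in groups)

# Equivalent form: (n_j - 1) * s_j^2 with the sample variances
sse_alt = sum((len(g) - 1) * g.var(ddof=1) for g in groups)

print(f"SSE = {sse:.2f}")  # ~506,983.50
```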
Sum of Squares for Errors (SSE)

- Note: If SST is small relative to SSE, we can't infer that treatments are the cause of different average performance.
- Is SST = 57,512.23 large enough relative to SSE = 506,983.50 to argue that the means ARE different?
The Mean Sum of Squares

To perform the test we need to calculate the mean sums of squares as follows:

Calculation of MST - Mean Square for Treatments:
MST = SST / (k - 1) = 57,512.23 / (3 - 1) = 28,756.12

Calculation of MSE - Mean Square for Error:
MSE = SSE / (n - k) = 506,983.50 / (60 - 3) = 8,894.45
Calculation of the Test Statistic

F = MST / MSE = 28,756.12 / 8,894.45 = 3.23

with the following degrees of freedom: v1 = k - 1 and v2 = n - k.

We assume:
1. The populations tested are normally distributed.
2. The variances of all the populations tested are equal.

(For honors class: testing equal variances; testing normality.)
The F Test Rejection Region

And finally the hypothesis test:

H0: μ1 = μ2 = ... = μk
Ha: At least two means differ

Test statistic: F = MST / MSE
Rejection region: F > F_{α, k-1, n-k}
The F Test

H0: μ1 = μ2 = μ3
H1: At least two means differ

Test statistic: F = MST / MSE = 28,756.12 / 8,894.45 = 3.23
Rejection region: F > F_{α, k-1, n-k} = F_{.05, 3-1, 60-3} ≈ 3.15

Since 3.23 > 3.15, there is sufficient evidence to reject H0 in favor of H1 and argue that at least one of the mean sales is different from the others.
The F Test p-Value

[Figure: F density with 2 and 57 degrees of freedom; the area to the right of 3.23 is shaded.]

p-value = P(F > 3.23) = .0467

Use Excel to find the p-value: =FDIST(3.23,2,57) = .0467
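Excel's FDIST is the upper-tail probability of the F distribution; a sketch of the same calculation with SciPy:

```python
# p-value and critical value for the F test (sketch with scipy.stats.f)
from scipy.stats import f

f_stat, df1, df2 = 3.23, 2, 57
p_value = f.sf(f_stat, df1, df2)   # survival function = P(F > 3.23)
critical = f.ppf(0.95, df1, df2)   # F_{.05,2,57} for the rejection region
print(f"p-value = {p_value:.4f}, F crit = {critical:.2f}")
```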
Excel Single Factor Printout (see file Xm1.xls)

Anova: Single Factor

SUMMARY
Groups   Count  Sum    Average  Variance
Convnce  20     11551  577.55   10774.997
Quality  20     13060  653      7238.1053
Price    20     12173  608.65   8670.2395

ANOVA
Source of Variation  SS         df  MS         F          P-value   F crit
Between Groups       57512.233  2   28756.117  3.2330414  0.046773  3.1588456
Within Groups        506983.5   57  8894.4474
Total                564495.73  59

SS(Total) = SST + SSE
14.2 Multiple Comparisons

If the single-factor ANOVA leads us to conclude at least two means differ, we often want to know which ones. Two means are considered different if the difference between the corresponding sample means is larger than a critical number. The larger sample mean is believed to be associated with a larger population mean.
Fisher's Least Significant Difference

Fisher's Least Significant Difference (LSD) method is one procedure designed to determine which mean difference is significant. The hypotheses are:

H0: |μi - μj| = 0
Ha: |μi - μj| ≠ 0

The statistic: x̄i - x̄j

This method builds on the equal-variance t-test of the difference between two means. The test statistic is improved by using MSE rather than s_p².
Fisher's Least Significant Difference

We can conclude that μi and μj differ (at the α% significance level) if |x̄i - x̄j| > LSD, where

LSD = t_{α/2} √( MSE (1/n_i + 1/n_j) ),   d.f. = n - k
Experimentwise Type I Error Rate (α_E) (the effective Type I error)

Fisher's method may result in an increased probability of committing a Type I error. The probability of committing at least one Type I error in a series of C hypothesis tests, each at the α level of significance, increases too. This probability is called the experimentwise Type I error rate (α_E). It is calculated by

α_E = 1 - (1 - α)^C

where C is the number of pairwise comparisons (C = k(k-1)/2, and k is the number of treatments). The Bonferroni adjustment determines the required Type I error probability per pairwise comparison (α) to secure a pre-determined overall α_E.
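For the three treatments in Example 1, the experimentwise error-rate formula works out as follows:

```python
# Experimentwise Type I error rate for k = 3 treatments, alpha = .05 per test
alpha, k = 0.05, 3
C = k * (k - 1) // 2                 # number of pairwise comparisons
alpha_E = 1 - (1 - alpha) ** C       # P(at least one Type I error in C tests)
print(f"C = {C}, alpha_E = {alpha_E:.4f}")  # C = 3, alpha_E = 0.1426
```

So testing all three pairs at α = .05 each pushes the effective error rate above 14%, which is what motivates the Bonferroni adjustment below.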
The Bonferroni Adjustment

The procedure:
- Compute the number of pairwise comparisons C [C = k(k-1)/2], where k is the number of populations/treatments.
- Set α = α_E/C, where the value of α_E is predetermined.
- We can conclude that μi and μj differ (at the α_E/C significance level) if

|x̄i - x̄j| > t_{α_E/(2C)} √( MSE (1/n_i + 1/n_j) ),   d.f. = n - k
The Fisher and Bonferroni Methods

Example 1 - continued
- Rank the effectiveness of the marketing strategies (based on mean weekly sales).
- Use Fisher's method and the Bonferroni adjustment method.

Solution (Fisher's method)
- The sample mean sales were 577.55, 653.0, 608.65.
- Then,

|x̄1 - x̄2| = |577.55 - 653.0| = 75.45
|x̄1 - x̄3| = |577.55 - 608.65| = 31.10
|x̄2 - x̄3| = |653.0 - 608.65| = 44.35

LSD = t_{.05/2} √( MSE (1/n1 + 1/n2) ) = 2.002 √( 8894 (1/20 + 1/20) ) ≈ 59.72

The significant difference is between μ1 and μ2.
The Fisher and Bonferroni Methods

Solution (the Bonferroni adjustment)
- We calculate C = k(k-1)/2 = 3(2)/2 = 3.
- We set α = .05/3 = .0167, thus t_{.0167/2, 60-3} = 2.467 (Excel).

t √( MSE (1/n_i + 1/n_j) ) = 2.467 √( 8894 (1/20 + 1/20) ) ≈ 73.54

|x̄1 - x̄2| = 75.45, |x̄1 - x̄3| = 31.10, |x̄2 - x̄3| = 44.35

Again, the significant difference is between μ1 and μ2.
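Both critical values above can be reproduced from the t distribution (a sketch using SciPy's quantile function in place of Excel's TINV):

```python
# Fisher LSD and Bonferroni-adjusted critical values (sketch)
from math import sqrt
from scipy.stats import t

mse, n_i, n_j, df = 8894.447, 20, 20, 57   # MSE and d.f. = n - k from the ANOVA
se = sqrt(mse * (1 / n_i + 1 / n_j))       # common standard-error term

lsd = t.ppf(1 - 0.05 / 2, df) * se         # Fisher's LSD at alpha = .05
C = 3                                      # pairwise comparisons
bonf = t.ppf(1 - (0.05 / C) / 2, df) * se  # Bonferroni-adjusted critical value

print(f"LSD = {lsd:.2f}, Bonferroni = {bonf:.2f}")
# only |577.55 - 653.0| = 75.45 exceeds both values
```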
The Tukey Multiple Comparisons

The test procedure:
- Find a critical number ω as follows:

ω = q_α(k, ν) √( MSE / n_g )

where
k = the number of samples
ν = degrees of freedom = n - k
n_g = number of observations per sample (recall, all the sample sizes are the same)
α = significance level
q_α(k, ν) = a critical value obtained from the studentized range table

If the sample sizes are not the same but do not differ much from one another, we can use the harmonic mean of the sample sizes for n_g:

n_g = k / (1/n1 + 1/n2 + ... + 1/nk)
The Tukey Multiple Comparisons

- Select a pair of means. Calculate the difference between the larger and the smaller mean, x̄max - x̄min.
- If x̄max - x̄min > ω, there is sufficient evidence to conclude that μmax > μmin.
- Repeat this procedure for each pair of samples. Rank the means if possible.
The Tukey Multiple Comparisons

Example 1 - continued. We had three populations (three marketing strategies): k = 3.
Sample sizes were equal: n1 = n2 = n3 = 20; ν = n - k = 60 - 3 = 57; MSE = 8894.

ω = q_.05(k, ν) √( MSE / n_g ) = q_.05(3, 57) √( 8894 / 20 ) ≈ 3.40 √444.7 = 71.70

(Take q_.05(3, 60) from the table.)

Population       Mean
Sales - City 1   577.55
Sales - City 2   653
Sales - City 3   608.65

City 1 vs. City 2: 653 - 577.55 = 75.45
City 1 vs. City 3: 608.65 - 577.55 = 31.10
City 2 vs. City 3: 653 - 608.65 = 44.35

Only the City 1 vs. City 2 difference (75.45) exceeds ω = 71.70.
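The studentized-range critical value q can also be obtained in software instead of a printed table; a sketch using SciPy's `studentized_range` distribution (available in SciPy 1.7+):

```python
# Tukey's omega via the studentized-range distribution (sketch)
from math import sqrt
from scipy.stats import studentized_range

k, nu, mse, n_g = 3, 57, 8894.447, 20
q = studentized_range.ppf(0.95, k, nu)   # q_.05(3, 57)
omega = q * sqrt(mse / n_g)
print(f"q = {q:.3f}, omega = {omega:.2f}")
# only the City 1 vs. City 2 difference (75.45) exceeds omega
```

The exact q for ν = 57 differs slightly from the tabulated q_.05(3, 60) = 3.40, so ω lands near the 71.70 used in the slides.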
Excel - Tukey and Fisher LSD Method (Xm15-1.xls)

Multiple Comparisons (Tukey): Omega = 71.7007033950796

Fisher's LSD (α = .05):
Variable  Variable  Difference  LSD
1         2         -75.45      59.72067
1         3         -31.1       59.72067
2         3         44.35       59.72067

Bonferroni adjustment (α = .05/3 = .0167):
Variable  Variable  Difference  LSD
1         2         -75.45      73.54176
1         3         -31.1       73.54176
2         3         44.35       73.54176
14.3 Randomized Blocks Design

The purpose of designing a randomized block experiment is to reduce the within-treatments variation, thus increasing the relative amount of among-treatment variation.
This helps in detecting differences among the treatment means more easily.
Randomized Blocks

[Figure: experimental units arranged in blocks (dark blues, bluish purples, greyish pinks), with each block receiving Treatments 1-4.]
Partitioning the Total Variability

The sum of squares total is partitioned into three sources of variation:
- Treatments
- Blocks
- Within samples (error)

SS(Total) = SST + SSB + SSE

where SST is the sum of squares for treatments, SSB is the sum of squares for blocks, and SSE is the sum of squares for error.

Recall: for the independent samples design we have SS(Total) = SST + SSE.
The Mean Sum of Squares

To perform hypothesis tests for treatments and blocks we need:
- Mean square for treatments: MST = SST / (k - 1)
- Mean square for blocks: MSB = SSB / (b - 1)
- Mean square for error: MSE = SSE / ((k - 1)(b - 1))
The test statistics for the randomized block design ANOVA:

Test statistic for treatments: F = MST / MSE
Test statistic for blocks: F = MSB / MSE
The F Test Rejection Region

Testing the mean responses for treatments: F > F_{α, k-1, (k-1)(b-1)}
Testing the mean responses for blocks: F > F_{α, b-1, (k-1)(b-1)}
Randomized Blocks ANOVA - Example

Example 2
- Are there differences in the effectiveness of cholesterol reduction drugs?
- To answer this question the following experiment was organized:
  - 25 groups of men with high cholesterol were matched by age and weight. Each group consisted of 4 men.
  - Each person in a group received a different drug.
  - The cholesterol level reduction in two months was recorded.
- Can we infer from the data in Xm2.xls that there are differences in mean cholesterol reduction among the four drugs?
Randomized Blocks ANOVA - Example

Solution
- Each drug can be considered a treatment.
- Each group of 4 records can be blocked, because they are matched by age and weight.
- This procedure eliminates the variability in cholesterol reduction related to different combinations of age and weight.
- This helps detect differences in the mean cholesterol reduction attributed to the different drugs.
Randomized Blocks ANOVA - Example

ANOVA
Source of Variation         SS        df  MS        F         P-value   F crit
Rows (blocks, b-1)          3848.657  24  160.3607  10.10537  9.7E-15   1.669456
Columns (treatments, k-1)   195.9547  3   65.31823  4.116127  0.009418  2.731809
Error                       1142.558  72  15.86886
Total                       5187.169  99

The F statistics are MST/MSE for treatments ("Columns") and MSB/MSE for blocks ("Rows").

Conclusion: At the 5% significance level there is sufficient evidence to infer that the mean cholesterol reduction gained by at least two drugs is different.
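The F statistics and critical values in the table above can be reproduced from its SS and df entries (a sketch; the raw Xm2.xls data are not shown here):

```python
# Randomized-blocks F statistics rebuilt from the printed SS and df values
from scipy.stats import f

ss_blocks, df_blocks = 3848.657, 24    # rows (blocks): b - 1 = 24
ss_treat,  df_treat  = 195.9547, 3     # columns (treatments): k - 1 = 3
ss_error,  df_error  = 1142.558, 72    # (k - 1)(b - 1) = 72

msb = ss_blocks / df_blocks            # mean square for blocks
mst = ss_treat / df_treat              # mean square for treatments
mse = ss_error / df_error              # mean square for error

f_treat = mst / mse                    # compare to F_{.05,3,72} = 2.73
f_blocks = msb / mse                   # compare to F_{.05,24,72} = 1.67
print(f"F(treatments) = {f_treat:.2f}, crit = {f.ppf(0.95, df_treat, df_error):.2f}")
print(f"F(blocks) = {f_blocks:.2f}, crit = {f.ppf(0.95, df_blocks, df_error):.2f}")
```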
14.2 Multiple Comparisons

The rejection region:

|x̄i - x̄j| > t_{α/2, n-k} √( MSE (1/n_i + 1/n_j) )
Testing the Differences - Example Continued

Calculating LSD: MSE = 8894.44; n1 = n2 = n3 = 20; t_{.05/2, 60-3} = TINV(.05, 57) = 2.002
LSD = (2.002)[8894.44(1/20 + 1/20)]^.5 = 59.72

|x̄1 - x̄2| = 75.45 > 59.72
|x̄1 - x̄3| = 31.10 < 59.72
|x̄2 - x̄3| = 44.35 < 59.72