comparisons of several multivariate means

55
Lecture #6 - 9/28/2005 Slide 1 of 54 Comparisons of Several Multivariate Means Lecture 6 September 28, 2005 Multivariate Analysis

Upload: lecong

Post on 31-Dec-2016

246 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Comparisons of Several Multivariate Means

Lecture #6 - 9/28/2005 Slide 1 of 54

Comparisons of Several Multivariate Means

Lecture 6September 28, 2005Multivariate Analysis

Page 2: Comparisons of Several Multivariate Means

Overview

Today’s Lecture

Schedule Change

Tests for Two Mean Vectors

1-Way ANOVA

Wrapping Up

Lecture #6 - 9/28/2005 Slide 2 of 54

Today’s Lecture

■ Comparisons of Two Mean Vectors (Chapter 5).

◆ Paired comparisons/repeated measures for two timepoints.

◆ Comparing mean vectors from two populations.

■ Review of univariate ANOVA in preparation for next week’slecture.

■ Note: Homework # 4 due in my mailbox on Thursday (by5pm).

Page 3: Comparisons of Several Multivariate Means

Overview

Today’s Lecture

Schedule Change

Tests for Two Mean Vectors

1-Way ANOVA

Wrapping Up

Lecture #6 - 9/28/2005 Slide 3 of 54

Schedule Change

■ Because of a couple of developments, we will have a slightchange in our schedule:

Today: We cover one part of several mean comparisons(Chapter 6, Sections 1-3).

10/5: We cover MANOVA in all its “glory.”

10/12: We cover multivariate multiple regression (and themidterm is handed out, making it due 10/26).

12/7: No class (I will be at the National Council onResponsible Gaming conference in Las Vegas).

■ I will amend the topics list as the course progresses.

Page 4: Comparisons of Several Multivariate Means

Overview

Tests for Two Mean Vectors

Paired Comparisons

Multivariate Comparisons

Repeated Measures

Comparing Two Populations

Univariate t-tests

Multivariate t-tests

1-Way ANOVA

Wrapping Up

Lecture #6 - 9/28/2005 Slide 4 of 54

Tests for Two Mean Vectors

■ We begin our discussion of multiple multivariate meanvector comparisons by generalizing our results from theprevious lecture.

◆ Using T 2 for comparisons of two mean vectors.

■ Differing types of comparisons are all incorporated into ageneral framework for performing statistical hypothesistests:

◆ Paired comparisons.

◆ Repeated measures.

■ Much like the univariate t-test, we will also see howcomparisons are performed when we have samples fromtwo different populations.

Page 5: Comparisons of Several Multivariate Means

Overview

Tests for Two Mean Vectors

Paired Comparisons

Multivariate Comparisons

Repeated Measures

Comparing Two Populations

Univariate t-tests

Multivariate t-tests

1-Way ANOVA

Wrapping Up

Lecture #6 - 9/28/2005 Slide 5 of 54

Paired Comparisons

■ A research design that can be used to investigate theeffects of a single treatment is a paired-comparison study.

■ A basic version of a paired comparison study involves thecreation of two groups (treatment one and treatment two),where individuals from each group are matched on a set ofvariables.

■ Individuals in the first treatment group are given the firsttreatment, individuals in the second treatment group aregiven the second treatment.

■ Data is collected from the two groups.

■ The null hypothesis is that there are not differencesbetween the groups.

■ Paired comparison tests are needed to assess the efficacyof a treatment or (as we will see) multiple treatments.

Page 6: Comparisons of Several Multivariate Means

Overview

Tests for Two Mean Vectors

Paired Comparisons

Multivariate Comparisons

Repeated Measures

Comparing Two Populations

Univariate t-tests

Multivariate t-tests

1-Way ANOVA

Wrapping Up

Lecture #6 - 9/28/2005 Slide 6 of 54

Univariate Statistics

■ If a single response is elicited from individuals in bothgroups, univariate methods can be used to test fordifferences in the means of both conditions.

■ One way to investigate the difference between groups is tocreate a difference score:

Dj = Xj1 − Xj2

Where:

◆ Xj1 is the response to treatment one for individual j.

◆ Xj2 is the response to treatment two for “matched”individual j.

◆ (Xj1, Xj2) are measurements recorded onpaired/matched/like units j.

Page 7: Comparisons of Several Multivariate Means

Overview

Tests for Two Mean Vectors

Paired Comparisons

Multivariate Comparisons

Repeated Measures

Comparing Two Populations

Univariate t-tests

Multivariate t-tests

1-Way ANOVA

Wrapping Up

Lecture #6 - 9/28/2005 Slide 7 of 54

Univariate Statistics

■ If a single response is elicited from individuals in bothgroups, univariate methods can be used to test fordifferences in the means of both conditions.

■ One way to investigate the difference between groups is tocreate a difference score:

Dj = Xj1 − Xj2

Where:

◆ Xj1 is the response to treatment one for individual j.

◆ Xj2 is the response to treatment two for “matched”individual j.

◆ (Xj1, Xj2) are measurements recorded onpaired/matched/like units j.

Page 8: Comparisons of Several Multivariate Means

Overview

Tests for Two Mean Vectors

Paired Comparisons

Multivariate Comparisons

Repeated Measures

Comparing Two Populations

Univariate t-tests

Multivariate t-tests

1-Way ANOVA

Wrapping Up

Lecture #6 - 9/28/2005 Slide 8 of 54

Univariate Statistics

■ If the differences, D, are from a normal distribution withmean δ and variance σ2

d, or N(δ, σ2d), then a hypothesis test

can be formed based on δ:

H0 : δ = 0

H1 : δ 6= 0

■ This hypothesis test can be evaluated using the teststatistic:

t =D − δ

sd/√

N

■ t has n − 1 df (where n is the number of matched pairs).

Page 9: Comparisons of Several Multivariate Means

Overview

Tests for Two Mean Vectors

Paired Comparisons

Multivariate Comparisons

Repeated Measures

Comparing Two Populations

Univariate t-tests

Multivariate t-tests

1-Way ANOVA

Wrapping Up

Lecture #6 - 9/28/2005 Slide 9 of 54

Univariate Statistics

■ Of course, the 100 × (1 − α)% confidence interval for δ isgiven by:

d − tn−1(α/2)sd√n≤ δ ≤ d + tn−1(α/2)

sd√n

Page 10: Comparisons of Several Multivariate Means

Overview

Tests for Two Mean Vectors

Paired Comparisons

Multivariate Comparisons

Repeated Measures

Comparing Two Populations

Univariate t-tests

Multivariate t-tests

1-Way ANOVA

Wrapping Up

Lecture #6 - 9/28/2005 Slide 10 of 54

Multivariate Paired Comparisons

■ As you could probably guess, the multivariate analog of thepaired-comparison t-test is a straight forward extension ofthe methodology.

■ The research design of the paired comparison is virtuallythe same as previously discussed, however, multipleobservations are taken from each group:

X1j1 = variable 1 under treatment 1X1j2 = variable 2 under treatment 1

...X1jp = variable p under treatment 1

X2j1 = variable 1 under treatment 2X2j2 = variable 2 under treatment 2

...X2jp = variable p under treatment 2

Page 11: Comparisons of Several Multivariate Means

Overview

Tests for Two Mean Vectors

Paired Comparisons

Multivariate Comparisons

Repeated Measures

Comparing Two Populations

Univariate t-tests

Multivariate t-tests

1-Way ANOVA

Wrapping Up

Lecture #6 - 9/28/2005 Slide 11 of 54

Multivariate Paired Comparisons

■ Similar to the univariate case, the method of testing fordifferences between the mean vectors of the two groupsutilizes difference scores.

■ Instead of a single difference score, p difference scores arecreated:

Dj1 = X1j1 − X2j1

Dj2 = X1j2 − X2j2

......

Djp = X1jp − X2jp

■ The mean vector of the set of difference scores can bedenoted by: E(D) = δ.

■ The covariance matrix of the set of difference scores canbe denoted by: Cov(D) = Σd.

Page 12: Comparisons of Several Multivariate Means

Overview

Tests for Two Mean Vectors

Paired Comparisons

Multivariate Comparisons

Repeated Measures

Comparing Two Populations

Univariate t-tests

Multivariate t-tests

1-Way ANOVA

Wrapping Up

Lecture #6 - 9/28/2005 Slide 12 of 54

Multivariate Paired Comparisons

■ If the matrix of differences, D, is from a MVN distributionwith mean vector δ and covariance matrix Σd, orNp(δ,Σd), then a hypothesis test can be formed based onδ:

H0 : δ = 0

H1 : δ 6= 0

■ This hypothesis test can be evaluated using the teststatistic:

T 2 = n(D − δ)′S−1d (D − δ)

■ Where:

D = 1n

∑n

j=1 Dj Sd = 1n−1

∑n

j=1(Dj − D)(Dj − D)′

Page 13: Comparisons of Several Multivariate Means

Overview

Tests for Two Mean Vectors

Paired Comparisons

Multivariate Comparisons

Repeated Measures

Comparing Two Populations

Univariate t-tests

Multivariate t-tests

1-Way ANOVA

Wrapping Up

Lecture #6 - 9/28/2005 Slide 13 of 54

Multivariate Paired Comparisons

■ SAS Example #1...The first of what will be a parade of baddata examples.

■ From page 275 in Johnson and Wichern:

Consider a municipal wastewater treatment plantdesiring to monitor water discharges into rivers andstreams. Concern about the reliability of data fromone of these self-monitoring programs led to a studyin which samples of effluent were divided and sent totwo laboratories for testing. One half of each samplewas sent to the Wisconsin State Laboratory ofHygiene, and one half was sent to a privatecommercial laboratory routinely used in themonitoring program. Measurements of biochemicaloxygen demand (BOD) and suspended solids (SS)were obtained for n = 11 sample splits from the twolabs.

Page 14: Comparisons of Several Multivariate Means

13-1

Page 15: Comparisons of Several Multivariate Means

Overview

Tests for Two Mean Vectors

Paired Comparisons

Multivariate Comparisons

Repeated Measures

Comparing Two Populations

Univariate t-tests

Multivariate t-tests

1-Way ANOVA

Wrapping Up

Lecture #6 - 9/28/2005 Slide 14 of 54

Multivariate Paired Comparisons

■ The T 2 statistic is what we have worked with in theprevious week, meaning the value of T 2 that would lead toa rejection of H0 (with Type I error rate α) is found by:

(n − 1)p

(n − p)Fp,n−p(α)

■ This also means that 100 × (1 − α)% simultaneousconfidence intervals are formed by:

di ±

(n − 1)p

(n − p)Fp,n−p(α)

s2di

n

Page 16: Comparisons of Several Multivariate Means

Overview

Tests for Two Mean Vectors

Paired Comparisons

Multivariate Comparisons

Repeated Measures

Comparing Two Populations

Univariate t-tests

Multivariate t-tests

1-Way ANOVA

Wrapping Up

Lecture #6 - 9/28/2005 Slide 15 of 54

Multivariate Paired Comparisons

■ One thing to note (because this will be a recurring method)is that both d and Sd (and T 2) can be formed by matrixequations.

■ First, consider the following matrices formed by the raw(unsubtracted) observations:

x(2p×1) =

x11

x12

...x1p

x21

x22

...x1p

; S(2p×2p) =

[

S11 S12

S21 S22

]

■ Saa′ is the covariance matrix between all variables forgroup a with all variables in group a′.

Page 17: Comparisons of Several Multivariate Means

Overview

Tests for Two Mean Vectors

Paired Comparisons

Multivariate Comparisons

Repeated Measures

Comparing Two Populations

Univariate t-tests

Multivariate t-tests

1-Way ANOVA

Wrapping Up

Lecture #6 - 9/28/2005 Slide 16 of 54

Multivariate Paired Comparisons

■ We can construct a new matrix of constants that will aid inour computation of T 2:

C(p×2p) =

1 0 . . . 0 −1 0 . . . 0

0 1 . . . 0 0 −1 . . . 0...

.... . .

......

.... . .

...0 0 . . . 1 0 0 . . . −1

■ The T 2 statistic testing for δ = 0 can then be formed by:

T 2 = nx′C′(CSC′)−1Cx

Page 18: Comparisons of Several Multivariate Means

Overview

Tests for Two Mean Vectors

Paired Comparisons

Multivariate Comparisons

Repeated Measures

Comparing Two Populations

Univariate t-tests

Multivariate t-tests

1-Way ANOVA

Wrapping Up

Lecture #6 - 9/28/2005 Slide 17 of 54

Multivariate Paired Comparisons

■ How is T 2 = nx′C′(CSC′)−1Cx equivalent to the T 2

phrased in terms of difference scores?

■ Recall the original formula shown a minute ago (but withδ = 0) :

T 2 = nD′S−1

d D

■ Also recall how linear combinations of variables producediffering mean vectors and covariance matrices:

◆ Imagine then that you combine all multivariate variables into a new compositevariable zj :

zj = a1xj1 + a2xj2 + . . . + apxjp

z = ax

s2

z = a′Sxa

■ What does this imply about D and Sd?

Page 19: Comparisons of Several Multivariate Means

Overview

Tests for Two Mean Vectors

Paired Comparisons

Multivariate Comparisons

Repeated Measures

Comparing Two Populations

Univariate t-tests

Multivariate t-tests

1-Way ANOVA

Wrapping Up

Lecture #6 - 9/28/2005 Slide 18 of 54

Repeated Measures Comparisons

■ Instead of testing for treatments across matched pairs,another method of research is to administer a set oftreatments to each subject.

■ For this design, we have j subjects, each receiving qtreatments.

■ For this case, we are only concerned with:

◆ A single response variable measured for each subject jat each time point q:

Xj =

Xj1

Xj2

...Xjq

◆ Detecting if all treatments have equivalent means.

Page 20: Comparisons of Several Multivariate Means

Overview

Tests for Two Mean Vectors

Paired Comparisons

Multivariate Comparisons

Repeated Measures

Comparing Two Populations

Univariate t-tests

Multivariate t-tests

1-Way ANOVA

Wrapping Up

Lecture #6 - 9/28/2005 Slide 19 of 54

Repeated Measures Comparisons

■ We can construct a mean vector for the mean at each timepoint q, of size (q × 1).

■ We then construct a Contrast Matrix (what we used for thefinal result for paired comparisons).

■ This contrast matrix specifies a linear combination that willperform the comparison test for equality of all means.

■ Note that equality of all means can be tested by severaldifferent linear combinations:

◆ Comparing all means with a given mean (such ascomparing the mean of time 1 with all other time points).

◆ Comparing successive means (mean of time 1 withmean of time 2; mean of time 2 with mean of time 3; andso on...).

Page 21: Comparisons of Several Multivariate Means

Lecture #6 - 9/28/2005 Slide 20 of 54

Contrast Matrix

■ Comparing all means with a given mean (such as comparing the mean oftime 1 with all other time points).

µ1 − µ2

µ1 − µ3

...µ1 − µq

=

1 −1 0 . . . 0

1 0 −1 . . . 0...

......

. . ....

1 0 0 . . . −1

µ1

µ2

...µq

= C1µ

■ Comparing successive means (mean of time 1 with mean of time 2;mean of time 2 with mean of time 3; and so on...).

µ2 − µ1

µ3 − µ2

...µq − µq−1

=

−1 1 0 . . . 0 0

0 −1 1 . . . 0 0...

......

. . ....

...0 0 0 . . . −1 1

µ1

µ2

...µq

= C2µ

Page 22: Comparisons of Several Multivariate Means

Overview

Tests for Two Mean Vectors

Paired Comparisons

Multivariate Comparisons

Repeated Measures

Comparing Two Populations

Univariate t-tests

Multivariate t-tests

1-Way ANOVA

Wrapping Up

Lecture #6 - 9/28/2005 Slide 21 of 54

Repeated Measures Comparisons

■ For either contrast matrix (C1 or C2), it can be shown thatthe resulting test statistic is equivalent.

■ The test statistic for the repeated measures hypothesis ofequal means at each time point is:

T 2 = nx′C′(CSC′)−1Cx

■ If this looks familiar, then the distribution of this statisticwill, too.

■ The critical value for rejecting H0 of equal means at eachtime point is found from:

(n − 1)p

(n − p)Fp,n−p(α)

Page 23: Comparisons of Several Multivariate Means

Overview

Tests for Two Mean Vectors

Paired Comparisons

Multivariate Comparisons

Repeated Measures

Comparing Two Populations

Univariate t-tests

Multivariate t-tests

1-Way ANOVA

Wrapping Up

Lecture #6 - 9/28/2005 Slide 22 of 54

SAS Example #2

■ From page 280 of Johnson and Wichern:Improved anesthetics are often developed by firststudying their effects on animals. In one study, 19dogs were initially given the drug pentobarbitol. Eachdog was then administered carbon dioxide at each oftwo pressure levels. Next, halothane was added andthe administration of CO2 was repeated. Theresponse, in milliseconds between heartbeats, wasmeasured for the four treatment conditions.

◆ Treatment 1 = high CO2 w/o H◆ Treatment 2 = low CO2 w/o H◆ Treatment 3 = high CO2 w/ H◆ Treatment 4 = low CO2 w H

Page 24: Comparisons of Several Multivariate Means

Overview

Tests for Two Mean Vectors

Paired Comparisons

Multivariate Comparisons

Repeated Measures

Comparing Two Populations

Univariate t-tests

Multivariate t-tests

1-Way ANOVA

Wrapping Up

Lecture #6 - 9/28/2005 Slide 23 of 54

SAS Example #2

■ There are three contrasts of interest:

1. (µ3 + µ4) − (µ1 + µ2) - H

2. (µ1 + µ3) − (µ2 + µ4) - CO2

3. (µ1 + µ4) − (µ2 + µ3) - Interaction

Page 25: Comparisons of Several Multivariate Means

Overview

Tests for Two Mean Vectors

Paired Comparisons

Multivariate Comparisons

Repeated Measures

Comparing Two Populations

Univariate t-tests

Multivariate t-tests

1-Way ANOVA

Wrapping Up

Lecture #6 - 9/28/2005 Slide 24 of 54

Comparing Two Populations

■ We have been over variations of the T 2 test for meanvectors for paired comparisons or for repeated measuresdesigns - both of which were cases where the sample wasfrom a single population.

■ To make the method of testing more general, consider thecomparison of mean vectors from two populations.

■ As with every multivariate test presented up to this point,we again turn our attention to similar tests for univariatecases.

Page 26: Comparisons of Several Multivariate Means

Overview

Tests for Two Mean Vectors

Paired Comparisons

Multivariate Comparisons

Repeated Measures

Comparing Two Populations

Univariate t-tests

Multivariate t-tests

1-Way ANOVA

Wrapping Up

Lecture #6 - 9/28/2005 Slide 25 of 54

Univariate Two-Population t-tests

■ Recall from univariate statistics situations where sampleswere taken from two different populations.

■ For instance, reliving our height example from the pastweek, imagine that I am interested in comparing theheights of KU grad students with the heights of gradstudents at Illinois (the real reason Bill Self left).

■ Let x1j represent the height of a grad student at Illinois.

■ Let x2j represent the height of a grad student at KU.

■ We can then construct the following summary statistics foreach sample:

◆ The mean heights: x1 and x2.

◆ The variances: s21 and s2

2

Page 27: Comparisons of Several Multivariate Means

Overview

Tests for Two Mean Vectors

Paired Comparisons

Multivariate Comparisons

Repeated Measures

Comparing Two Populations

Univariate t-tests

Multivariate t-tests

1-Way ANOVA

Wrapping Up

Lecture #6 - 9/28/2005 Slide 26 of 54

Univariate Two-Population t-tests

■ We seek to test the following hypothesis:

H0 : µ1 = µ2

H1 : µ1 6= µ2

■ One way of doing this is to compute a two-sample t-test.

■ For small samples, we must assume that s21 = s2

2, leadingto the following test:

t =x1 − x2

s2p

(

1n1

+ 1n2

)

■ Here, this test is compared against a t-value for Type I errorrate α with n1 + n2 − 2 degrees of freedom.

Page 28: Comparisons of Several Multivariate Means

Overview

Tests for Two Mean Vectors

Paired Comparisons

Multivariate Comparisons

Repeated Measures

Comparing Two Populations

Univariate t-tests

Multivariate t-tests

1-Way ANOVA

Wrapping Up

Lecture #6 - 9/28/2005 Slide 27 of 54

Univariate Two-Population t-tests

■ If the assumption of equivalent variances is used s21 = s2

2,the pooled estimate of variance is:

s2p =

(n1 − 1)s21 + (n2 − 1)s2

2

n1 + n2 − 2

■ If this assumption is violated (and sample sizes are large),the pooled variance estimate is sometimes computed by:

s2p =

1

n1s21 +

1

n2s22

■ Now we will see how these values generalize to theirmultivariate counterparts.

Page 29: Comparisons of Several Multivariate Means

Overview

Tests for Two Mean Vectors

Paired Comparisons

Multivariate Comparisons

Repeated Measures

Comparing Two Populations

Univariate t-tests

Multivariate t-tests

1-Way ANOVA

Wrapping Up

Lecture #6 - 9/28/2005 Slide 28 of 54

Multivariate Two-Population t-tests

■ For our multivariate test, consider taking a sample of morethan one variable from two populations.

■ This would be equivalent to our example of height andforearm length from last week.

■ Now we wish to test the hypothesis that the mean vectorsare equal against the alternative that the mean vectors arenot equal:

H0 : µ1 = µ2 ↔ µ1 − µ2 = δ0

H1 : µ1 6= µ2 ↔ µ1 − µ2 6= δ0

Page 30: Comparisons of Several Multivariate Means

Overview

Tests for Two Mean Vectors

Paired Comparisons

Multivariate Comparisons

Repeated Measures

Comparing Two Populations

Univariate t-tests

Multivariate t-tests

1-Way ANOVA

Wrapping Up

Lecture #6 - 9/28/2005 Slide 29 of 54

Assumptions

■ Prior to conducting our hypothesis test, we must first notethe assumptions of the test:

1. The sample X11, X12, . . . , X1n1is a random sample of

size n1 from a p-variate population with mean vector µ1

and covariance matrix Σ1

2. The sample X21, X22, . . . , X2n2is a random sample of

size n2 from a p-variate population with mean vector µ2

and covariance matrix Σ2

3. X11, X12, . . . , X1n1are independent of X21, X22, . . . , X2n2

.

4. If n1 and n2 are small: both populations are MVN.

5. If n1 and n2 are small: Σ1 = Σ2

Page 31: Comparisons of Several Multivariate Means

Overview

Tests for Two Mean Vectors

Paired Comparisons

Multivariate Comparisons

Repeated Measures

Comparing Two Populations

Univariate t-tests

Multivariate t-tests

1-Way ANOVA

Wrapping Up

Lecture #6 - 9/28/2005 Slide 30 of 54

Test Statistic

■ For small samples satisfying assumptions #1-#5, the teststatistic is:

T 2 = (x1 − x2 − δ0)′

[(

1

n1+

1

n2

)

Sp

]−1

(x1 − x2 − δ0)

where the pooled covariance matrix is formed by:

Sp =n1 − 1

n1 + n2 − 2S1 +

n2 − 1

n1 + n2 − 2S2

■ The critical value for rejecting H0 of equal population meanvectors is found from:

(n1 + n2 − 2)p

(n1 + n2 − 1 − p)Fp,n1+n2−1−p(α)

Page 32: Comparisons of Several Multivariate Means

Overview

Tests for Two Mean Vectors

Paired Comparisons

Multivariate Comparisons

Repeated Measures

Comparing Two Populations

Univariate t-tests

Multivariate t-tests

1-Way ANOVA

Wrapping Up

Lecture #6 - 9/28/2005 Slide 31 of 54

Confidence Regions

■ Similar to all tests up to this point, we can constructconfidence regions for the difference in means.

■ Here, the region is centered at:

x1 − x2

■ The region has axes that are formed by:

x1 − x2 ±√

λi

(n1 + n2 − 2)p

(n1 + n2 − 1 − p)Fp,n1+n2−1−p(α)ei

■ Where the eigenvalues and eigenvectors come from thepooled covariance matrix Sp.

Page 33: Comparisons of Several Multivariate Means

Lecture #6 - 9/28/2005 Slide 32 of 54

Simultaneous Confidence Intervals

■ Again, similar to all tests up to this point, we can construct a set ofsimultaneous univariate confidence intervals for the difference in means.

■ The 100 × (1 − α)% confidence interval for µz = aµx is given by:

a′(x1 − x2) ±

(n1 + n2 − 2)p

(n1 + n2 − 1 − p)Fp,n1+n2−1−p

(

1

n1+

1

n2

)

a′Spa

■ SAS Example #3...

Fifty bars of soap are manufactured in each of two ways. Twocharacteristics - X1 = lather and X2 = mildness, are measured.Compare the two methods to detect mean differences in either ofX1 or X2.

Page 34: Comparisons of Several Multivariate Means

Overview

Tests for Two Mean Vectors

Paired Comparisons

Multivariate Comparisons

Repeated Measures

Comparing Two Populations

Univariate t-tests

Multivariate t-tests

1-Way ANOVA

Wrapping Up

Lecture #6 - 9/28/2005 Slide 33 of 54

Unequal Covariance Matrices

■ Often times, covariance matrices are not equal for differingpopulations.

■ Although statistical tests have been created to detectdiffering covariance matrices, their results are suspectwhen the populations are not MVN.

■ The book suggests that any element of the covariancematrices for either sample that is greater than four timesthe corresponding element for the other sample shouldcause concern.

■ If n1 and n2 are large, the pooled covariance estimate canbe formed by:

Sp =1

n1S1 +

1

n2S2

■ If sample sizes are small, the book suggests tryingtransformations.

Page 35: Comparisons of Several Multivariate Means

Overview

Tests for Two Mean Vectors

1-Way ANOVA

MANOVA

Univariate ANOVA

Assumptions

Example

Example: Effect Coded

Example 1

Fixed Effects Linear Model

Hypothesis Test

Concept

Computation

Wrapping Up

Lecture #6 - 9/28/2005 Slide 34 of 54

MANOVA

■ All through this point we have been dealing with caseswhere we were comparing one mean to another.

■ Because of this, we were always able to simply compute astandardized distance between them and see if it was tooextreme.

■ Now we are going to discuss what we will do if we severalmeans (meaning two or more).

Page 36: Comparisons of Several Multivariate Means

Overview

Tests for Two Mean Vectors

1-Way ANOVA

MANOVA

Univariate ANOVA

Assumptions

Example

Example: Effect Coded

Example 1

Fixed Effects Linear Model

Hypothesis Test

Concept

Computation

Wrapping Up

Lecture #6 - 9/28/2005 Slide 35 of 54

The Rest of Chapter 6

■ Specifically, we will begin with a scenario where we have atotal of k groups and we want to test the hypothesis

◆ H0: µ1 = µ2 = . . . = µk

■ Now because we have several mean vectors it will notmake sense to simply compute a squared distance and sowe will look at something else.

■ Similar to the ANOVA (which looks at variances) we willalso focus on variances, but in a multivariate setting,making it Multivariate ANOVA, or MANOVA.

Page 37: Comparisons of Several Multivariate Means

Overview

Tests for Two Mean Vectors

1-Way ANOVA

MANOVA

Univariate ANOVA

Assumptions

Example

Example: Effect Coded

Example 1

Fixed Effects Linear Model

Hypothesis Test

Concept

Computation

Wrapping Up

Lecture #6 - 9/28/2005 Slide 36 of 54

Univariate ANOVA

■ We begin with a reminder of what we did when all we hadwas a single observation per unit and there were severalgroups.

■ So our data looked like

Group 1 Group 2 . . . Group k

x11 x21 . . . xk1

x12 x22 . . . xk2

......

...x1n x2n . . . xkn

■ We want to test the basic hypothesis .

◆ H0: µ1 = µ2 = . . . = µk

Page 38: Comparisons of Several Multivariate Means

Overview

Tests for Two Mean Vectors

1-Way ANOVA

MANOVA

Univariate ANOVA

Assumptions

Example

Example: Effect Coded

Example 1

Fixed Effects Linear Model

Hypothesis Test

Concept

Computation

Wrapping Up

Lecture #6 - 9/28/2005 Slide 37 of 54

Assumptions

■ Here we had to make the basic assumptions that:

◆ Each observation came from a normal distribution withmean µ and variance σ2 (we assume equal variancesacross groups)

◆ All observations are independent

■ The basic linear model is yij = µ + τi + ǫij = µi + ǫij

Page 39: Comparisons of Several Multivariate Means

Overview

Tests for Two Mean Vectors

1-Way ANOVA

MANOVA

Univariate ANOVA

Assumptions

Example

Example: Effect Coded

Example 1

Fixed Effects Linear Model

Hypothesis Test

Concept

Computation

Wrapping Up

Lecture #6 - 9/28/2005 Slide 38 of 54

Example Variable: Two Categories

■ From Pedhazur (1997; p. 343): “Assume that the datareported [below] were obtained in an experiment in whichE represents an experimental group and C represents acontrol group.

E C

20 1018 1217 1117 1513 17

Y 85 65Y 17 13∑

(Y − Y )2 =∑

y2 26 34

Page 40: Comparisons of Several Multivariate Means

Overview

Tests for Two Mean Vectors

1-Way ANOVA

MANOVA

Univariate ANOVA

Assumptions

Example

Example: Effect Coded

Example 1

Fixed Effects Linear Model

Hypothesis Test

Concept

Computation

Wrapping Up

Lecture #6 - 9/28/2005 Slide 39 of 54

Effect Coding

■ Effect coding is a method of coding categorical variablescompatible with the general linear model ANOVAparameterization.

■ In effect coding, one creates a set of column vectors thatrepresent the membership of an observation to a givencategory level or experimental condition.

■ The total number of column vectors for a categoricalvariable are equal to one less than the total number ofcategory levels.

Page 41: Comparisons of Several Multivariate Means

Overview

Tests for Two Mean Vectors

1-Way ANOVA

MANOVA

Univariate ANOVA

Assumptions

Example

Example: Effect Coded

Example 1

Fixed Effects Linear Model

Hypothesis Test

Concept

Computation

Wrapping Up

Lecture #6 - 9/28/2005 Slide 40 of 54

Effect Coding

■ If an observation is a member of a specific category level,they are given a value of 1 in that category level’s columnvector.

■ If an observation is not a member of a specific categoryand is not a member of the omitted category, they aregiven a value of 0 in that category level’s column vector.

■ If an observation is a member of the omitted category, theyare given a value of -1 in every category level’s columnvector.

Page 42: Comparisons of Several Multivariate Means

Overview

Tests for Two Mean Vectors

1-Way ANOVA

MANOVA

Univariate ANOVA

Assumptions

Example

Example: Effect Coded

Example 1

Fixed Effects Linear Model

Hypothesis Test

Concept

Computation

Wrapping Up

Lecture #6 - 9/28/2005 Slide 41 of 54

Effect Coding

■ For each observation, a no more that a single 1 will appearin the set of column vectors for that variable.

■ The column vectors represent the predictor variables in aregression analysis, where the dependent variable ismodeled as a function of these columns.

■ Because all observations at a given category level havethe same value across the set of predictors, the predictedvalue of the dependent variable, Y ′, will be identical for allobservations within a category.

■ The set of category vectors (and a vector for an intercept)are now used as input into a regression model.

Page 43: Comparisons of Several Multivariate Means

Overview

Tests for Two Mean Vectors

1-Way ANOVA

MANOVA

Univariate ANOVA

Assumptions

Example

Example: Effect Coded

Example 1

Fixed Effects Linear Model

Hypothesis Test

Concept

Computation

Wrapping Up

Lecture #6 - 9/28/2005 Slide 42 of 54

Effect Coded Regression Example

Y X1 X2 Group

20 1 1 E

18 1 1 E

17 1 1 E

17 1 1 E

13 1 1 E

10 1 -1 C

12 1 -1 C

11 1 -1 C

15 1 -1 C

17 1 -1 C

Mean 15 1 0

Page 44: Comparisons of Several Multivariate Means

Overview

Tests for Two Mean Vectors

1-Way ANOVA

MANOVA

Univariate ANOVA

Assumptions

Example

Example: Effect Coded

Example 1

Fixed Effects Linear Model

Hypothesis Test

Concept

Computation

Wrapping Up

Lecture #6 - 9/28/2005 Slide 43 of 54

Effect Coded Regression

■ The General Linear Model states that the estimatedregression parameters are given by:

b = (X′X)−1X′y

Page 45: Comparisons of Several Multivariate Means

Overview

Tests for Two Mean Vectors

1-Way ANOVA

MANOVA

Univariate ANOVA

Assumptions

Example

Example: Effect Coded

Example 1

Fixed Effects Linear Model

Hypothesis Test

Concept

Computation

Wrapping Up

Lecture #6 - 9/28/2005 Slide 44 of 54

Effect Coded Regression - X1 and X2

■ For our second example analysis, consider the regressionof Y on X1 and X2.

Y = a + b2X2 + e

■ a = 15

■ b2 = 2

■∑

y2 = 100

■ SSres = X′X = 60

■ SSreg = 100 − 60 = 40

■ R2 = 40100 = 0.4

Page 46: Comparisons of Several Multivariate Means

Overview

Tests for Two Mean Vectors

1-Way ANOVA

MANOVA

Univariate ANOVA

Assumptions

Example

Example: Effect Coded

Example 1

Fixed Effects Linear Model

Hypothesis Test

Concept

Computation

Wrapping Up

Lecture #6 - 9/28/2005 Slide 45 of 54

Effect Coded Regression - X1 and X2

■ a = 15 is the overall mean of the dependent variableacross all categories.

■ b2 = 2 is the called the effect of the experimental group.

■ This effect represents the difference between theexperimental group mean and the overall mean.

■ For members of the E category:

Y ′ = a + b2X2 = 15 + 2(1) = 17

■ For members of the C category:

Y ′ = a + b2X2 = 15 + 2(−1) = 13

■ The fit of the model is the same as was found in thedummy coding from the previous class.

Page 47: Comparisons of Several Multivariate Means

Overview

Tests for Two Mean Vectors

1-Way ANOVA

MANOVA

Univariate ANOVA

Assumptions

Example

Example: Effect Coded

Example 1

Fixed Effects Linear Model

Hypothesis Test

Concept

Computation

Wrapping Up

Lecture #6 - 9/28/2005 Slide 46 of 54

The Fixed Effects Linear Model

■ Effect coding is built to estimate the fixed linear effectsmodel.

Yij = µ + τj + ǫij

■ Yij is the value of the dependent variable of individual i ingroup/treatment/category j.

■ µ is the population (grand) mean.

■ τj is the effect of group/treatment/category j.

■ ǫij is the error associated with the score of individual i ingroup/treatment/category j.

Page 48: Comparisons of Several Multivariate Means

Overview

Tests for Two Mean Vectors

1-Way ANOVA

MANOVA

Univariate ANOVA

Assumptions

Example

Example: Effect Coded

Example 1

Fixed Effects Linear Model

Hypothesis Test

Concept

Computation

Wrapping Up

Lecture #6 - 9/28/2005 Slide 47 of 54

The Fixed Effects Linear Model

■ The fixed effects linear model states that a predicted scorefor an observation is a composite of the grand mean andthe treatment effect of the group to which the observationbelongs.

Yij = µ + τj + ǫij

■ For all category levels (total represented by G), the modelhas the following constraint:

G∑

g=1

τg = 0

Page 49: Comparisons of Several Multivariate Means

Overview

Tests for Two Mean Vectors

1-Way ANOVA

MANOVA

Univariate ANOVA

Assumptions

Example

Example: Effect Coded

Example 1

Fixed Effects Linear Model

Hypothesis Test

Concept

Computation

Wrapping Up

Lecture #6 - 9/28/2005 Slide 48 of 54

The Fixed Effects Linear Model

■ This constraint means that the effect for the omittedcategory level (o) is equal to:

τo =∑

g 6=o

−τg = −τ1 − τ2 . . .

■ From the example, the effect for the control group is equalto:

τC = −τE = −2

■ Just to verify:

τE + τC = 2 + (−2) = 0

Page 50: Comparisons of Several Multivariate Means

Overview

Tests for Two Mean Vectors

1-Way ANOVA

MANOVA

Univariate ANOVA

Assumptions

Example

Example: Effect Coded

Example 1

Fixed Effects Linear Model

Hypothesis Test

Concept

Computation

Wrapping Up

Lecture #6 - 9/28/2005 Slide 49 of 54

Hypothesis Test of the Regression Coefficient

■ Because each model had the same value for R2 and thesame number of degrees of freedom for the regression (1),all hypothesis tests of the model parameters will result inthe same value of the test statistic.

F =R2/k

(1 − R2)/(N − k − 1)=

0.4/1

(1 − 0.4)/(10 − 1 − 1)= 5.33

■ From Excel (“=fdist(5.33,1,8)”), p = 0.0496.

■ If we used a Type-I error rate of 0.05, we would reject thenull hypothesis, and conclude the regression coefficient foreach analysis would be significantly different from zero.

Page 51: Comparisons of Several Multivariate Means

Overview

Tests for Two Mean Vectors

1-Way ANOVA

MANOVA

Univariate ANOVA

Assumptions

Example

Example: Effect Coded

Example 1

Fixed Effects Linear Model

Hypothesis Test

Concept

Computation

Wrapping Up

Lecture #6 - 9/28/2005 Slide 50 of 54

Concept

■ The thing to notice is that if the Null hypothesis is correct(i.e., all equal variances) then we have two ways toestimate the population variance.

1. We have a total of k samples and we know that for anygiven ith sample, its variance s2

i is an estimate of thepopulation variance. So we could estimate thepopulation variance by simply pooling together thevariances WITHIN each group (i.e., compute andaverage variance)

2. On the other hand we have k groups and for each groupwe have a mean yi. This is a sampling distribution. Weknow that the variance of the means should be σ2/n. Sowe could estimate the variance BETWEEN each of thegroups and multiple it by n to get an estimate

Page 52: Comparisons of Several Multivariate Means

Overview

Tests for Two Mean Vectors

1-Way ANOVA

MANOVA

Univariate ANOVA

Assumptions

Example

Example: Effect Coded

Example 1

Fixed Effects Linear Model

Hypothesis Test

Concept

Computation

Wrapping Up

Lecture #6 - 9/28/2005 Slide 51 of 54

Computation

■ The important thing that we use for our hypothesis test isthat number (2) will increase if there are actual differencesin the group means.

■ How did we estimate these computationally?

1. WITHIN estimate of Variance (s2e):

s2e =

∑k

i=1

∑n

j=1(xij − xi.)2

k(n − 1)=

SSE

k(n − 1)= MSE

2. BETWEEN estimate of Variance (ns2y):

ns2x =

n∑k

i=1(xi. − x..)2

k − 1=

SST

k − 1= MST

Page 53: Comparisons of Several Multivariate Means

Overview

Tests for Two Mean Vectors

1-Way ANOVA

MANOVA

Univariate ANOVA

Assumptions

Example

Example: Effect Coded

Example 1

Fixed Effects Linear Model

Hypothesis Test

Concept

Computation

Wrapping Up

Lecture #6 - 9/28/2005 Slide 52 of 54

Computation

■ Now given our two estimates of variance we simply tookthe ratio

F =MST

MSE=

SSTk−1SST

k(n−1)

■ Where our F-statistic, assuming the Null hypothesis is true,follows an F-Distribution with degrees of freedom equal tok − 1 and k(n − 1).

■ If there are true differences BETWEEN our group meansthen MST will over-estimate the variance and therefore Fwill be larger than we expect (i.e., it would be too extreme)and we would reject)

Page 54: Comparisons of Several Multivariate Means

Overview

Tests for Two Mean Vectors

1-Way ANOVA

Wrapping Up

Final Thought

Next Class

Lecture #6 - 9/28/2005 Slide 53 of 54

Final Thought

■ Today we covered themultivariate analog oftwo-sample t-tests.

■ Next week we will discoverhow these test areincorporated into theMANOVA framework.

■ Please use the ANOVA section of this course as arefresher for topics we will cover next time.

Page 55: Comparisons of Several Multivariate Means

Overview

Tests for Two Mean Vectors

1-Way ANOVA

Wrapping Up

Final Thought

Next Class

Lecture #6 - 9/28/2005 Slide 54 of 54

Next Time

■ MANOVA.

■ Profile analysis.

■ Growth Curves.

■ Beginning of multivariate multiple regression.