Canonical Correlation/Regression

Upload: hector-greer

Post on 04-Jan-2016


Canonical Correlation/Regression

• AKA multiple, multiple regression

• AKA multivariate multiple regression

• Have two sets of variables (p Xs and m Ys)

• Create a pair of canonical variates

– a1X1 + a2X2 + ... + apXp, and

– b1Y1 + b2Y2 + ... + bmYm

• Such that the correlation between the canonical variates is as large as possible.
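The search for weights that maximize the correlation can be sketched numerically. The following is a minimal illustration, not the procedure used for the analyses in these slides: it whitens each set with a Cholesky factor and takes a singular value decomposition, which is one standard way to solve the problem. The function name `canonical_correlation` and the simulated data are assumptions made up for the example.

```python
import numpy as np

def canonical_correlation(X, Y):
    """Return (x_weights, y_weights, canonical_correlations).

    X is n-by-p and Y is n-by-m. Weights are stored as columns, one per
    pair, scaled so that every canonical variate has unit variance.
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Sxx = Xc.T @ Xc / (n - 1)
    Syy = Yc.T @ Yc / (n - 1)
    Sxy = Xc.T @ Yc / (n - 1)

    # Whiten each set: Sxx = Lx Lx^T and Syy = Ly Ly^T.
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    # M = Lx^{-1} Sxy Ly^{-T}; its singular values are the canonical correlations.
    M = np.linalg.solve(Lx, Sxy) @ np.linalg.inv(Ly).T
    U, s, Vt = np.linalg.svd(M, full_matrices=False)

    # Map the singular vectors back to weights on the original variables.
    A = np.linalg.solve(Lx.T, U)
    B = np.linalg.solve(Ly.T, Vt.T)
    return A, B, s

# Simulated data (an assumption for illustration): 4 Xs and 2 Ys that
# share one latent source of variance.
rng = np.random.default_rng(0)
n = 500
latent = rng.normal(size=(n, 1))
X = rng.normal(size=(n, 4)) + latent * np.array([1.0, 0.5, 0.0, -0.5])
Y = rng.normal(size=(n, 2)) + latent * np.array([1.0, 0.3])

A, B, r = canonical_correlation(X, Y)
u_scores = (X - X.mean(axis=0)) @ A   # canonical variates of the X set
v_scores = (Y - Y.mean(axis=0)) @ B   # canonical variates of the Y set
print("canonical correlations:", r)
```

With 4 Xs and 2 Ys the sketch returns two pairs, and correlating the first column of `u_scores` with the first column of `v_scores` reproduces the first canonical correlation.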

Extract a Second Pair

• From the (residual) variance not captured in the first pair, create a second pair of canonical variates.

• Continue like this, creating additional pairs of canonical variates, until

– You reach a cutoff threshold based on the significance of the correlation, or

– You have created the maximum number of pairs, which equals the smaller of m or p

What is a Canonical Variate?

• A weighted linear combination of variables

• You can think of it as

– Something (a superordinate variable) you have created from several variables, or

– An estimate of a construct, a latent variable, a dimension that causes variance in the observed variables.

What is a Canonical Correlation?

• It is the simple correlation between one canonical variate of the X variables and the corresponding canonical variate of the Y variables.

• We can test the significance of the first canonical correlation all by itself (Roy’s maximum root).

• We can also test the set of canonical correlations from the nth through the last.
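One common way to test a set of canonical correlations is Wilks' lambda with Bartlett's chi-square approximation. This is offered as a hedged sketch of that general technique, not as the test reported in the annotated output. The canonical correlations (.38 and .32) and the numbers of variables (4 Xs, 2 Ys) come from the example in these slides; the sample size of 100 is a made-up placeholder, because the slides do not report N.

```python
import math

def bartlett_statistic(canonical_corrs, n_cases, p, q, start=0):
    """Wilks' lambda, Bartlett's chi-square, and df for the hypothesis that
    the canonical correlations from index `start` (0-based) through the
    last are all zero."""
    wilks_lambda = 1.0
    for r in canonical_corrs[start:]:
        wilks_lambda *= 1.0 - r * r
    chi_square = -(n_cases - 1 - (p + q + 1) / 2.0) * math.log(wilks_lambda)
    df = (p - start) * (q - start)
    return wilks_lambda, chi_square, df

corrs = [0.38, 0.32]      # canonical correlations from the example
n_cases = 100             # hypothetical N, not taken from the study

all_pairs = bartlett_statistic(corrs, n_cases, p=4, q=2, start=0)
second_only = bartlett_statistic(corrs, n_cases, p=4, q=2, start=1)
print("first through last:", all_pairs)
print("second through last:", second_only)
```

Each chi-square would be referred to a chi-square distribution with the returned degrees of freedom; the resulting p values depend entirely on the placeholder N.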

Eigenvalues

• For each pair of canonical variates, the eigenvalue is equal to the ratio of the squared canonical correlation (explained variance in the canonical variate) to one minus the squared canonical correlation (unexplained variance in the canonical variate).

• An eigenvalue of 1 would be obtained if the squared canonical correlation was .5 – the proportion of variance explained would be equal to the proportion of variance not explained.

• An eigenvalue of 1/3 would be obtained if the squared canonical correlation was .25 – the proportion of unexplained variance would be three times the proportion of explained variance.

• An eigenvalue of 3 would be obtained if the squared canonical correlation was .75 – the proportion of explained variance would be three times the proportion of unexplained variance.
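The three cases above follow directly from the ratio definition. A small sketch, with function names chosen only for illustration:

```python
def eigenvalue_from_r2(r_squared):
    """Eigenvalue = explained / unexplained = r^2 / (1 - r^2)."""
    return r_squared / (1.0 - r_squared)

def r2_from_eigenvalue(eigenvalue):
    """Inverse relation: r^2 = eigenvalue / (1 + eigenvalue)."""
    return eigenvalue / (1.0 + eigenvalue)

for r2 in (0.25, 0.5, 0.75):
    print(r2, eigenvalue_from_r2(r2))
```

The inverse function is handy when the output lists eigenvalues and you want the squared canonical correlations back.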

Redundancy

• We measure how much variance each canonical variate extracts from its own set of variables

• And how much variance each canonical variate explains in the variables of the other set.
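As a rough sketch of how these two quantities are usually computed: the variance a variate extracts from its own set is the mean of its squared loadings, and the redundancy (variance in that set explained by the opposite variate) is that proportion times the squared canonical correlation. The loadings (.99 and .52) and the first canonical correlation (.38) are the rounded values reported later in these slides; the function names are illustrative.

```python
def variance_extracted(loadings):
    """Proportion of its own set's variance captured by a canonical variate."""
    return sum(l * l for l in loadings) / len(loadings)

def redundancy(loadings, canonical_corr):
    """Proportion of the set's variance explained by the opposite variate."""
    return variance_extracted(loadings) * canonical_corr ** 2

homoneg_1_loadings = [0.99, 0.52]   # SBS, IAH on the first Y variate
r1 = 0.38                           # first canonical correlation

own = variance_extracted(homoneg_1_loadings)
red = redundancy(homoneg_1_loadings, r1)
print("variance extracted:", own)
print("redundancy:", red)
```

Because the inputs are rounded to two decimals, the results will differ slightly from what the full-precision output reports.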

R-Squared

• We also obtain, for each variable, the r² for predicting that variable from the first canonical variate of the other set of variables

• And the R² for predicting that variable from the first and second canonical variates of the other set of variables

• And so on.
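These values can be approximated from the loadings and canonical correlations. The sketch below relies on two properties of canonical correlation analysis: a variable's correlation with the opposite set's kth variate equals its loading on its own kth variate times the kth canonical correlation, and the variates within a set are uncorrelated, so the squared correlations add. The inputs are the rounded SBS loadings (.99, .14) and canonical correlations (.38, .32) from the example in these slides.

```python
def cumulative_r2(loadings, canonical_corrs):
    """Running R^2 for predicting one variable from the first k canonical
    variates of the other set, for k = 1, 2, ..."""
    total = 0.0
    running = []
    for loading, r in zip(loadings, canonical_corrs):
        total += (loading * r) ** 2   # squared cross-loading for this pair
        running.append(total)
    return running

sbs_r2 = cumulative_r2([0.99, 0.14], [0.38, 0.32])
print("SBS from MMPI_1:", sbs_r2[0])
print("SBS from MMPI_1 and MMPI_2:", sbs_r2[1])
```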

Patel, Long, McCammon, & Wuensch (1995)

• Male college students

• Xs = Personality variables (MMPI)

– PD (psychopathic deviate, Scale 4) – social maladjustment and hostility

– MF (masculinity/femininity, Scale 5) – in men, low scores = stereotypical masculinity

– MA (hypomania, Scale 9) – overactivity, flight of ideas, low frustration tolerance, narcissism, irritability, restlessness, hostility, and difficulty with controlling impulses

– Scale K (clinical defensiveness) – low scores = unusually frank.

Ys: Homonegativity Variables

• IAH (Index of Attitudes Towards Homosexuals)

– Affective component of “homophobia,” disgust.

– High scores – discomfort around homosexuals

• SBS (self-report behavior scale)

– Past negative actions towards male homosexuals

– High score – high frequency of such actions.

What is This Thing I Have Created or Discovered?

• Look at the standardized weights used to construct the canonical variate.

• Even better, look at the loadings

– Compute, for each case, a score on the canonical variate.

– Correlate those scores with scores on the original variables in its set.
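The two steps for obtaining loadings can be sketched as follows. The data and the weights here are invented for illustration; in practice the weights would be the standardized canonical weights from the analysis.

```python
import numpy as np

def loadings(data, weights):
    """Correlate each standardized variable with the canonical variate
    formed by applying `weights` to the standardized variables."""
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    scores = z @ weights   # one canonical variate score per case
    return np.array([np.corrcoef(z[:, j], scores)[0, 1]
                     for j in range(z.shape[1])])

rng = np.random.default_rng(1)
data = rng.normal(size=(200, 3))
data[:, 1] += 0.8 * data[:, 0]          # make two variables correlated
weights = np.array([0.6, 0.1, -0.5])    # hypothetical standardized weights

result = loadings(data, weights)
print(result)
```

The same numbers can be obtained without computing scores, as the correlation matrix of the set times the weight vector, divided by the standard deviation of the variate.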

The Weights

MMPI_1

Femininity     -.61
Scale K        -.60
Psycho. Dev.    .43
Hypomania       .46

Homoneg_1

SBS    .93
IAH    .15

Being stereotypically masculine, unusually frank, psycho. deviant, and hypomanic is associated with acting negatively towards gays.

The Loadings

MMPI_1 = Johnny Pissoff

Scale K        -.53
Hypomania       .53
Femininity     -.49
Psycho. Dev.    .32

Homoneg_1 = Aggressive Homophobia

SBS    .99
IAH    .52

Being unusually frank, hypomanic, stereotypically masculine, and psycho. deviant is associated with being uncomfortable around and acting negatively towards gays.

Weights or Loadings?

• Like the Beta weights in a multiple regression, the weights for a canonical variate can be deceptive.

• If two variables within a set are well correlated with each other, one or both weights may be artificially low.

• I generally prefer to interpret loadings.
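A two-variable case shows how a weight can understate a variable's relation to the variate. For two standardized variables with correlation rho, the loadings follow directly from the weights. The values below (weights of 1 and 0, rho = .9) are invented to make the point, not taken from the study.

```python
import math

def loadings_two_vars(w1, w2, rho):
    """Loadings of two standardized variables on the variate w1*z1 + w2*z2,
    where rho is the correlation between z1 and z2."""
    sd = math.sqrt(w1 * w1 + w2 * w2 + 2.0 * w1 * w2 * rho)
    return (w1 + w2 * rho) / sd, (w2 + w1 * rho) / sd

first, second = loadings_two_vars(1.0, 0.0, 0.9)
print(first, second)
```

Here the second variable gets a weight of zero yet still correlates about .9 with the variate, which is why the weights alone can be deceptive when variables in a set are highly correlated.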

A Second Pair of Canonical Variates

• There likely is variance in the variables that was not “captured” by the first pair of canonical variates.

• We can create a second pair, orthogonal to the first, from that residual variance.

• The number of pairs of canonical variates we can create is equal to the number of variables in the smaller set.

The Second Pair of Weights

MMPI_2

Femininity      .70
Hypomania       .67
Psycho. Dev.   -.09
Scale K        -.04

Homoneg_2

IAH   -1.08
SBS     .57

Being unusually feminine and hypomanic is associated with not being uncomfortable around gays but acting negatively towards them anyhow.

The Equal Opportunity Bully

• What are we to make of “not being uncomfortable around gays but acting negatively towards them anyhow”?

• One student called this “the equal opportunity bully.”

• He acts negatively towards everybody, gay or straight.

The Second Pair of Loadings

MMPI_2 = Feminine Hypomania

Femininity      .76
Hypomania       .72
Psycho. Dev.    .21
Scale K        -.08

Homoneg_2 = Equal Opportunity Bully

IAH   -.85
SBS    .14

Being unusually feminine and hypomanic is associated with not being uncomfortable around gays.

The Canonical Correlations

• Compute canonical variate scores for each case.

• Correlate each with its pairmate.

• Will always be highest for first pair, lower for each subsequent pair.

• Here, the canonical correlations are .38 and .32.

• Both were statistically significant.

SAS

• It is now time to jump to the annotated output.