psychology 340 spring 2010

Prediction with multiple variables Statistics for the Social Sciences Psychology 340 Spring 2010

Upload: xiu

Post on 06-Jan-2016



TRANSCRIPT

Page 1: Psychology 340 Spring 2010

Prediction with multiple variables

Statistics for the Social Sciences

Psychology 340

Spring 2010

Page 2: Psychology 340 Spring 2010

PSY 340: Statistics for the Social Sciences

Outline

• Multiple regression

– Comparing models, Delta r²

– Using SPSS

Page 3: Psychology 340 Spring 2010

Multiple Regression

• Typically researchers are interested in predicting with more than one explanatory variable

• In multiple regression, an additional predictor variable (or set of variables) is used to predict the residuals left over from the first predictor.

Page 4: Psychology 340 Spring 2010

Multiple Regression

• Bivariate regression prediction models

Y = intercept + slope (X) + error

Page 5: Psychology 340 Spring 2010

Multiple Regression

• Bivariate regression prediction models

Y = intercept + slope (X) + error

μY = β0 + β1X + ε

• Multiple regression prediction models

μY = β0 + β1X1 + β2X2 + β3X3 + β4X4 + ε

“fit”: β0 + β1X1 + β2X2 + β3X3 + β4X4

“residual”: ε
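The split of a multiple regression model into “fit” and “residual” can be sketched in a few lines. This is a minimal illustration using ordinary least squares in NumPy, not the course's SPSS procedure; the data are simulated and the coefficient values are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50                                   # hypothetical number of participants
X = rng.normal(size=(n, 4))              # four simulated predictors X1..X4
true_beta = np.array([2.0, 0.5, -1.0, 0.0, 0.8])  # made-up b0..b4, illustration only
y = true_beta[0] + X @ true_beta[1:] + rng.normal(scale=0.5, size=n)

# Design matrix: a column of ones for the intercept, then the predictors.
design = np.column_stack([np.ones(n), X])

# Ordinary least-squares estimates of b0..b4.
beta_hat, *_ = np.linalg.lstsq(design, y, rcond=None)

fit = design @ beta_hat                  # the "fit" part of the model
residual = y - fit                       # the "residual": whatever is left over

print("estimated coefficients:", np.round(beta_hat, 2))
```

By construction, every observed score equals its fitted value plus its residual, and the residuals are uncorrelated with each predictor in the model.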

Page 6: Psychology 340 Spring 2010

Multiple Regression

• Multiple regression prediction models

μY = β0 + β1X1 + β2X2 + β3X3 + β4X4 + ε

[Diagram labels: β1X1 = first explanatory variable; β2X2 = second explanatory variable; β3X3 = third explanatory variable; β4X4 = fourth explanatory variable; ε = whatever variability is left over]

Page 7: Psychology 340 Spring 2010

Multiple Regression

μY = β0 + β1X1 + β2X2 + β3X3 + β4X4 + ε

[Diagram labels: β1X1 = first explanatory variable; β2X2 = second explanatory variable; β3X3 = third explanatory variable; β4X4 = fourth explanatory variable; ε = whatever variability is left over]

• Predict test performance based on:

– Study time

– Test time

– What you eat for breakfast

– Hours of sleep

Page 8: Psychology 340 Spring 2010

Multiple Regression

• Predict test performance based on:

– Study time

– Test time

– What you eat for breakfast

– Hours of sleep

• Typically your analysis consists of testing multiple regression models to see which “fits” best (comparing the r² values of the models)

• For example:

μY = β0 + β1X1 + β2X2 + ε

versus

μY = β0 + β1X1 + β2X2 + β4X4 + ε

versus

μY = β0 + β1X1 + β2X2 + β3X3 + β4X4 + ε
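Comparing the three candidate models by their r² values might look like the sketch below. The data are simulated, the variable names (`study`, `test_time`, `breakfast`, `sleep`, `score`) and effect sizes are assumptions chosen only to echo the test-performance example, and the helper `r_squared` is written for this illustration.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of a least-squares fit of y on the columns of X plus an intercept."""
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

# Simulated data; all effect sizes here are invented for the example.
rng = np.random.default_rng(340)
n = 200
study = rng.normal(size=n)                  # X1: study time
test_time = rng.normal(size=n)              # X2: test time
breakfast = rng.normal(size=n)              # X3: breakfast (unrelated to score)
sleep = 0.5 * study + rng.normal(size=n)    # X4: sleep, overlapping with study time
score = 0.6 * study + 0.1 * test_time + 0.3 * sleep + rng.normal(size=n)

models = {
    "X1 + X2": np.column_stack([study, test_time]),
    "X1 + X2 + X4": np.column_stack([study, test_time, sleep]),
    "X1 + X2 + X3 + X4": np.column_stack([study, test_time, breakfast, sleep]),
}
r2 = {name: r_squared(X, score) for name, X in models.items()}
for name, value in r2.items():
    print(f"{name}: R^2 = {value:.3f}")
```

Because each model here contains all the predictors of the one before it, r² can only stay the same or increase as predictors are added; the question of interest is whether the increase is large enough to matter.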

Page 9: Psychology 340 Spring 2010

Multiple Regression

Model #1: Some covariance between the two variables

μY = β0 + β1X1 + ε

[Diagram: total variability in test performance (response variable), overlapping with total study time, r = .6]

• R² for model = .36; 64% of variance unexplained

• If we know the total study time, we can predict 36% of the variance in test performance

Page 10: Psychology 340 Spring 2010

Multiple Regression

Model #2: Add test time to the model

μY = β0 + β1X1 + β2X2 + ε

[Diagram: total variability in test performance (response variable), overlapping with total study time (r = .6) and test time (r = .1)]

• R² for model = .49; 51% of variance unexplained

• Little covariance between test performance and test time

• We can explain more of the variance in test performance

Page 11: Psychology 340 Spring 2010

Multiple Regression

Model #3: No covariance between test performance and breakfast food

μY = β0 + β1X1 + β2X2 + β3X3 + ε

[Diagram: total variability in test performance (response variable), overlapping with total study time (r = .6) and test time (r = .1); breakfast (r = .0) does not overlap]

• R² for model = .49; 51% of variance unexplained

• Not related, so we can NOT explain more of the variance in test performance

Page 12: Psychology 340 Spring 2010

Multiple Regression

Model #4: Some covariance between test performance and hours of sleep

μY = β0 + β1X1 + β2X2 + β3X3 + β4X4 + ε

[Diagram: total variability in test performance (response variable), overlapping with total study time (r = .6), test time (r = .1), and hours of sleep (r = .45); breakfast (r = .0) does not overlap]

• R² for model = .60; 40% of variance unexplained

• We can explain more of the variance

• But notice what happens with the overlap (covariation between explanatory variables): we can’t just add r’s or r²’s
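Why the individual r² values cannot simply be added can be seen from the standard formula for R² in terms of correlations, R² = r_xy' · R_xx⁻¹ · r_xy. The sketch below uses the study-time (r = .6) and sleep (r = .45) correlations from the example; the correlation of .5 between the two predictors is an assumption made up for the illustration, since the slides do not report it.

```python
import numpy as np

def r_squared_from_correlations(r_xy, R_xx):
    """Model R^2 from predictor-outcome correlations and the predictor correlation matrix."""
    r_xy = np.asarray(r_xy, dtype=float)
    R_xx = np.asarray(R_xx, dtype=float)
    return float(r_xy @ np.linalg.solve(R_xx, r_xy))

# Correlations with test performance: study time (.6) and hours of sleep (.45).
r_xy = [0.6, 0.45]

# Case 1: predictors uncorrelated with each other, so the r^2 values do add up.
uncorrelated = [[1.0, 0.0],
                [0.0, 1.0]]

# Case 2: predictors overlap (assumed correlation of .5 between them).
overlapping = [[1.0, 0.5],
               [0.5, 1.0]]

r2_sum = sum(r ** 2 for r in r_xy)
r2_uncorrelated = r_squared_from_correlations(r_xy, uncorrelated)
r2_overlapping = r_squared_from_correlations(r_xy, overlapping)

print(f"sum of r^2 values:        {r2_sum:.4f}")
print(f"R^2, uncorrelated case:   {r2_uncorrelated:.4f}")
print(f"R^2, overlapping case:    {r2_overlapping:.4f}")
```

With overlapping predictors, part of the variance each one explains is shared, so the model's R² falls below the simple sum.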

Page 13: Psychology 340 Spring 2010

Multiple Regression in SPSS

• Set up as before: variables (explanatory and response) are entered into columns

• There are a couple of different ways to use SPSS to compare different models

Page 14: Psychology 340 Spring 2010

Regression in SPSS

• Menu: Analyze > Regression > Linear

Page 15: Psychology 340 Spring 2010

Multiple Regression in SPSS

• Method 1: enter all the explanatory variables together

– Enter:

• All of the predictor variables into the Independent Variable field

• Predicted (criterion) variable into Dependent Variable field

Page 16: Psychology 340 Spring 2010

Multiple Regression in SPSS

[SPSS output, annotated:]

• The variables in the model

• r for the entire model

• r² for the entire model

• Unstandardized coefficients

• Coefficient for var1 (var name)

• Coefficient for var2 (var name)

Page 17: Psychology 340 Spring 2010

Multiple Regression in SPSS

[SPSS output, annotated:]

• The variables in the model

• r for the entire model

• r² for the entire model

• Standardized coefficients

• Coefficient for var1 (var name)

• Coefficient for var2 (var name)

Page 18: Psychology 340 Spring 2010

Multiple Regression

– Which β to use, standardized or unstandardized?

– Unstandardized β’s are easier to use if you want to predict a raw score based on raw scores (no z-scores needed).

– Standardized β’s are useful for directly comparing which variable is most “important” in the equation, because all variables are on the same (z-score) scale.
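The relation between the two kinds of coefficient can be checked numerically: a standardized coefficient equals the unstandardized one multiplied by the predictor's standard deviation and divided by the outcome's standard deviation. The sketch below uses simulated SAT and GPA scores; the variable names, sample size, and effect sizes are assumptions for illustration only.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients [intercept, b1, b2, ...] for y on X."""
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta

def zscore(a):
    """Convert each column (or a single vector) to z-scores."""
    return (a - a.mean(axis=0)) / a.std(axis=0, ddof=1)

# Simulated raw scores; all numbers are invented for the example.
rng = np.random.default_rng(7)
n = 120
math_sat = rng.normal(600, 80, size=n)
verbal_sat = rng.normal(580, 90, size=n)
gpa = 0.5 + 0.003 * math_sat + 0.001 * verbal_sat + rng.normal(0, 0.3, size=n)

X = np.column_stack([math_sat, verbal_sat])

# Unstandardized coefficients: predict raw GPA from raw SAT scores.
b = ols(X, gpa)

# Standardized coefficients: the same regression run on z-scores.
z_fit = ols(zscore(X), zscore(gpa))
beta_std = z_fit[1:]

# The same values obtained by rescaling the unstandardized slopes.
converted = b[1:] * X.std(axis=0, ddof=1) / gpa.std(ddof=1)

print("unstandardized slopes:", b[1:])
print("standardized slopes:  ", np.round(beta_std, 3))
```

The unstandardized slopes are tiny because one SAT point is a small step, while the standardized slopes express change in standard-deviation units and so can be compared across predictors.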

Page 19: Psychology 340 Spring 2010

Multiple Regression in SPSS

• Method 2: enter first model, then add another variable for second model, etc.

– Enter:

• Predicted (criterion) variable into Dependent Variable field

• First predictor variable into the Independent Variable field

• Click the Next button

Page 20: Psychology 340 Spring 2010

Multiple Regression in SPSS

• Method 2 cont.:

– Enter:

• Second predictor variable into the Independent Variable field

• Click Statistics

Page 21: Psychology 340 Spring 2010

Multiple Regression in SPSS

– Click the ‘R squared change’ box

Page 22: Psychology 340 Spring 2010

Multiple Regression in SPSS

• Shows the results of two models

• The variables in the first model (math SAT)

• The variables in the second model (math and verbal SAT)

Page 23: Psychology 340 Spring 2010

Multiple Regression in SPSS

• Shows the results of two models

• The variables in the first model (math SAT)

• The variables in the second model (math and verbal SAT)

• Model 1

• r² for the first model

• Coefficients for var1 (var name)

Page 24: Psychology 340 Spring 2010

Multiple Regression in SPSS

• Shows the results of two models

• The variables in the first model (math SAT)

• The variables in the second model (math and verbal SAT)

• Model 2

• r² for the second model

• Coefficients for var1 (var name)

• Coefficients for var2 (var name)

Page 25: Psychology 340 Spring 2010

Multiple Regression in SPSS

• Shows the results of two models

• The variables in the first model (math SAT)

• The variables in the second model (math and verbal SAT)

• Change statistics: is the change in r² from Model 1 to Model 2 statistically significant?

Page 26: Psychology 340 Spring 2010

Hypothesis testing with Regression

• Multiple Regression

μY = β0 + β1X1 + β2X2 + β3X3 + β4X4 + ε

“fit”: β0 + β1X1 + β2X2 + β3X3 + β4X4

“residual”: ε

– We can test hypotheses about the overall model

Page 27: Psychology 340 Spring 2010

Multiple Regression in SPSS

• Null hypothesis

– H0: University GPA is not predicted by SAT verbal or SAT math scores

• p < 0.05, so reject H0: SAT math and verbal scores predict University GPA
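The overall-model test reported by regression software is conventionally an F test. As a reference sketch, the standard textbook form is given below; the symbols n (number of participants) and k (number of predictors) are notation introduced here and do not appear in the slides.

```latex
F = \frac{R^2 / k}{(1 - R^2) / (n - k - 1)},
\qquad df_1 = k, \quad df_2 = n - k - 1
```

A large F (small p) means the predictors together explain more variance than would be expected by chance if H0 were true.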

Page 28: Psychology 340 Spring 2010

Hypothesis testing with Regression

• Multiple Regression

μY = β0 + β1X1 + β2X2 + β3X3 + β4X4 + ε

[Diagram labels: β1X1 = first explanatory variable; β2X2 = second explanatory variable; β3X3 = third explanatory variable; β4X4 = fourth explanatory variable]

– We can test hypotheses about the overall model

– We can test hypotheses about each of the explanatory variables within a regression model

• So it’ll tell us whether that variable is explaining a “significant” amount of the variance in the response variable

Page 29: Psychology 340 Spring 2010

Multiple Regression in SPSS

• Null hypotheses

– H0: Coefficient for var1 = 0

• p < 0.05, so reject H0: var1 is a significant predictor

– H0: Coefficient for var2 = 0

• p > 0.05, so fail to reject H0: var2 is not a significant predictor
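The per-coefficient tests are t tests of each coefficient against zero, with t = coefficient / standard error. The sketch below computes these by hand with NumPy on simulated data; `var1` is built to be related to the outcome and `var2` is not, and all names and values are assumptions for illustration rather than the SPSS data behind the slides.

```python
import numpy as np

def coefficient_t_stats(X, y):
    """Return coefficients, standard errors, t statistics, and residual df."""
    n, k = X.shape
    design = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    df = n - k - 1                                   # residual degrees of freedom
    sigma2 = (resid @ resid) / df                    # estimated error variance
    cov = sigma2 * np.linalg.inv(design.T @ design)  # covariance of the estimates
    se = np.sqrt(np.diag(cov))
    return beta, se, beta / se, df

# Simulated data: var1 truly predicts y, var2 does not (made-up effect sizes).
rng = np.random.default_rng(29)
n = 100
var1 = rng.normal(size=n)
var2 = rng.normal(size=n)
y = 1.0 + 1.0 * var1 + 0.0 * var2 + rng.normal(size=n)

beta, se, t_stats, df = coefficient_t_stats(np.column_stack([var1, var2]), y)
for name, b, s, t in zip(["intercept", "var1", "var2"], beta, se, t_stats):
    print(f"{name:9s} b = {b:6.3f}  SE = {s:5.3f}  t = {t:6.2f}  (df = {df})")
```

Each t statistic is compared with a t distribution on the residual degrees of freedom to obtain the p value that SPSS reports in its coefficients table.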

Page 30: Psychology 340 Spring 2010

Hypothesis testing with Regression

• Multiple Regression

– We can test hypotheses about the overall model

– We can test hypotheses about each of the explanatory variables within a regression model

• So it’ll tell us whether that variable is explaining a “significant” amount of the variance in the response variable

– We can also use hypothesis testing to examine if the change in r² is statistically significant

Page 31: Psychology 340 Spring 2010

Hypothesis testing with Regression

• Shows the results of two models

• The variables in the first model (math SAT)

• The variables in the second model (math and verbal SAT)

• Model 1

• r² for the first model

• Coefficients for var1 (var name)

Page 32: Psychology 340 Spring 2010

Hypothesis testing with Regression

• Shows the results of two models

• The variables in the first model (math SAT)

• The variables in the second model (math and verbal SAT)

• Model 2

• r² for the second model

• Coefficients for var1 (var name)

• Coefficients for var2 (var name)

Page 33: Psychology 340 Spring 2010

Hypothesis testing with Regression

• Shows the results of two models

• The variables in the first model (math SAT)

• The variables in the second model (math and verbal SAT)

• Change statistics: is the change in r² from Model 1 to Model 2 statistically significant?

The 0.002 change in r² is not statistically significant (p = 0.46)
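The test behind the change statistics is conventionally an F test on the increase in r² between two nested models. The sketch below implements the standard formula in plain Python; the function name `f_change` and the example inputs (sample size and number of predictors) are hypothetical, since the slides do not report the values behind the SPSS output.

```python
def f_change(r2_reduced, r2_full, n, k_reduced, k_full):
    """F statistic for the change in R^2 between two nested regression models.

    r2_reduced, r2_full : R^2 of the smaller and the larger model
    n                   : number of participants
    k_reduced, k_full   : number of predictors in each model
    """
    df1 = k_full - k_reduced          # predictors added
    df2 = n - k_full - 1              # residual df of the larger model
    f = ((r2_full - r2_reduced) / df1) / ((1.0 - r2_full) / df2)
    return f, df1, df2

# Hypothetical example: going from one predictor (R^2 = .36) to two (R^2 = .49)
# with an assumed sample of 103 participants.
f, df1, df2 = f_change(0.36, 0.49, n=103, k_reduced=1, k_full=2)
print(f"F({df1}, {df2}) = {f:.2f}")
```

The p value comes from comparing F with an F distribution on (df1, df2) degrees of freedom, for example with `scipy.stats.f.sf(f, df1, df2)` if SciPy is available. Setting `r2_reduced=0` and `k_reduced=0` gives the overall-model F test as a special case.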

Page 34: Psychology 340 Spring 2010

Regression in Research Articles

• Bivariate prediction models rarely reported

• Multiple regression results commonly reported

Page 35: Psychology 340 Spring 2010

Cautions in Multiple Regression

• We can use as many predictors as we wish, but we should be careful not to use more predictors than is warranted.

– Simpler models are more likely to generalize to other samples.

– If you use as many predictors as you have participants in your study, you can predict 100% of the variance. Although this may seem like a good thing, it is unlikely that your results would generalize to any other sample, and thus they are not valid.

– You probably should have at least 10 participants per predictor variable (and probably should aim for about 30).
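The warning about using too many predictors can be demonstrated with pure noise. In the sketch below, 10 simulated "participants" have an outcome and predictors that are all random numbers, so nothing truly predicts anything; the sample size and seed are arbitrary choices for illustration. Once the number of estimated coefficients (predictors plus intercept) reaches the number of participants, r² hits 1 anyway.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of a least-squares fit of y on the columns of X plus an intercept."""
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

rng = np.random.default_rng(1)
n = 10                                  # hypothetical number of participants
y = rng.normal(size=n)                  # outcome: pure noise
X_all = rng.normal(size=(n, n - 1))     # nine predictors: also pure noise

# Fit nested models using the first k noise predictors.
results = {}
for k in (1, 3, 6, 9):
    results[k] = r_squared(X_all[:, :k], y)
    print(f"{k} predictors: R^2 = {results[k]:.3f}")
```

The perfect fit with nine predictors reflects only the arithmetic of fitting ten coefficients to ten data points, which is why such a model would not be expected to hold up in a new sample.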