Correlation: Examining Relationships
Five Descriptive Questions
What is the middle of the set of scores?
How spread out are the scores?
Where do specific scores fall in the distribution of scores?
What is the shape of the distribution?
How do different variables relate to each other?
Correlation
Once you know:
– Middle
– Spread
– Shape
– Relative position of specific cases
it is now useful to know the relationships between variables.
Correlation
Direction of relationships: positive or negative
Magnitude of relationships: weak, moderate, strong
Scatterplots
Outliers
Correlation
Quantitative index of association
Scaling of Pearson r:
– -1 = perfect negative relationship
– 0 = no relationship
– +1 = perfect positive relationship
Most common measure of association for interval and ratio variables
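
The scaling above can be checked with a small calculation. The following is a minimal sketch, assuming the usual deviation-score definition of Pearson r (sum of cross-products divided by the square root of the product of the sums of squares). The function name `pearson_r` and the scores are made up for illustration and are not from the slides.

```python
import math

def pearson_r(x, y):
    """Pearson r from deviations of each score around its variable's mean."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of the deviations
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable has no variation")
    return sxy / math.sqrt(sxx * syy)

# Made-up scores, for illustration only
x = [1, 2, 3, 4, 5]
r_perfect_positive = pearson_r(x, [2, 4, 6, 8, 10])  # y rises with x on a straight line
r_perfect_negative = pearson_r(x, [10, 8, 6, 4, 2])  # y falls as x rises on a straight line
r_none = pearson_r(x, [2, 1, 4, 1, 2])               # no linear trend in these values

print(r_perfect_positive, r_perfect_negative, r_none)
```

The three made-up data sets land on the two ends and the midpoint of the scale; real data almost always fall somewhere in between.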
Examples
Parent educational level and student academic achievement
Parent income or SES and student academic achievement
Coping strategies and perceived stress
Correlation
For positive correlations between two variables:
High values on x tend to be associated with high values on y
Low values on x tend to be associated with low values on y
[Scatterplot: High Positive Correlation, r = .825. x-axis: Curriculum; y-axis: Total Score.]
[Scatterplot: x-axis: GABIRTH; y-axis: WEIGHT.]
[Scatterplot: x-axis: GAOBS; y-axis: WEIGHT.]
[Scatterplot: r = .337, 2001-2002 NC State System Level Data. x-axis: FRL; y-axis: TURNOVER.]
Correlation
For negative correlations between two variables:
Low values on x tend to be associated with high values on y
High values on x tend to be associated with low values on y
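
One way to see what "tend to be associated" means is to count how many cases sit on the same side of both means (high-high or low-low) versus opposite sides (high-low or low-high). This is only an illustrative sketch: the helper `quadrant_counts` and the scores are hypothetical, and counting quadrants is a teaching device, not how r itself is computed.

```python
def quadrant_counts(x, y):
    """Count cases on the same side vs. opposite sides of the two means."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    same = opposite = 0
    for a, b in zip(x, y):
        product = (a - mean_x) * (b - mean_y)
        if product > 0:
            same += 1       # high x with high y, or low x with low y
        elif product < 0:
            opposite += 1   # high x with low y, or low x with high y
    return same, opposite

# Made-up scores, for illustration only
x = [2, 4, 5, 7, 9, 11]
y_pos = [10, 18, 13, 14, 21, 24]   # mostly rises with x, with two exceptions
y_neg = [24, 21, 14, 13, 18, 10]   # mostly falls as x rises, with two exceptions

print(quadrant_counts(x, y_pos))   # more "same side" cases: positive direction
print(quadrant_counts(x, y_neg))   # more "opposite side" cases: negative direction
```

The exceptions in each data set are the point: a correlation describes a tendency across cases, not a rule that every case follows.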
[Scatterplot: r = -.613. x-axis: Perceived Control; y-axis: PSS total.]
[Scatterplot: r = -.716, 2001-2002 NC State System Level Data. x-axis: FRL; y-axis: EOG.]
[Scatterplot: r = -.560, 2001-2002 NC State System Level Data. x-axis: TURNOVER; y-axis: EOG.]
Interpretation Guidelines
Correlation is not causality.
Correlation is necessary for causal inference, but not sufficient.
Causal inference requires experimental designs.
Interpretation Guidelines
Rum use and number of people entering the priesthood.
Square footage of home and student academic achievement.
Percent of women in a state who earn high salaries and percent of public officials who are women.
Interpretation Guidelines
The third variable problem.
– SES and home size.
The risk factor vs. causal agent problem.
– Length of time smoking and life expectancy.
The direction of causality problem.
– Productivity and job satisfaction
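
The third variable problem can be sketched with simulated data. In the hypothetical model below, SES influences both home size and achievement, and home size has no direct effect on achievement at all. All values are simulated standardized scores, not real data, and the partial correlation is brought in here as one common way to hold a third variable constant; the slides do not name a specific technique.

```python
import math
import random

def pearson_r(x, y):
    """Pearson r computed from deviations around each mean."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, holding z constant."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

random.seed(0)  # fixed seed so the simulation is repeatable
n = 2000

# Assumed model: SES drives both variables; home size does NOT affect achievement.
ses = [random.gauss(0, 1) for _ in range(n)]
home_size = [s + random.gauss(0, 1) for s in ses]
achievement = [s + random.gauss(0, 1) for s in ses]

r_home_ach = pearson_r(home_size, achievement)
r_home_ses = pearson_r(home_size, ses)
r_ach_ses = pearson_r(achievement, ses)
r_home_ach_given_ses = partial_r(r_home_ach, r_home_ses, r_ach_ses)

print(f"home size vs. achievement:         r = {r_home_ach:.2f}")
print(f"same, holding SES constant: partial r = {r_home_ach_given_ses:.2f}")
```

Under these assumptions the raw correlation is clearly positive while the partial correlation sits near zero, which is the pattern a third variable would produce.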
Interpretation Guidelines
r assumes a linear relationship.
r will underestimate curvilinear relationships.
Restriction of range will lower the correlation.
Outliers, gaps in distributions, and non-normal distributions can all influence r.
Be aware of subgroups.
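
Three of these cautions can be illustrated with small made-up or simulated data sets. This is a sketch under stated assumptions, not an analysis of the data in the slides: the curve is a simple parabola, the restricted sample keeps only cases with x above 1.0 (an arbitrary cutoff), and the two subgroups are invented so that the pooled result differs from the within-group results.

```python
import math
import random

def pearson_r(x, y):
    """Pearson r computed from deviations around each mean."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# 1. Curvilinear relationship: y is fully determined by x, yet r is 0
x_curve = list(range(-5, 6))
y_curve = [v ** 2 for v in x_curve]
r_curve = pearson_r(x_curve, y_curve)

# 2. Restriction of range: same simulated data, but only cases with x > 1.0 kept
random.seed(1)  # fixed seed so the simulation is repeatable
x_full = [random.gauss(0, 1) for _ in range(5000)]
y_full = [v + random.gauss(0, 1) for v in x_full]
r_full = pearson_r(x_full, y_full)
kept = [(a, b) for a, b in zip(x_full, y_full) if a > 1.0]
r_restricted = pearson_r([a for a, _ in kept], [b for _, b in kept])

# 3. Subgroups: negative within each group, positive when the groups are pooled
group_a_x, group_a_y = [1, 2, 3], [3, 2, 1]
group_b_x, group_b_y = [6, 7, 8], [8, 7, 6]
r_group_a = pearson_r(group_a_x, group_a_y)
r_group_b = pearson_r(group_b_x, group_b_y)
r_pooled = pearson_r(group_a_x + group_b_x, group_a_y + group_b_y)

print(f"curvilinear:       r = {r_curve:.2f}")
print(f"full range:        r = {r_full:.2f}")
print(f"restricted range:  r = {r_restricted:.2f}")
print(f"within groups:     r = {r_group_a:.2f}, {r_group_b:.2f}")
print(f"groups pooled:     r = {r_pooled:.2f}")
```

In each case a single value of r hides something a scatterplot would show at a glance.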
Interpretation Guidelines
Examine the scatterplot.
Examine the distributions of both variables.
Be aware of the other descriptive statistics on both variables.
Interpreting Magnitude
  Strong    Moderate     Weak      Weak    Moderate    Strong
-1.0     -0.7      -0.3       0.0      0.3       0.7      1.0
Perfect negative        No relationship        Perfect positive
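
The scale above can be turned into a small lookup. This is a sketch only: the function name `describe_r` is hypothetical, and treating values of exactly .3 and .7 as belonging to the stronger category is an assumption, since the scale does not say which side the cut points fall on. The r values passed in are the ones reported on the scatterplot slides.

```python
def describe_r(r):
    """Label a correlation using the 0.3 and 0.7 cut points from the scale."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    if r == 0:
        return "no relationship"
    direction = "positive" if r > 0 else "negative"
    size = abs(r)
    if size == 1.0:
        strength = "perfect"
    elif size >= 0.7:    # assumption: exactly 0.7 counts as strong
        strength = "strong"
    elif size >= 0.3:    # assumption: exactly 0.3 counts as moderate
        strength = "moderate"
    else:
        strength = "weak"
    return f"{strength} {direction}"

# r values reported on the scatterplot slides
for r in (0.825, 0.337, -0.613, -0.716, -0.560, -0.279):
    print(f"r = {r:+.3f}: {describe_r(r)}")
```

Labels like these are conventions for describing size, and they do not replace looking at the scatterplot.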
Outliers
You can look at outliers in the univariate case (within the distribution of a single variable) and in the bivariate case (within the scatterplot of points representing values on two variables).
Examine the scatterplots for values out of the pattern.
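
A single point out of the pattern can change r dramatically. The sketch below uses five made-up cases with no linear trend and then adds one hypothetical extreme case at (15, 15); none of these values come from the data in the slides.

```python
import math

def pearson_r(x, y):
    """Pearson r computed from deviations around each mean."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Five made-up cases with no linear trend
x = [1, 2, 3, 4, 5]
y = [3, 1, 2, 1, 3]
r_without = pearson_r(x, y)

# The same cases plus one case far outside the pattern on both variables
r_with = pearson_r(x + [15], y + [15])

print(f"without the outlier: r = {r_without:.2f}")
print(f"with the outlier:    r = {r_with:.2f}")
```

Here one case moves r from zero to about .95, which is why the scatterplot should be checked before the coefficient is interpreted.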
[Scatterplot: x-axis: GAOBS; y-axis: GABIRTH.]
[Scatterplot: x-axis: AGEDAYS; y-axis: WEIGHT.]
[Scatterplot: x-axis: GABIRTH; y-axis: AGEDAYS.]
[Scatterplot: x-axis: WEIGHT; y-axis: NPBASEHR.]
What would you expect?
Teacher age
Classroom quality
[Scatterplot: r = -.279. x-axis: Total Score; y-axis: Age of Teacher.]
What would you expect?
Perceived stress
Depression
[Scatterplot: r = .582. x-axis: BDI Total; y-axis: PSS total.]
What would you expect?
Depression
Self-acceptance
[Scatterplot: r = -.596. x-axis: Self-Acceptance; y-axis: BDI Total.]
What would you expect?
Emotional Exhaustion
Depersonalization
r = .574