what is a pearson product moment correlation (independence)?
Pearson Product Moment Correlation for Independence Questions
The Pearson Product Moment Correlation is the most widely used statistic when determining the relationship between or independence of two variables that are continuous.
By continuous we mean a variable that can take any value between two points. Here is an example:
Suppose the fire department mandates that all fire fighters must weigh between 150 and 250 pounds. The weight of a fire fighter would be an example of a continuous variable, since a fire fighter's weight could take on any value between 150 and 250 pounds.
The Pearson Product Moment Correlation will indicate either a strong relationship between Variable A and Variable B
Or a weak, even nonexistent, relationship between Variable A and Variable B
A nonexistent relationship is what is meant by independence!
The values of the Pearson Product Moment Correlation, or simply the Pearson Correlation, range from -1.0 through 0 to +1.0.
A Pearson Correlation of +1.0 indicates a perfect positive relationship. Note two qualities here:
(1) direction
(2) strength
A +1.0 Pearson Correlation’s direction is positive and its strength is perfect.
A -1.0 Pearson Correlation’s direction is negative and its strength is perfect.
A 0.0 Pearson Correlation has no direction and no strength. This would be evidence of independence between two variables.
A +0.3 Pearson Correlation’s direction is positive and its strength is moderately weak.
A -0.1 Pearson Correlation’s direction is negative and its strength is very weak.
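The two qualities above (direction and strength) can be read off a correlation value mechanically. The sketch below is only an illustration: the function name `describe_correlation` and the numeric cutoffs are assumptions chosen to reproduce the labels used in these slides, not a standard scale.

```python
def describe_correlation(r: float) -> str:
    """Return the direction and strength of a Pearson correlation value."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a Pearson correlation must lie between -1.0 and +1.0")
    if r == 0.0:
        return "no direction, no strength"

    # Direction comes from the sign of r.
    direction = "positive" if r > 0 else "negative"

    # Strength comes from the size of r, ignoring the sign.
    # These cutoffs are hypothetical; they only mirror the slide labels.
    size = abs(r)
    if size == 1.0:
        strength = "perfect"
    elif size >= 0.7:
        strength = "strong"
    elif size >= 0.4:
        strength = "moderate"
    elif size >= 0.2:
        strength = "moderately weak"
    else:
        strength = "very weak"
    return f"{direction}, {strength}"


for value in (1.0, -1.0, 0.0, 0.3, -0.1):
    print(value, "->", describe_correlation(value))
```

Different textbooks draw the lines between weak, moderate, and strong in different places, so the cutoffs should be treated as adjustable.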
There is another quality as well. With a Pearson correlation you are considering the relationship between or independence of only two variables.
Three’s a crowd:
Bottom line: The Pearson Correlation is used only when exploring the relationship between or independence of two variables.
Let’s look at a fictitious problem to illustrate how the Pearson Correlation is calculated.
Imagine you are working for a company that is trying to convince patrons that ice cream is a dessert for all seasons. They ask you to conduct a study to determine if the average daily temperature and the average daily ice cream sales are independent of one another.
Imagine the data set looks like this:
Ave Daily Temp    Ave Daily Ice Cream Sales
90°               560
80°               480
70°               350
60°               320
50°               230
Notice how as one variable (temperature) goes up, the other variable (ice cream sales) goes up as well.
One way to look at this relationship is to rank order both variable values like so:
Ave Daily Temp    Rank    Ave Daily Ice Cream Sales    Rank
90°               1st     560                          1st
80°               2nd     480                          2nd
70°               3rd     350                          3rd
60°               4th     320                          4th
50°               5th     230                          5th
Notice how their rank orders are identical. Because the paired values also fall very close to a straight line, these variables have a Pearson Correlation of approximately +1.0.
Meaning that higher values for one variable are associated with higher values for another variable
Or
Meaning that lower values for one variable are associated with lower values for another variable
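The rank ordering described above can be checked with a few lines of Python. This is a minimal sketch: the helper name `ranks_descending` is made up for illustration, the temperatures are read as 90, 80, 70, 60, and 50 degrees, and the helper assumes there are no tied values.

```python
def ranks_descending(values):
    """Rank values so that the largest gets rank 1 (assumes no ties)."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]


temps = [90, 80, 70, 60, 50]        # average daily temperature, in degrees
sales = [560, 480, 350, 320, 230]   # average daily ice cream sales

temp_ranks = ranks_descending(temps)
sales_ranks = ranks_descending(sales)

print(temp_ranks)                 # [1, 2, 3, 4, 5]
print(sales_ranks)                # [1, 2, 3, 4, 5]
print(temp_ranks == sales_ranks)  # True
```

Identical ranks show that the two variables rise and fall together. The Pearson Correlation also depends on how close the paired values are to a straight line, which the ranks alone do not capture.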
What would a perfectly negative correlation (-1.0) look like?
Ave Daily Temp    Rank    Ave Daily Ice Cream Sales    Rank
90°               1st     230                          5th
80°               2nd     320                          4th
70°               3rd     350                          3rd
60°               4th     480                          2nd
50°               5th     560                          1st
Meaning that higher values for one variable are associated with lower values for another variable
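As a rough check on the reversed data set, the sketch below computes the correlation directly. The function name `pearson_r` is an assumption, temperatures are read as 90 to 50 degrees, and the function uses the deviation form of the formula, which is algebraically the same as averaging the cross products of z scores.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation from deviations around each variable's mean."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross products of deviations, and sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


temps = [90, 80, 70, 60, 50]
sales_reversed = [230, 320, 350, 480, 560]  # highest sales on the coldest day

r = pearson_r(temps, sales_reversed)
print(round(r, 2))  # -0.99
```

The result is very close to, but not exactly, -1.0. The sales figures do not change by a constant amount for every 10 degrees, and only an exactly straight-line relationship gives a correlation of exactly -1.0.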
What would a zero correlation (0.0) look like?
[Figure: Ave Daily Temp and Ave Daily Ice Cream Sales values with rank labels]
If this is the result, then we can conclude that temperature and ice cream sales are independent of one another.
The Pearson Product Moment Correlation (PPMC) is calculated as the average cross product of the z-scores of two variables for a single group of observations. Here is the equation for the PPMC:
$$r = \frac{\sum (Z_X \cdot Z_Y)}{n}$$
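For reference, the pieces of this equation can be written out as follows. This is a sketch under one assumption: the standard deviation is the population form (dividing by n), which is consistent with the value of 14.1 used for the temperatures later in these slides.

```latex
Z_X = \frac{X - \bar{X}}{\sigma_X}, \qquad
\sigma_X = \sqrt{\frac{\sum (X - \bar{X})^2}{n}}, \qquad
r = \frac{\sum (Z_X \cdot Z_Y)}{n}
  = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{n \, \sigma_X \, \sigma_Y}
```

Here Z_Y and σ_Y are defined for the second variable in the same way as Z_X and σ_X, and n is the number of paired observations.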
Let’s calculate the Pearson Correlation for the following data set:
Ave Daily Temp    Ave Daily Ice Cream Sales
90°               560
80°               480
70°               350
60°               320
50°               230
It is important to note that the Pearson Correlation can be computed in a matter of seconds using statistical software. The next set of slides is designed to help you see what is happening conceptually as well as computationally with the Pearson Correlation.
When computing a Pearson Correlation you will normally have two variables that DO NOT USE THE SAME METRIC:
Ave Daily Temp    Ave Daily Ice Cream Sales
90°               560
80°               480
70°               350
60°               320
50°               230
The metric for Ave Daily Temp is degrees.
The metric for Ave Daily Ice Cream Sales is number of ice cream sales.
So we have to get these two variables on the same metric. This is done by calculating the z scores or standardized scores for the values from each variable.
So these raw score values in separate metrics are transformed into standardized values, which converts them into the same metric:
Different Metric (raw scores)
Ave Daily Temp    Ave Daily Ice Cream Sales
90°               560
80°               480
70°               350
60°               320
50°               230
Same Metric (z or standard scores)
Ave Daily Temp    Ave Daily Ice Cream Sales
+1.4              +1.5
+0.7              +0.8
 0.0              -0.3
-0.7              -0.6
-1.4              -1.3
• Note: this is done by subtracting the mean from each value (e.g., 90° minus 70° = 20°) and dividing the result by the standard deviation (e.g., 20 / 14.1 = 1.4)
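The standardization step can be reproduced in a few lines. This is a sketch, assuming the temperatures are 90 to 50 degrees and that the standard deviation is the population form (dividing by n), which is what gives 14.1 for the temperatures. The function name `z_scores` is illustrative.

```python
from math import sqrt


def z_scores(values):
    """Convert raw scores to z scores using the population standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((v - mean) ** 2 for v in values) / n)
    # Subtract the mean from each value, then divide by the standard deviation.
    return [(v - mean) / sd for v in values]


temps = [90, 80, 70, 60, 50]        # metric: degrees
sales = [560, 480, 350, 320, 230]   # metric: number of ice cream sales

print([round(z, 1) for z in z_scores(temps)])  # [1.4, 0.7, 0.0, -0.7, -1.4]
print([round(z, 1) for z in z_scores(sales)])  # [1.5, 0.8, -0.3, -0.6, -1.3]
```

After this step both variables have a mean of 0 and a standard deviation of 1, which is what puts them on the same metric.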
• Once the values are standardized we multiply them:
$$r = \frac{\sum (Z_X \cdot Z_Y)}{n}$$
Ave Daily Temp (z)    ×    Ave Daily Ice Cream Sales (z)    =    Cross Products
+1.4                  ×    +1.5                             =    2.1
+0.7                  ×    +0.8                             =    0.6
 0.0                  ×    -0.3                             =    0.0
-0.7                  ×    -0.6                             =    0.4
-1.4                  ×    -1.3                             =    1.9
These are called cross products because we are multiplying across two values. (The cross products shown are computed from the unrounded z scores, so a row such as -1.4 × -1.3 comes to 1.9 rather than 1.8.)
Then we sum the cross products: 2.1 + 0.6 + 0.0 + 0.4 + 1.9 = 5.0
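The multiply-and-sum step can be sketched as follows, again assuming temperatures of 90 to 50 degrees and population standard deviations. The function names are illustrative, and the code keeps full precision rather than rounding each z score to one decimal place.

```python
from math import sqrt


def z_scores(values):
    """Convert raw scores to z scores using the population standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]


def cross_products(xs, ys):
    """Multiply each pair of z scores across the two variables."""
    return [zx * zy for zx, zy in zip(z_scores(xs), z_scores(ys))]


temps = [90, 80, 70, 60, 50]
sales = [560, 480, 350, 320, 230]

products = cross_products(temps, sales)
total = sum(products)
r = total / len(products)   # average cross product

print(round(total, 2))  # 4.93
print(round(r, 2))      # 0.99
```

With each cross product rounded to one decimal place the sum comes to 5.0, as on the slide. With full precision it is about 4.93, so the correlation is about +0.99 rather than exactly +1.0.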
• Finally, divide that number (5.0) by the number of observations (in this case 5):
$$r = \frac{\sum (Z_X \cdot Z_Y)}{n} = \frac{5.0}{5} = +1.0$$
This is the Pearson Correlation, which in this case rounds to a perfect positive relationship. Without rounding the cross products to one decimal place, their sum is about 4.93 and r is about +0.99.
In summary:
The Pearson Product Moment Correlation can range from -1 through 0 to +1.
A correlation of 0.0 indicates no association between the variables of interest, hence independence.