pearson correlation simple linear regression chi-square
TRANSCRIPT
![Page 1: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/1.jpg)
By
Dr. Mojgan Afshari
Statistical Techniques to Explore Relationships among Variables
![Page 2: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/2.jpg)
Agenda
� Pearson Product-Moment Correlation
� Simple Liner Regression
� Chi-Square Test for Independence
n2
![Page 3: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/3.jpg)
Pearson Product-Moment Correlation
� Purpose – determine relationshipbetween two metric variables
� Requirement:DV -Interval/RatioIV -Interval/Ratio
n3
![Page 4: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/4.jpg)
Assumptions1. Level of measurement( IV and DV should be interval/ ratio)2. Independence of observations3. Normality4. Linearity� The relationship between the two variables should be linear.
This means that when you look at a scatterplot of scoresyou should see a straight line (roughly), not a curve.
5. Homoscedasticity� The variability in scores for variable X should be similar at
all values of variable Y. Check the scatterplot. It shouldshow a fairly even cigar shape along its length.
![Page 5: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/5.jpg)
Components of Pearson r analysis
Choose it
![Page 6: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/6.jpg)
Note
� Before performing a correlation analysis, it is agood idea to generate a scatterplot. Thisenables you to check for violation of theassumptions of linearity and homoscedasticity.
� Inspection of the scatterplots also gives you abetter idea of the nature of the relationshipbetween your variables.
n6
![Page 7: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/7.jpg)
Scatter plot/ Scatter gram
Strong positive
correlation
Strong negative
correlation
Weak positive
correlation
Weak negative
correlationX
Y
X
XX
Y
Y
Y
![Page 8: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/8.jpg)
n8
![Page 9: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/9.jpg)
� Pearson correlation coefficients (r) can onlytake on values from -1 to + l.
� The sign out the front indicates whether there isa positive correlation (an increase in X will alsoincrease in the value of Y) or a negativecorrelation (as one variable increases, the otherdecreases).
� The size of the absolute value provides anindication of the strength of the relationship.
![Page 10: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/10.jpg)
� A perfect correlation of 1 or -1 indicates thatthe value of one variable can be determinedexactly by knowing the value on the othervariable. A scatterplot of this relationshipwould show a straight line.
� A correlation of 0 indicates no relationshipbetween the two variables.
n10
![Page 11: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/11.jpg)
1. Descriptive
n11
![Page 12: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/12.jpg)
n12
![Page 13: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/13.jpg)
2. Inferential
n13
Hypothesis Test
![Page 14: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/14.jpg)
Excel
![Page 15: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/15.jpg)
n15
2. Set confidence intervalGenerally, confident level is set at .05
![Page 16: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/16.jpg)
n16
![Page 17: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/17.jpg)
n17
![Page 18: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/18.jpg)
n18
It can be concluded that there is not significant relationship between IV and DV at 0.05 level of
significance
It can be concluded that there is significant relationship between IV and DV at 0.05 level of
significance
![Page 19: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/19.jpg)
![Page 20: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/20.jpg)
1. Data were collected from a randomly selected sampleto determine relationship between average assignmentand test scores in statistics. Distribution for the data ispresented in the table below. Assuming the data arenormally distributed,
1) Plot a scatter diagram to represent the followingpair of scores of the two variables.
2) Calculate an appropriate correlation coefficient,3) Describe the nature of relationship between the two
variables, and4) Test the hypothesis on the relationship
at .01 level of significance.
Exercise
![Page 21: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/21.jpg)
2) Explain the following concept. You may usegraphs to illustrate each concept
a) Perfect positive linear correlationb) Perfect negative linear correlationc) Strong positive linear correlationd) Strong negative linear correlatione) Weak positive linear correlationf) Weak negative linear correlation g) No linear correlation
![Page 22: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/22.jpg)
3) For a sample data set, the linear correlation coefficient r has a positive value.Which of the following is true about the slope b of the regression line estimated for the same sample data?
a) The value of b will be positiveb) The value of b will be negativec) The value of b can be positive or negative
n22
![Page 23: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/23.jpg)
� 3) The data on ages (in years) and prices (in hundred of dollars for eight cars of a specific model) are shown below:Age: 8 3 6 9 2 5 6 3
Prices: 18 94 50 21 145 42 36 99
a) Do you expect the ages and prices of cars to bepositively or negatively related? Explain.b) Calculate the linear correlation coefficient.c) Test at the 5% significance level whether ρ is
negative
![Page 24: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/24.jpg)
4) The following table lists the Advertising and Marketing scores for 7 students in a statistics class.� Advertising score: 79 95 81 66 87 94 59� Marketing score: 85 97 78 76 94 84 67
a) Do you expect the Advertising and Marketing scores to bepositively or negatively correlated?
b) Plot a scatter diagram. By looking at the scatter diagram, do you expect the correlation coefficient between these 2 variables to be close to zero, 1, or -1.c) Find the correlation coefficient. Is the value of r consistent with what you expected in parts a and b?d) Using the 1% significance level, test whether the linear correlation coefficient is Positive
![Page 25: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/25.jpg)
X= PriceY= Sale
5)
![Page 26: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/26.jpg)
n26
Simple Linear Regression
![Page 27: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/27.jpg)
IntroductionPurpose� To determine relationship between IV and DV� To predict value of the dependent variable (Y)
based on value of independent variable (X)� To assess how well the dependent variable can
be explained by knowing the value of the independent variable
Requirement� Scales of measurement for variables:DV -interval or ratioIV -interval or ratio
![Page 28: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/28.jpg)
Concepts of
Simple Linear Regression
n28
![Page 29: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/29.jpg)
Descriptive
n29
![Page 30: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/30.jpg)
Bases for Best Fit Line� Use Scatter Plot to identify the line of best fit� The line is also called the least squares
regression line� The purpose of this line is to show the overall
trend or pattern in the data and to allow the reader to make predictions about future trends in the data.
n30
![Page 31: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/31.jpg)
Prediction equation
Regression Model involves two statistics andparameters, namely: Intercept and slope ofthe regression line.
![Page 32: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/32.jpg)
Inferential
n32
![Page 33: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/33.jpg)
�Evaluation of the regression model(prediction equation) and slope of theregression line involve testing ofhypothesis.
�The testing of hypothesis for predictionequation involves the F test while thetesting of slop entails t test
n33
![Page 34: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/34.jpg)
n34
![Page 35: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/35.jpg)
� The purpose of evaluating the regression model or the prediction equation is to determine whether the model really fits the data.
![Page 36: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/36.jpg)
n36
![Page 37: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/37.jpg)
The purpose to test hypothesis regarding theregression slope is to determine the significance ofthe relationship between IV and DV.
![Page 38: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/38.jpg)
ExampleData were collected from a randomly selectedsample to determine relationship between averageassignment scores and test scores in statistics.Distribution for the data is presented in the tablebelow.1.Calculate b1and b0 and derive the prediction equation2.Test the hypothesis for the regression model at α= .053.What the values of coefficient of determination ( R2) and multiple correlation coefficient ( r). Interpret the two values.4.Test hypothesis for the slope at .05
![Page 39: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/39.jpg)
Chi-SquareTest for Independence
![Page 40: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/40.jpg)
Chi-square test for independence
� Explore the association between twocategorical variables. Each of these variablescan have two or more categories.
n40
Independent variable(Nominal/ Ordinal)
Dependent variable(Nominal/ Ordinal)
![Page 41: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/41.jpg)
Example
� To explore the association between Gender ( Male / Female) and Smoking Behaviour ( Smoker/Non-Smoker).
� IV: Gender ( Male/ Female)DV: Smoking Behaviour ( Smoker / Non Smoker)
n41
![Page 42: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/42.jpg)
� Research questions:There are a variety of ways questions can bephrased:
1. Is there an association between gender andsmoking behaviour?
2. Are males more likely to be smokers thanfemales?
3. Is the proportion of males that smoke thesame as the proportion of females?
n42
![Page 43: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/43.jpg)
Minimum Expected Cell Frequency Assumption
The lowest expected frequency in any cell should be5 or more. Some authors suggest less stringentcriteria: at least 80 per cent of cells should haveexpected frequencies of 5 or more.
We could not use"# for the following case:
n43
OBSERVED EXPECTEDAGREE 247 255
DISAGREE 6 4
![Page 44: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/44.jpg)
What to Expect?H
ypot
hesi
s Te
st
StateH0 and HA
Calculate $2
1
3
Critical Value
2
Decision Conclusion
54
Measures of association
●Phi●Contingency● Cramer’s V
![Page 45: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/45.jpg)
Step in Testing Hypothesis
1. State the null and alternative hypothesesHO: DV is independent of IVHA: DV is dependent on IV
2. Calculate the test statistic$2 value� Contingency table –two variables� Calculation based on:
O-Observed frequencyE-Expected frequency
![Page 46: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/46.jpg)
� Summary of the table to calculate the chi-square statistics
$2
![Page 47: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/47.jpg)
3. Determine critical value•α•df= (R –1) (C –1)
4.Make your decision5.Make conclusion
![Page 48: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/48.jpg)
Effect Size� To determine the strength and magnitude of the
association between two variables.Effect size statistics: � Phi coefficient (2 by 2 tables)� Cramer's V (Tables larger than 2 by 2)� Contingency coefficient ( Tables larger than 2 by 2)
n48
Contingency
![Page 49: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/49.jpg)
n49
![Page 50: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/50.jpg)
Example 1:� A study was conducted to test the association
between Firm size and cloud computing adoption.Data collected from a randomly selected samplefollow.
� 1. Test the hypothesis on the association between thetwo variables at .01 level of significance.
� 2. Calculate and describe an appropriate measure ofassociation between the two variables.
n50
![Page 51: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/51.jpg)
1. Hypotheses testinga. State H0 and HA
� H0: Cloud computing adoption is independent of Firm size� HA: Cloud computing adoption is dependent on Firm size
b. Test statisticCalculate expected value for each cell:
![Page 52: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/52.jpg)
n52
Group
![Page 53: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/53.jpg)
![Page 54: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/54.jpg)
![Page 55: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/55.jpg)
a. State H0 and HA
![Page 56: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/56.jpg)
![Page 57: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/57.jpg)
![Page 58: Pearson Correlation Simple Linear Regression Chi-Square](https://reader031.vdocuments.site/reader031/viewer/2022022217/62148a800be6003f2124d7e8/html5/thumbnails/58.jpg)
n58