linear regression
Linear Regression
William P. Wattles, Ph.D.
Psychology 302
Correlation
• Teen birth rate correlated with our composite religiosity variable with r = 0.73; 95% CI (0.56, 0.84); n = 49; p < 0.0005. Thus teen birth rate is very highly correlated with religiosity at the state level, with more religious states having a higher rate of teen birth. A scatter plot of teen birth rate as a function of religiosity is presented in Figure 1. http://www.reproductive-health-journal.com/content/6/1/14
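The confidence interval quoted above can be approximated from r and n alone. The sketch below uses the Fisher z-transformation, which is one standard way to build such an interval; whether the cited paper used this exact method is an assumption, and the function name `fisher_ci` is made up for illustration.

```python
import math

def fisher_ci(r, n, z_crit=1.96):
    """Approximate 95% CI for a correlation using the Fisher z-transformation.

    Assumption: this is a common textbook method, not necessarily the one
    used in the cited paper.
    """
    z = math.atanh(r)                 # move r onto the z scale
    se = 1.0 / math.sqrt(n - 3)       # standard error on the z scale
    lo = math.tanh(z - z_crit * se)   # back-transform the lower bound
    hi = math.tanh(z + z_crit * se)   # back-transform the upper bound
    return lo, hi

# Values reported on the slide: r = 0.73, n = 49.
lo, hi = fisher_ci(0.73, 49)
print(round(lo, 2), round(hi, 2))
```

With the slide's values, the bounds come out close to the reported interval of (0.56, 0.84).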
• “Victor, when will you stop trying to remember and start trying to think?” --Helen Boyden
• You can use linear regression to answer the following questions about the pattern of data points and the significance of a linear equation:
• 1. Is a pattern evident in a set of data points?
• 2. Does the equation of a straight line describe this pattern?
• 3. Are the predictions made from this equation significant?
• Using Regression to predict college performance and college satisfaction.
Dependent and Independent Variables
• Dependent Variable (or Criterion Variable): the variable whose variation we want to explain.
• Independent Variable (or Predictor Variable): a variable that is related to or predicts variation in the dependent variable.
Examples
• SAT score, college GPA
• Alcohol consumed, score on a driving test
• Type of car, qualifying speed
• Level of education, income
• Number of boats registered, deaths of manatees
Correlation
• The relationship between two variables X and Y.
• In general, are changes in X associated with changes in Y?
• If so, we say that X and Y covary.
• We can observe correlation by looking at a scatter plot.
Correlation example
• Is number of beers consumed associated with blood alcohol level?
[Figure: Beer consumption and Blood Alcohol Content]
Correlation
• Correlation coefficient tells us the strength and direction of the relationship between two variables.
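As a rough illustration of how a correlation coefficient captures strength and direction, here is a minimal sketch that computes Pearson's r from deviations about the means. The beer and blood-alcohol numbers are hypothetical, invented only for this example, and `pearson_r` is an illustrative name.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: deviation products over the root of the squared deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (positive when X and Y move together).
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

# Hypothetical data, not taken from the slides.
beers = [1, 2, 3, 4, 5]
bac = [0.02, 0.03, 0.05, 0.07, 0.08]

r = pearson_r(beers, bac)
print(f"r = {r:.3f}")
```

The sign of r gives the direction (here positive: more beers, higher blood alcohol), and its distance from zero gives the strength.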
Prediction
• If two variables are related, then knowing a value for one should allow us to predict the value of the other.
Regression
• Allows us to predict one variable based on the value of another.
Regression
• Using knowledge of the relationship between X and Y to predict Y given X.
• X: the independent variable (predictor), used to explain changes in Y
• Y: the dependent variable (criterion)
Linear regression
• Regression line: a straight line through the scatter plot that best describes the relationship.
• Regression line: predicts the value of Y for a given value of X.
Regression Line
• A straight line that describes how a dependent variable changes as the independent variable changes.
Least squares regression
• A method of determining the regression line that minimizes the errors (residuals)
Least squares regression
• Residual: the error, or the amount by which an observed value deviates from the regression line.
• Goal: find the line that minimizes the sum of the squared residuals
• Least squares (the smallest possible sum of the squared residuals)
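To make "smallest possible sum of the squared residuals" concrete, the sketch below compares the least-squares line for a small hypothetical data set with a few nearby lines. The data, the alternative lines, and the function name are all invented for illustration; the least-squares values 2.2 and 0.6 were worked out by hand for this data.

```python
def sum_squared_residuals(xs, ys, a, b):
    """Sum of squared residuals for the line y_hat = a + b * x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data, invented for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# Least-squares intercept and slope for this data (computed by hand).
best_a, best_b = 2.2, 0.6
best_sse = sum_squared_residuals(xs, ys, best_a, best_b)

# Nearby lines with a slightly different intercept or slope.
alternatives = [(2.0, 0.6), (2.4, 0.6), (2.2, 0.5), (2.2, 0.7)]
for a, b in alternatives:
    sse = sum_squared_residuals(xs, ys, a, b)
    print(f"a={a}, b={b}: SSE={sse:.2f} (least-squares SSE={best_sse:.2f})")
```

Every alternative line gives a larger sum of squared residuals than the least-squares line, which is what the method guarantees.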
Least squares regression
• a is the intercept: the value of Y when X = 0
• b is the slope: the change in Y when X increases by 1
Regression formula
• $a = \bar{Y} - b\bar{X}$
• $b$ = (sum of deviation products) / (sum of squared X deviations)
Berk & Carey page 240
Mortality vs. Temperature (Berk & Carey page 303)
[Figure: mortality index (y axis) plotted against temperature (x axis), with fitted line y = 2.3577x - 21.795, R² = 0.7654]
Simple Linear Regression
$\hat{y} = a + bx$
The Regression Equation
• x: the independent variable, the predictor
• y: the dependent variable, what we want to predict
• a: the intercept
• b: the slope
Calculating the least-squares regression.
$b = \dfrac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2}$
$a = \bar{y} - b\bar{x}$
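The slope is the sum of deviation products divided by the sum of squared x deviations, and the intercept is the mean of y minus the slope times the mean of x. A minimal sketch of that calculation follows; the function name and the data are hypothetical, chosen so the answer is easy to check by hand.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least-squares line y_hat = a + b * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Numerator: sum of products of x and y deviations from their means.
    sp = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    # Denominator: sum of squared x deviations.
    ss_x = sum((x - x_bar) ** 2 for x in xs)
    b = sp / ss_x
    a = y_bar - b * x_bar
    return a, b

# Hypothetical points that lie exactly on y = 1 + 2x.
xs = [0, 1, 2, 3]
ys = [1, 3, 5, 7]

a, b = least_squares_line(xs, ys)
print(f"intercept a = {a}, slope b = {b}")
```

Because the points fall exactly on a line, the fitted intercept and slope recover that line.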
Population
• β (beta): slope
• α (alpha): intercept
Sample
• b: slope
• a: intercept
• Crying and IQ page 600
Relationship
• The scatterplot suggests a relationship between crying and IQ.
• Can use knowledge of crying to predict IQ
• What would null say?
Null says: “It’s nothing but sampling error.”
H0
Ha
• Babies who cry easily may be more easily stimulated and have higher IQs
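One way to see what "nothing but sampling error" would look like is a permutation test: shuffle the Y values so that any real link to X is broken, and count how often chance alone produces a correlation as strong as the observed one. This is a swapped-in illustration, not the test used in the course, and the crying and IQ numbers below are hypothetical.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

def permutation_p_value(xs, ys, n_shuffles=2000, seed=0):
    """Share of shuffles whose |r| is at least as large as the observed |r|."""
    rng = random.Random(seed)        # fixed seed so the run is repeatable
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)        # break any real X-Y pairing
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    return hits / n_shuffles

# Hypothetical data, not taken from the textbook.
crying = [5, 9, 12, 15, 18, 20, 23, 27]
iq = [92, 101, 104, 112, 115, 123, 124, 135]

p = permutation_p_value(crying, iq)
print(f"approximate p-value: {p}")
```

A small p-value means shuffled data rarely matches the observed correlation, so the "sampling error only" explanation looks implausible.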
Steps to Analyze Regression Data
• Plot and interpret
• Numerical summary
• Mathematical model
Plot and Interpret
• Plot independent variable on the X axis
• Plot dependent variable on the Y axis.
• Examine form, direction and strength of relationship
Numerical Summary
• Correlation coefficient tells direction and strength of relationship.
r = +.455
r squared
• r²: percent of variance in Y explained by X.
• r² = (.455)² ≈ .207, or about 21%
Mathematical Model
• Use model to predict IQ based on knowledge of crying
• Least squares regression line.
• Y predicted = a + bx
• a (the intercept) = 91.27
• b (the slope) = 1.493
$\hat{y} = 91.27 + 1.493x$
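Plugging a value of x into the fitted line gives a predicted IQ. The sketch below uses the intercept and slope from the slide; the input value 10 is hypothetical, and the name `predict_iq` and the exact crying measure (whatever the textbook records as x) are assumptions.

```python
# Intercept and slope as reported on the slide.
INTERCEPT = 91.27
SLOPE = 1.493

def predict_iq(crying):
    """Predicted IQ from the least-squares line y_hat = a + b * x."""
    return INTERCEPT + SLOPE * crying

# Hypothetical crying score, chosen only to show the arithmetic.
predicted = predict_iq(10)
print(round(predicted, 2))  # 91.27 + 1.493 * 10 = 106.2
```

Each additional unit of crying raises the predicted IQ by the slope, 1.493 points.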
Excel Output
Sample Statistics
• The slope and intercept are statistics because they are calculated on the sample.
• We are really interested in estimating the population parameters
Population Parameter
Sample Statistic
Residuals
• Residual: the difference between the observed value of the dependent variable and the value predicted by the regression line.
$\text{residual} = y - \hat{y}$
Coefficient of determination
• R²: the square of the correlation coefficient.
• The proportion of the variation in Y that can be explained by changes in X
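For a simple linear regression, the coefficient of determination can also be computed from the residuals as 1 minus the residual sum of squares over the total sum of squares, and it matches the squared correlation. A minimal sketch with hypothetical data (the function name and numbers are invented for illustration):

```python
def r_squared(xs, ys):
    """Return (R squared, residuals) for the least-squares line through the data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_x = sum((x - x_bar) ** 2 for x in xs)
    sp = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sp / ss_x                     # least-squares slope
    a = y_bar - b * x_bar             # least-squares intercept
    # Residual = observed y minus predicted y.
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    ss_residual = sum(e ** 2 for e in residuals)
    ss_total = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_residual / ss_total, residuals

# Hypothetical data, invented for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r2, residuals = r_squared(xs, ys)
print(f"R squared = {r2:.2f}")
print("residuals:", [round(e, 2) for e in residuals])
```

For this data R² comes out at 0.6, meaning the line accounts for 60% of the variation in Y, and the residuals sum to zero, as they always do for a least-squares line with an intercept.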
Regression and correlation
• Correlation tells us about the relationship
• Regression allows us to predict Y if we know X
Serotonin
• 5-HT levels predict mood in healthy males.
• SSRI, Zoloft, Prozac
Privitera page 531
• Do levels of serotonin predict positive mood in subjects?
Exam 2 as predictor
Exam 1 as predictor
Using the regression equation
• Exam 1 84%
• exam1pred 80.8%
• Exam 2 68%
• exam2 pred 69.3%
Non-exercise activity and weight gain
• Does appraised value predict selling price?
• Page 622.
Francis Marion Univ.
• http://vimeo.com/39111127
Final Exam
• 81 questions
• All multiple choice
• chi square
• independent t-test
• matched pairs
• regression
• single-sample t-test
Time at table
• Does time at the lunch table predict how much young children eat?
• Page 629.
Arctic Rivers
• Page 604
• Do data suggest a change in discharge over time?
• Page 630: Does pine cone count predict number of offspring in squirrels?
Low variability
High Variability
The End