political science 209 - fall 2018 - linear...
TRANSCRIPT
![Page 1: Political Science 209 - Fall 2018 - Linear Regressionfhollenbach.org/Polisci209_2018/slides/week8/week8.pdf · Political Science 209 - Fall 2018 Linear Regression Florian Hollenbach](https://reader034.vdocuments.site/reader034/viewer/2022050413/5f89ef16de94454ebc7a7b30/html5/thumbnails/1.jpg)
Political Science 209 - Fall 2018
Linear Regression
Florian Hollenbach
22nd October 2018
![Page 2: Political Science 209 - Fall 2018 - Linear Regressionfhollenbach.org/Polisci209_2018/slides/week8/week8.pdf · Political Science 209 - Fall 2018 Linear Regression Florian Hollenbach](https://reader034.vdocuments.site/reader034/viewer/2022050413/5f89ef16de94454ebc7a7b30/html5/thumbnails/2.jpg)
In-class Exercise Linear Regression
Please dowload intrade08.csv & pres08.csv from class website
• Read both data sets into R
• Create data summary for each data sets
Florian Hollenbach 1
![Page 3: Political Science 209 - Fall 2018 - Linear Regressionfhollenbach.org/Polisci209_2018/slides/week8/week8.pdf · Political Science 209 - Fall 2018 Linear Regression Florian Hollenbach](https://reader034.vdocuments.site/reader034/viewer/2022050413/5f89ef16de94454ebc7a7b30/html5/thumbnails/3.jpg)
Variables in the intrade data
• day: Date of the session• statename: Full name of each state (including District of
Columbia in 2008)• state: Abbreviation of each state (including District of
Columbia in 2008)• PriceD: Closing price (predicted vote share) of Democratic
Nominee’s market• PriceR: Closing price (predicted vote share) of Republican
Nominee’s market• VolumeD: Total session trades of Democratic Party Nominee’s
market• VolumeR: Total session trades of Republican Party Nominee’s
market
Florian Hollenbach 2
![Page 4: Political Science 209 - Fall 2018 - Linear Regressionfhollenbach.org/Polisci209_2018/slides/week8/week8.pdf · Political Science 209 - Fall 2018 Linear Regression Florian Hollenbach](https://reader034.vdocuments.site/reader034/viewer/2022050413/5f89ef16de94454ebc7a7b30/html5/thumbnails/4.jpg)
Variables in the pres08 data
• state.name: Full name of state (only in pres2008)
• state: Two letter state abbreviation
• Obama: Vote percentage for Obama
• McCain: Vote percentage for McCain
• EV: Number of electoral college votes for this state
Florian Hollenbach 3
![Page 5: Political Science 209 - Fall 2018 - Linear Regressionfhollenbach.org/Polisci209_2018/slides/week8/week8.pdf · Political Science 209 - Fall 2018 Linear Regression Florian Hollenbach](https://reader034.vdocuments.site/reader034/viewer/2022050413/5f89ef16de94454ebc7a7b30/html5/thumbnails/5.jpg)
Combining data sets
• First we have to combine the different data sets
• To do so, we need an identifier that tells R which observationsto match to each other
• What could we use?
state variable
Florian Hollenbach 4
![Page 6: Political Science 209 - Fall 2018 - Linear Regressionfhollenbach.org/Polisci209_2018/slides/week8/week8.pdf · Political Science 209 - Fall 2018 Linear Regression Florian Hollenbach](https://reader034.vdocuments.site/reader034/viewer/2022050413/5f89ef16de94454ebc7a7b30/html5/thumbnails/6.jpg)
Combining data sets
• First we have to combine the different data sets
• To do so, we need an identifier that tells R which observationsto match to each other
• What could we use?
state variable
Florian Hollenbach 4
![Page 7: Political Science 209 - Fall 2018 - Linear Regressionfhollenbach.org/Polisci209_2018/slides/week8/week8.pdf · Political Science 209 - Fall 2018 Linear Regression Florian Hollenbach](https://reader034.vdocuments.site/reader034/viewer/2022050413/5f89ef16de94454ebc7a7b30/html5/thumbnails/7.jpg)
Combining data sets
• Use merge() function
merge(x,y, by =)intresults08 <- merge(intrade08, pres08, by = "state")head(intresults08)
Florian Hollenbach 5
![Page 8: Political Science 209 - Fall 2018 - Linear Regressionfhollenbach.org/Polisci209_2018/slides/week8/week8.pdf · Political Science 209 - Fall 2018 Linear Regression Florian Hollenbach](https://reader034.vdocuments.site/reader034/viewer/2022050413/5f89ef16de94454ebc7a7b30/html5/thumbnails/8.jpg)
Question 1
Create a DaysToElection variable by subtracting the day of theelection from each day in the dataset. Now create a state margin ofvictory variable to predict, and a betting market margin to predictit with.
election day in 2008: Nov, 4th
Florian Hollenbach 6
![Page 9: Political Science 209 - Fall 2018 - Linear Regressionfhollenbach.org/Polisci209_2018/slides/week8/week8.pdf · Political Science 209 - Fall 2018 Linear Regression Florian Hollenbach](https://reader034.vdocuments.site/reader034/viewer/2022050413/5f89ef16de94454ebc7a7b30/html5/thumbnails/9.jpg)
Solution 1
intresults08$DaysToElection<- as.Date("2008-11-04") - as.Date(intresults08$day)
intresults08$obama.intmarg <- intresults08$PriceD - intresults08$PriceRintresults08$obama.actmarg <- intresults08$Obama - intresults08$McCain
Florian Hollenbach 7
![Page 10: Political Science 209 - Fall 2018 - Linear Regressionfhollenbach.org/Polisci209_2018/slides/week8/week8.pdf · Political Science 209 - Fall 2018 Linear Regression Florian Hollenbach](https://reader034.vdocuments.site/reader034/viewer/2022050413/5f89ef16de94454ebc7a7b30/html5/thumbnails/10.jpg)
Question 2
Considering only the trading one day from the election, predict theactual electoral margins from the trading margins using a linearmodel. Does it predict well? How would you visualize thepredictions and the outcomes together? Hint: because we onlyhave one predictor you can use abline.
Florian Hollenbach 8
![Page 11: Political Science 209 - Fall 2018 - Linear Regressionfhollenbach.org/Polisci209_2018/slides/week8/week8.pdf · Political Science 209 - Fall 2018 Linear Regression Florian Hollenbach](https://reader034.vdocuments.site/reader034/viewer/2022050413/5f89ef16de94454ebc7a7b30/html5/thumbnails/11.jpg)
Solution 2
latest08 <- intresults08[intresults08$DaysToElection == 1,]int.fit08 <- lm(obama.actmarg ~ obama.intmarg, data = latest08)coef(int.fit08)summary(int.fit08)$r.squaredplot(latest08$obama.intmarg, latest08$obama.actmarg,
xlab="Market’s margin for Obama", ylab="Obama margin")abline(int.fit08)
Florian Hollenbach 9
![Page 12: Political Science 209 - Fall 2018 - Linear Regressionfhollenbach.org/Polisci209_2018/slides/week8/week8.pdf · Political Science 209 - Fall 2018 Linear Regression Florian Hollenbach](https://reader034.vdocuments.site/reader034/viewer/2022050413/5f89ef16de94454ebc7a7b30/html5/thumbnails/12.jpg)
Question 3
What would be the prediction for the margin of victory if theInTrade margin was 25? Mark this point on the previous plot.
Florian Hollenbach 10
![Page 13: Political Science 209 - Fall 2018 - Linear Regressionfhollenbach.org/Polisci209_2018/slides/week8/week8.pdf · Political Science 209 - Fall 2018 Linear Regression Florian Hollenbach](https://reader034.vdocuments.site/reader034/viewer/2022050413/5f89ef16de94454ebc7a7b30/html5/thumbnails/13.jpg)
Solution 3
coef(int.fit08)[1] + coef(int.fit08)[2]*25
plot(latest08$obama.intmarg, latest08$obama.actmarg,xlab="Market’s margin for Obama", ylab="Obama margin")
abline(int.fit08)points(25,(coef(int.fit08)[1] + coef(int.fit08)[2]*25), col = "red")
Florian Hollenbach 11
![Page 14: Political Science 209 - Fall 2018 - Linear Regressionfhollenbach.org/Polisci209_2018/slides/week8/week8.pdf · Political Science 209 - Fall 2018 Linear Regression Florian Hollenbach](https://reader034.vdocuments.site/reader034/viewer/2022050413/5f89ef16de94454ebc7a7b30/html5/thumbnails/14.jpg)
Question 4
Even efficient markets aren’t omniscient. Information comes in about the electionevery day and the market prices should reflect any change in information that seemto matter to the outcome.
We can examine how and about what the markets change their minds by looking atwhich states they are confident about, and which they update their ‘opinions’ (i.e.their prices) about. Over the period before the election, let’s see how prices for eachstate are evolving. We can get a compact summary of price movement by fitting alinear model to Obama’s margin for each state over the 20 days before the election.
We will summarise price movement by the direction (up or down) and rate of change(large or small) of price over time. This is basically also what people in finance do,but they get paid more. . .
Start by plotting Obama’s margin in West Virginia against the number of days untilthe election and modeling the relationship with a linear model. Use the last 20 days.Show the model’s predictions on each day and the data. What does this model’sslope coefficient tells us about which direction the margin is changing and also howfast it is changing?
Florian Hollenbach 12
![Page 15: Political Science 209 - Fall 2018 - Linear Regressionfhollenbach.org/Polisci209_2018/slides/week8/week8.pdf · Political Science 209 - Fall 2018 Linear Regression Florian Hollenbach](https://reader034.vdocuments.site/reader034/viewer/2022050413/5f89ef16de94454ebc7a7b30/html5/thumbnails/15.jpg)
Solution 4
stnames <- unique(intresults08$state.name)recent <- subset(intresults08, subset=(DaysToElection <= 20)& (state.name==stnames[1]))
recent.mod <- lm(obama.intmarg ~ DaysToElection, data=recent)plot(recent$DaysToElection, recent$obama.intmarg,xlab="Days to election", ylab="Market’s Obama margin")abline(recent.mod)
Florian Hollenbach 13
![Page 16: Political Science 209 - Fall 2018 - Linear Regressionfhollenbach.org/Polisci209_2018/slides/week8/week8.pdf · Political Science 209 - Fall 2018 Linear Regression Florian Hollenbach](https://reader034.vdocuments.site/reader034/viewer/2022050413/5f89ef16de94454ebc7a7b30/html5/thumbnails/16.jpg)
Question 5
Let’s do the same thing for all states and collect the slopecoefficients (β’s). How can we modify the code from the answer tothe previous question? Then plot the distribution of changes for allstates.
Florian Hollenbach 14
![Page 17: Political Science 209 - Fall 2018 - Linear Regressionfhollenbach.org/Polisci209_2018/slides/week8/week8.pdf · Political Science 209 - Fall 2018 Linear Regression Florian Hollenbach](https://reader034.vdocuments.site/reader034/viewer/2022050413/5f89ef16de94454ebc7a7b30/html5/thumbnails/17.jpg)
Solution 5
stnames <- unique(intresults08$state.name)change <- rep(NA, length(unique(intresults08$state.name)))names(change) <- unique(intresults08$state.name)
for(i in 1: length(unique(intresults08$state.name))){recent <- subset(intresults08, subset=(DaysToElection <= 20)& (state.name==stnames[i]))
recent.mod <- lm(obama.intmarg ~ DaysToElection, data=recent)change[i] <- coef(recent.mod)[2]}hist(change)
Florian Hollenbach 15
![Page 18: Political Science 209 - Fall 2018 - Linear Regressionfhollenbach.org/Polisci209_2018/slides/week8/week8.pdf · Political Science 209 - Fall 2018 Linear Regression Florian Hollenbach](https://reader034.vdocuments.site/reader034/viewer/2022050413/5f89ef16de94454ebc7a7b30/html5/thumbnails/18.jpg)
Questin 5
Estimate a linear model using the intrade margin in the averageintrade margin in the week before the election to predict votemargin in 2008. How well does the model predict?
Florian Hollenbach 16
![Page 19: Political Science 209 - Fall 2018 - Linear Regressionfhollenbach.org/Polisci209_2018/slides/week8/week8.pdf · Political Science 209 - Fall 2018 Linear Regression Florian Hollenbach](https://reader034.vdocuments.site/reader034/viewer/2022050413/5f89ef16de94454ebc7a7b30/html5/thumbnails/19.jpg)
Solution 5
latest08 <- intresults08[intresults08$DaysToElection <8,]average.Intrade <- tapply(latest08$obama.intmarg, latest08$state, mean)true.margin <- tapply(latest08$obama.actmarg, latest08$state, mean)
int.fit08 <- lm(true.margin ~ average.Intrade)coef(int.fit08)summary(int.fit08)$r.squared
Florian Hollenbach 17
![Page 20: Political Science 209 - Fall 2018 - Linear Regressionfhollenbach.org/Polisci209_2018/slides/week8/week8.pdf · Political Science 209 - Fall 2018 Linear Regression Florian Hollenbach](https://reader034.vdocuments.site/reader034/viewer/2022050413/5f89ef16de94454ebc7a7b30/html5/thumbnails/20.jpg)
Question 6
Next, we read in the same data for the 2012 election. Use thelinear model created above to create predictions for the margin in2012. Calculate and plot the prediction error.
Florian Hollenbach 18
![Page 21: Political Science 209 - Fall 2018 - Linear Regressionfhollenbach.org/Polisci209_2018/slides/week8/week8.pdf · Political Science 209 - Fall 2018 Linear Regression Florian Hollenbach](https://reader034.vdocuments.site/reader034/viewer/2022050413/5f89ef16de94454ebc7a7b30/html5/thumbnails/21.jpg)
Solution 6
data2012 <- read.csv("intresults12.csv")data2012$DaysToElection <- as.Date("2008-11-06") - as.Date(data2012$day)data2012$obama.intmarg <- data2012$PriceD - data2012$PriceRdata2012$obama.actmarg <- data2012$Obama - data2012$Romney
Florian Hollenbach 19
![Page 22: Political Science 209 - Fall 2018 - Linear Regressionfhollenbach.org/Polisci209_2018/slides/week8/week8.pdf · Political Science 209 - Fall 2018 Linear Regression Florian Hollenbach](https://reader034.vdocuments.site/reader034/viewer/2022050413/5f89ef16de94454ebc7a7b30/html5/thumbnails/22.jpg)
Solution 6
latest12<- data2012[data2012$DaysToElection <8,]
average.Intrade12<- tapply(latest12$obama.intmarg, latest12$state, mean, na.rm = T)
true.margin12<- tapply(latest12$obama.actmarg, latest12$state, mean, na.rm = T)
prediction<- coef(int.fit08)[1] + coef(int.fit08)[2]*average.Intrade12
error <- true.margin12 - predictionhist(error)
Florian Hollenbach 20
![Page 23: Political Science 209 - Fall 2018 - Linear Regressionfhollenbach.org/Polisci209_2018/slides/week8/week8.pdf · Political Science 209 - Fall 2018 Linear Regression Florian Hollenbach](https://reader034.vdocuments.site/reader034/viewer/2022050413/5f89ef16de94454ebc7a7b30/html5/thumbnails/23.jpg)
Linear Regression and RCTs
Can we estimate regression models on data from experiments?
Yes, treatment status as the independent variable (0 or 1)
Florian Hollenbach 21
![Page 24: Political Science 209 - Fall 2018 - Linear Regressionfhollenbach.org/Polisci209_2018/slides/week8/week8.pdf · Political Science 209 - Fall 2018 Linear Regression Florian Hollenbach](https://reader034.vdocuments.site/reader034/viewer/2022050413/5f89ef16de94454ebc7a7b30/html5/thumbnails/24.jpg)
Linear Regression and RCTs
Can we estimate regression models on data from experiments?
Yes, treatment status as the independent variable (0 or 1)
Florian Hollenbach 21
![Page 25: Political Science 209 - Fall 2018 - Linear Regressionfhollenbach.org/Polisci209_2018/slides/week8/week8.pdf · Political Science 209 - Fall 2018 Linear Regression Florian Hollenbach](https://reader034.vdocuments.site/reader034/viewer/2022050413/5f89ef16de94454ebc7a7b30/html5/thumbnails/25.jpg)
Linear Regression and RCTs
• y = α + β * treatment + ε
• What is the interpretation of α here?
• What is the interpretation of β?
Florian Hollenbach 22
![Page 26: Political Science 209 - Fall 2018 - Linear Regressionfhollenbach.org/Polisci209_2018/slides/week8/week8.pdf · Political Science 209 - Fall 2018 Linear Regression Florian Hollenbach](https://reader034.vdocuments.site/reader034/viewer/2022050413/5f89ef16de94454ebc7a7b30/html5/thumbnails/26.jpg)
Linear Regression and RCTs
• y = α + β * treatment + ε
• What is the interpretation of α here?
• What is the interpretation of β?
Florian Hollenbach 22
![Page 27: Political Science 209 - Fall 2018 - Linear Regressionfhollenbach.org/Polisci209_2018/slides/week8/week8.pdf · Political Science 209 - Fall 2018 Linear Regression Florian Hollenbach](https://reader034.vdocuments.site/reader034/viewer/2022050413/5f89ef16de94454ebc7a7b30/html5/thumbnails/27.jpg)
Linear Regression and RCTs
• y = α + β * treatment + ε
• β = average treatment effect
• The two predicted values are the average outcome under eachcondition
• β: Predicted change in Y caused by increase of T by 1
Remember, generally regression coefficents are not to beinterpreted as causal effects!
Florian Hollenbach 23
![Page 28: Political Science 209 - Fall 2018 - Linear Regressionfhollenbach.org/Polisci209_2018/slides/week8/week8.pdf · Political Science 209 - Fall 2018 Linear Regression Florian Hollenbach](https://reader034.vdocuments.site/reader034/viewer/2022050413/5f89ef16de94454ebc7a7b30/html5/thumbnails/28.jpg)
Linear Regression and RCTs
• y = α + β * treatment + ε
• β = average treatment effect
• The two predicted values are the average outcome under eachcondition
• β: Predicted change in Y caused by increase of T by 1
Remember, generally regression coefficents are not to beinterpreted as causal effects!
Florian Hollenbach 23
![Page 29: Political Science 209 - Fall 2018 - Linear Regressionfhollenbach.org/Polisci209_2018/slides/week8/week8.pdf · Political Science 209 - Fall 2018 Linear Regression Florian Hollenbach](https://reader034.vdocuments.site/reader034/viewer/2022050413/5f89ef16de94454ebc7a7b30/html5/thumbnails/29.jpg)
Linear Regression and RCTs
• y = α + β * treatment + ε
• β = average treatment effect
• The two predicted values are the average outcome under eachcondition
• β: Predicted change in Y caused by increase of T by 1
Remember, generally regression coefficents are not to beinterpreted as causal effects!
Florian Hollenbach 23
![Page 30: Political Science 209 - Fall 2018 - Linear Regressionfhollenbach.org/Polisci209_2018/slides/week8/week8.pdf · Political Science 209 - Fall 2018 Linear Regression Florian Hollenbach](https://reader034.vdocuments.site/reader034/viewer/2022050413/5f89ef16de94454ebc7a7b30/html5/thumbnails/30.jpg)
Race and Job Applications
resume <- read.csv("resume.csv")head(resume)
firstname sex race call1 Allison female white 02 Kristen female white 03 Lakisha female black 04 Latonya female black 05 Carrie female white 06 Jay male white 0
• Randomized “race” in job applications• What is the effect of race on likelyhood of callback?
Marianne Bertrand and Sendhil Mullainathan (American Economic Review 2004)
Florian Hollenbach 24
![Page 31: Political Science 209 - Fall 2018 - Linear Regressionfhollenbach.org/Polisci209_2018/slides/week8/week8.pdf · Political Science 209 - Fall 2018 Linear Regression Florian Hollenbach](https://reader034.vdocuments.site/reader034/viewer/2022050413/5f89ef16de94454ebc7a7b30/html5/thumbnails/31.jpg)
Race and Job Applications
mean(resume$call[resume$race == "black"])mean(resume$call[resume$race == "white"])mean(resume$call[resume$race == "black"]) - mean(resume$call[resume$race == "white"])
[1] 0.06447639
[1] 0.09650924
[1] -0.03203285
Florian Hollenbach 25
![Page 32: Political Science 209 - Fall 2018 - Linear Regressionfhollenbach.org/Polisci209_2018/slides/week8/week8.pdf · Political Science 209 - Fall 2018 Linear Regression Florian Hollenbach](https://reader034.vdocuments.site/reader034/viewer/2022050413/5f89ef16de94454ebc7a7b30/html5/thumbnails/32.jpg)
Race and Job Applications
linear <- lm(call ~ race, data = resume)coef(linear)
(Intercept) racewhite0.06447639 0.03203285
R automatically turns the factor into a dummy (binary) variable
• α is the intercept, when X = 0 (i.e. race is “black”)• β is change in when X is set to 1 (i.e. race is “white”)
Florian Hollenbach 26
![Page 33: Political Science 209 - Fall 2018 - Linear Regressionfhollenbach.org/Polisci209_2018/slides/week8/week8.pdf · Political Science 209 - Fall 2018 Linear Regression Florian Hollenbach](https://reader034.vdocuments.site/reader034/viewer/2022050413/5f89ef16de94454ebc7a7b30/html5/thumbnails/33.jpg)
Race and Job Applications
linear <- lm(call ~ race, data = resume)coef(linear)
(Intercept) racewhite0.06447639 0.03203285
R automatically turns the factor into a dummy (binary) variable
• α is the intercept, when X = 0 (i.e. race is “black”)• β is change in when X is set to 1 (i.e. race is “white”)
Florian Hollenbach 26
![Page 34: Political Science 209 - Fall 2018 - Linear Regressionfhollenbach.org/Polisci209_2018/slides/week8/week8.pdf · Political Science 209 - Fall 2018 Linear Regression Florian Hollenbach](https://reader034.vdocuments.site/reader034/viewer/2022050413/5f89ef16de94454ebc7a7b30/html5/thumbnails/34.jpg)
Linear Regression with multiple independent variables
Y = α+ β1X1 + β2X2 + · · ·+ βpXp + ε
• principle of regression model stays the same
• we attempt to draw the best fitting line through a cloud ofpoints (now in multiple dimensions)
Florian Hollenbach 27
![Page 35: Political Science 209 - Fall 2018 - Linear Regressionfhollenbach.org/Polisci209_2018/slides/week8/week8.pdf · Political Science 209 - Fall 2018 Linear Regression Florian Hollenbach](https://reader034.vdocuments.site/reader034/viewer/2022050413/5f89ef16de94454ebc7a7b30/html5/thumbnails/35.jpg)
Linear Regression with multiple independent variables
We still minimize the sum of the squared residuals:
SSR =n∑
i=1
ε2i
Florian Hollenbach 28
![Page 36: Political Science 209 - Fall 2018 - Linear Regressionfhollenbach.org/Polisci209_2018/slides/week8/week8.pdf · Political Science 209 - Fall 2018 Linear Regression Florian Hollenbach](https://reader034.vdocuments.site/reader034/viewer/2022050413/5f89ef16de94454ebc7a7b30/html5/thumbnails/36.jpg)
Linear Regression with multiple independent variables
We still minimize the sum of the squared residuals:
SSR =n∑
i=1
ε2i =n∑
i=1
(Yi − Y )2
Florian Hollenbach 29
![Page 37: Political Science 209 - Fall 2018 - Linear Regressionfhollenbach.org/Polisci209_2018/slides/week8/week8.pdf · Political Science 209 - Fall 2018 Linear Regression Florian Hollenbach](https://reader034.vdocuments.site/reader034/viewer/2022050413/5f89ef16de94454ebc7a7b30/html5/thumbnails/37.jpg)
Linear Regression with multiple independent variables
We still minimize the sum of the squared residuals:
SSR =n∑
i=1
ε2i =n∑
i=1
(Yi − Y )2
And thus:
SSR =∑n
i=1(Yi − (α+ β1Xi1 + β2Xi2 + · · ·+ βpXip))2
Florian Hollenbach 30
![Page 38: Political Science 209 - Fall 2018 - Linear Regressionfhollenbach.org/Polisci209_2018/slides/week8/week8.pdf · Political Science 209 - Fall 2018 Linear Regression Florian Hollenbach](https://reader034.vdocuments.site/reader034/viewer/2022050413/5f89ef16de94454ebc7a7b30/html5/thumbnails/38.jpg)
Linear Regression with multiple independent variables
Interpretation:
• α: Intercept or y when all Xp = 0
• βp: Slope of predictor Xp
• βp: Predicted change in Y when Xp increases by 1 and allother predictors are held constant!
Florian Hollenbach 31
![Page 39: Political Science 209 - Fall 2018 - Linear Regressionfhollenbach.org/Polisci209_2018/slides/week8/week8.pdf · Political Science 209 - Fall 2018 Linear Regression Florian Hollenbach](https://reader034.vdocuments.site/reader034/viewer/2022050413/5f89ef16de94454ebc7a7b30/html5/thumbnails/39.jpg)
Linear Regression with multiple independent variables
Interpretation:
• α: Intercept or y when all Xp = 0
• βp: Slope of predictor Xp
• βp: Predicted change in Y when Xp increases by 1 and allother predictors are held constant!
Florian Hollenbach 31
![Page 40: Political Science 209 - Fall 2018 - Linear Regressionfhollenbach.org/Polisci209_2018/slides/week8/week8.pdf · Political Science 209 - Fall 2018 Linear Regression Florian Hollenbach](https://reader034.vdocuments.site/reader034/viewer/2022050413/5f89ef16de94454ebc7a7b30/html5/thumbnails/40.jpg)
Linear Regression with multiple independent variables
Interpretation:
• α: Intercept or y when all Xp = 0
• βp: Slope of predictor Xp
• βp: Predicted change in Y when Xp increases by 1 and allother predictors are held constant!
Florian Hollenbach 31
![Page 41: Political Science 209 - Fall 2018 - Linear Regressionfhollenbach.org/Polisci209_2018/slides/week8/week8.pdf · Political Science 209 - Fall 2018 Linear Regression Florian Hollenbach](https://reader034.vdocuments.site/reader034/viewer/2022050413/5f89ef16de94454ebc7a7b30/html5/thumbnails/41.jpg)
Linear Regression with multiple independent variables
• βp: Predicted change in Y when Xp increases by 1 and allother predictors are held constant!
• we can use the multiple regression to control for confounders
• impact of each individual predictor when the other predictorsdo not change
• Example: Association between income and child mortalitywhen regime type is not changing
Florian Hollenbach 32
![Page 42: Political Science 209 - Fall 2018 - Linear Regressionfhollenbach.org/Polisci209_2018/slides/week8/week8.pdf · Political Science 209 - Fall 2018 Linear Regression Florian Hollenbach](https://reader034.vdocuments.site/reader034/viewer/2022050413/5f89ef16de94454ebc7a7b30/html5/thumbnails/42.jpg)
Linear Regression with multiple independent variables
• βp: Predicted change in Y when Xp increases by 1 and allother predictors are held constant!
• we can use the multiple regression to control for confounders
• impact of each individual predictor when the other predictorsdo not change
• Example: Association between income and child mortalitywhen regime type is not changing
Florian Hollenbach 32
![Page 43: Political Science 209 - Fall 2018 - Linear Regressionfhollenbach.org/Polisci209_2018/slides/week8/week8.pdf · Political Science 209 - Fall 2018 Linear Regression Florian Hollenbach](https://reader034.vdocuments.site/reader034/viewer/2022050413/5f89ef16de94454ebc7a7b30/html5/thumbnails/43.jpg)
Linear Regression with multiple independent variables in R
result <- lm(y ~ x1 + x2 + x3 + x4, data = data)coef(result)
Florian Hollenbach 33
![Page 44: Political Science 209 - Fall 2018 - Linear Regressionfhollenbach.org/Polisci209_2018/slides/week8/week8.pdf · Political Science 209 - Fall 2018 Linear Regression Florian Hollenbach](https://reader034.vdocuments.site/reader034/viewer/2022050413/5f89ef16de94454ebc7a7b30/html5/thumbnails/44.jpg)
Linear Regression with multiple independent variables in R
data <- read.csv("bivariate_data.csv")data2010 <- subset(data, Year == 2010)bivar <- lm(Child.Mortality ~ log(GDP), data = data)coef(bivar)summary(bivar)$r.squared
Florian Hollenbach 34
![Page 45: Political Science 209 - Fall 2018 - Linear Regressionfhollenbach.org/Polisci209_2018/slides/week8/week8.pdf · Political Science 209 - Fall 2018 Linear Regression Florian Hollenbach](https://reader034.vdocuments.site/reader034/viewer/2022050413/5f89ef16de94454ebc7a7b30/html5/thumbnails/45.jpg)
Linear Regression with multiple independent variables in R
data <- read.csv("bivariate_data.csv")data2010 <- subset(data, Year == 2010)bivar <- lm(Child.Mortality ~ log(GDP), data = data2010)coef(bivar)summary(bivar)$r.squared
(Intercept) log(GDP)276.58162 -26.12717
[1] 0.586953
Florian Hollenbach 35
![Page 46: Political Science 209 - Fall 2018 - Linear Regressionfhollenbach.org/Polisci209_2018/slides/week8/week8.pdf · Political Science 209 - Fall 2018 Linear Regression Florian Hollenbach](https://reader034.vdocuments.site/reader034/viewer/2022050413/5f89ef16de94454ebc7a7b30/html5/thumbnails/46.jpg)
Linear Regression with multiple independent variables in R
data <- read.csv("bivariate_data.csv")data2010 <- subset(data, Year == 2010)multiple <- lm(Child.Mortality ~ log(GDP) + PolityIV, data = data2010)coef(multiple)summary(multiple)$r.squared
Florian Hollenbach 36
![Page 47: Political Science 209 - Fall 2018 - Linear Regressionfhollenbach.org/Polisci209_2018/slides/week8/week8.pdf · Political Science 209 - Fall 2018 Linear Regression Florian Hollenbach](https://reader034.vdocuments.site/reader034/viewer/2022050413/5f89ef16de94454ebc7a7b30/html5/thumbnails/47.jpg)
Linear Regression with multiple independent variables in R
data <- read.csv("bivariate_data.csv")data2010 <- subset(data, Year == 2010)multiple <- lm(Child.Mortality ~ log(GDP) + PolityIV, data = data2010)coef(multiple)summary(multiple)$r.squared
(Intercept) log(GDP) PolityIV277.845620 -25.641789 -1.029062
[1] 0.6113747
Florian Hollenbach 37
![Page 48: Political Science 209 - Fall 2018 - Linear Regressionfhollenbach.org/Polisci209_2018/slides/week8/week8.pdf · Political Science 209 - Fall 2018 Linear Regression Florian Hollenbach](https://reader034.vdocuments.site/reader034/viewer/2022050413/5f89ef16de94454ebc7a7b30/html5/thumbnails/48.jpg)
Linear Regression with multiple independent variables in R
data <- read.csv("bivariate_data.csv")data2010 <- subset(data, Year == 2010)multiple <- lm(Child.Mortality ~ log(GDP) + PolityIV, data = data2010)coef(multiple)coef(bivar)
(Intercept) log(GDP) PolityIV277.845620 -25.641789 -1.029062
(Intercept) log(GDP)276.58162 -26.12717
Florian Hollenbach 38
![Page 49: Political Science 209 - Fall 2018 - Linear Regressionfhollenbach.org/Polisci209_2018/slides/week8/week8.pdf · Political Science 209 - Fall 2018 Linear Regression Florian Hollenbach](https://reader034.vdocuments.site/reader034/viewer/2022050413/5f89ef16de94454ebc7a7b30/html5/thumbnails/49.jpg)
Linear Regression with multiple independent variables in R
• In multiple regression models we want to adjust the goodnessof fit statistic by the number of variables included
• This is done via the degrees of freedom (DF) adjustment:
adjustedR2 = 1− SSR/(n − p − 1)TSS/(n − 1)
Florian Hollenbach 39
![Page 50: Political Science 209 - Fall 2018 - Linear Regressionfhollenbach.org/Polisci209_2018/slides/week8/week8.pdf · Political Science 209 - Fall 2018 Linear Regression Florian Hollenbach](https://reader034.vdocuments.site/reader034/viewer/2022050413/5f89ef16de94454ebc7a7b30/html5/thumbnails/50.jpg)
Linear Regression with multiple independent variables in R
data <- read.csv("bivariate_data.csv")data2010 <- subset(data, Year == 2010)multiple <- lm(Child.Mortality ~ log(GDP) + PolityIV, data = data2010)coef(multiple)summary(multiple)$r.squaredsummary(multiple)$adj.r.squared
Florian Hollenbach 40
![Page 51: Political Science 209 - Fall 2018 - Linear Regressionfhollenbach.org/Polisci209_2018/slides/week8/week8.pdf · Political Science 209 - Fall 2018 Linear Regression Florian Hollenbach](https://reader034.vdocuments.site/reader034/viewer/2022050413/5f89ef16de94454ebc7a7b30/html5/thumbnails/51.jpg)
Linear Regression with multiple independent variables in R
data <- read.csv("bivariate_data.csv")data2010 <- subset(data, Year == 2010)multiple <- lm(Child.Mortality ~ log(GDP) + PolityIV, data = data2010)coef(multiple)summary(multiple)$r.squaredsummary(multiple)$adj.r.squared
(Intercept) log(GDP) PolityIV277.845620 -25.641789 -1.029062
[1] 0.6113747
[1] 0.6061582
Florian Hollenbach 41