poli30 session14 2008
TRANSCRIPT
INTERPRETING REGRESSION COEFFICIENTS
OUTLINE
1. Back to Basics
2. Form: The Regression Equation
3. Strength: PRE and r²
4. The Correlation Coefficient r
5. Significance: Looking Ahead
6. Example 1: Democracy in Latin America
7. Example 2: Wine Consumption and Heart Disease
BACK TO BASIC CONCEPTS
PRE = (E1 – E2)/E1 = 1 – E2/E1
E1 = Σ(Yi – Ȳ)²
that is, sum of squared differences between observed values of Y and the mean of Y
Rule for “predicting” values of Y, given knowledge of X:
Ŷi = a + bXi
E2 = Σ(Yi – Ŷi)²
that is, sum of squared differences between observed values of Y and predicted values of Y (values of Y as “predicted” by the regression equation)
Thus the elements of PRE.
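The pieces above can be put together in a few lines of code. This is a minimal sketch, assuming a small made-up data set (the values of xs and ys are hypothetical, not from the lecture) and the usual least-squares formulas for a and b, which the slides do not spell out.

```python
# Hypothetical toy data, for illustration only (not from the lecture).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

def fit_line(xs, ys):
    """Least-squares intercept a and slope b for Y-hat = a + b*X."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b

def pre(xs, ys):
    """Proportional reduction in error: (E1 - E2) / E1."""
    a, b = fit_line(xs, ys)
    y_bar = sum(ys) / len(ys)
    # E1: squared errors when every Y is "predicted" by the mean of Y.
    e1 = sum((y - y_bar) ** 2 for y in ys)
    # E2: squared errors when Y is predicted by the regression equation.
    e2 = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return (e1 - e2) / e1

a, b = fit_line(xs, ys)
pre_value = pre(xs, ys)
print(a, b, pre_value)
```

For this toy data E1 = 6 and E2 = 2.4, so PRE works out to 0.6: knowing X removes 60 percent of the prediction error.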
STRENGTH OF ASSOCIATION
Symbol = r² = PRE = (E1 – E2)/E1
= (total variance – unexplained variance)/total variance
Varies from 0 to 1
Some back-of-the-envelope thresholds:
0.10, 0.30, 0.50+
FOCUSING ON FORM
As given by equation Ŷi = a + bXi
Constant a = intercept = predicted value of Y when X = 0
Coefficient b = slope = average change in Y for a one-unit change in X
• Magnitude (large or small)
• Sign (positive or negative)
• Key to much interpretation
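A quick way to see what a and b mean is to plug values into the equation. The sketch below uses made-up coefficients (a = 2.0 and b = 0.5 are hypothetical, chosen only for illustration).

```python
# Hypothetical coefficients, for illustration only (not from the lecture).
a = 2.0   # intercept
b = 0.5   # slope

def predict(x, a=a, b=b):
    """Predicted value of Y from the equation Y-hat = a + b*X."""
    return a + b * x

at_zero = predict(0)                   # equals the intercept a
per_unit = predict(11) - predict(10)   # equals the slope b
per_ten = predict(20) - predict(10)    # equals 10 * b
print(at_zero, per_unit, per_ten)
```

The sign of b gives the direction of the change; its magnitude only makes sense in the units of X and Y.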
Linear Regression Equation
THE CORRELATION COEFFICIENT
Symbol = r
Summary statement of form (from sign) and indirect statement of strength
r = ±√r² (taking the sign of slope b), varies from –1 to +1
subject to over-interpretation
useful for preliminary assessment of association
Symmetrical no matter which variable is X and which is Y (note: slope b is not symmetrical)
ON THE CORRELATION COEFFICIENT r
Analogous to slope b (with removal of intercept a)
The “standardized regression coefficient,” or beta weight:
β = b × (std. dev. of X / std. dev. of Y)
employs slope, values, and dispersion of variables
thus a “standardized” slope
Question: How much action on Y do you get from X?
In bivariate (or “simple”) regression, β = r
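The claim that β = r in the bivariate case, and that r is symmetrical while b is not, can be checked numerically. A minimal sketch, assuming hypothetical toy data and helper functions written for this illustration (none of them come from the lecture):

```python
import math

# Hypothetical toy data, for illustration only (not from the lecture).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

def mean(values):
    return sum(values) / len(values)

def sum_cross(us, vs):
    """Sum of cross-products of deviations from the means."""
    u_bar, v_bar = mean(us), mean(vs)
    return sum((u - u_bar) * (v - v_bar) for u, v in zip(us, vs))

def slope(xs, ys):
    """Slope b from regressing ys on xs."""
    return sum_cross(xs, ys) / sum_cross(xs, xs)

def std_dev(values):
    """Sample standard deviation (n - 1 in the denominator)."""
    return math.sqrt(sum_cross(values, values) / (len(values) - 1))

def pearson_r(xs, ys):
    """Correlation coefficient r."""
    return sum_cross(xs, ys) / math.sqrt(sum_cross(xs, xs) * sum_cross(ys, ys))

b_y_on_x = slope(xs, ys)
b_x_on_y = slope(ys, xs)
beta = b_y_on_x * std_dev(xs) / std_dev(ys)
r = pearson_r(xs, ys)
print(b_y_on_x, b_x_on_y, beta, r)
```

With this data the slope is 0.6 when Y is regressed on X and 1.0 when X is regressed on Y, while r is about 0.775 either way and equals β.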
LOOKING AHEAD: MEASURING SIGNIFICANCE
1. Testing the null hypothesis:
F = r²(n – 2)/(1 – r²)
2. Standard errors and confidence intervals:
Dependent on desired significance level
Bands around the regression line
95% confidence interval = estimate ± 1.96 × SE
Figure 1. Cycles of Political Change in Latin America, 1900-2000
[Figure: number of countries (vertical axis, 0-19) by year (horizontal axis, 1900-2000), plotted for three regime types: Oligarchy, Semi-Democracy, Democracy]
Coefficients for Regression of N Electoral Democracies (Y) on Change Over Time (X):
a = -1.427
b = +.126
r = + .883
r² = .780, Adjusted r² = .777
Standard error of slope = .0067
95% confidence interval for slope = ± (.0067 × 1.96) = ± .013, setting confidence bands at .113 and .140
F for equation = 350.91, p < 0.001
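The reported F and confidence bands can be roughly reproduced from the other coefficients. This is a sketch under one stated assumption: n = 101, that is, one observation per year from 1900 through 2000. The slides do not give n, so treat that value as a guess.

```python
# Values reported on the slide.
r_squared = 0.780
slope = 0.126
se_slope = 0.0067

# ASSUMPTION: one observation per year, 1900-2000 inclusive (not stated on the slide).
n = 101

# F = r^2 (n - 2) / (1 - r^2)
f_stat = r_squared * (n - 2) / (1 - r_squared)

# 95% confidence interval for the slope: b +/- 1.96 * SE
margin = 1.96 * se_slope
lower = slope - margin
upper = slope + margin

print(round(f_stat, 2), round(margin, 4), round(lower, 3), round(upper, 3))
```

Under that assumption F comes out near 351, close to the reported 350.91; the small gap is what rounding r² to three decimals would produce. The margin is about .013, giving bands of roughly .113 and .139; the slide's .140 presumably reflects an unrounded slope.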
Scatterplot: N Democracies by Year
Interpreting the Equation
• N democracies = –1.427 + .126 × year
• intercept = nonsense, but allows calculation of the year in which the predicted value of Y would be zero, in this case 1910
• slope = +.126, so one additional democracy every eight years
• and by 2000, a total of 11-12 democracies
• PRE = r² = .780 (adjusted r² = .777)
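The arithmetic behind these statements is short enough to check directly. A minimal sketch, assuming X is coded as years elapsed since about 1900 (the slide does not state the coding, so the mapping from X back to calendar years is approximate):

```python
# Coefficients reported on the slide.
a = -1.427
b = 0.126

def predicted_democracies(x):
    """Predicted N democracies, with x = years elapsed (ASSUMED coding, see lead-in)."""
    return a + b * x

# Value of X at which the predicted number of democracies is zero: 0 = a + b*x
x_at_zero = -a / b

# Years needed for the prediction to rise by one democracy.
years_per_democracy = 1 / b

# Prediction at the end of the century, taking x = 100 for the year 2000.
n_in_2000 = predicted_democracies(100)

print(round(x_at_zero, 1), round(years_per_democracy, 1), round(n_in_2000, 1))
```

The zero point falls about 11 years into the series, which is 1910 or 1911 depending on whether 1900 is coded as 1 or 0; the slope implies one more democracy roughly every 7.9 years; and the prediction for 2000 is a little over 11.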
Example 2: Wine and Heart Disease
Data in Lectures 5-6
X = per capita annual consumption of alcohol from wine, in liters
Y = deaths from heart disease, per 100,000 people
Equation:
Ŷ = 260.6 - 22.97 X
r = - 0.843
What’s the interpretation?
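One way to start on the question is to work out what the coefficients imply numerically. The sketch below uses only the equation and r given above; the consumption levels plugged in (0, 2, and 4 liters) are hypothetical inputs chosen for illustration, not values from the data.

```python
# Coefficients reported on the slide.
a = 260.6     # intercept: predicted deaths per 100,000 when X = 0
b = -22.97    # slope: change in deaths per 100,000 per additional liter
r = -0.843

def predicted_deaths(liters):
    """Predicted heart disease deaths per 100,000 from Y-hat = a + b*X."""
    return a + b * liters

# Strength of association: r squared = PRE.
r_squared = r ** 2

# Hypothetical consumption levels, for illustration only.
at_0 = predicted_deaths(0)
at_2 = predicted_deaths(2)
at_4 = predicted_deaths(4)

print(round(r_squared, 3), round(at_0, 1), round(at_2, 2), round(at_4, 2))
```

Read this way, each additional liter of wine alcohol per person goes with about 23 fewer predicted heart disease deaths per 100,000, and r² of about .71 means knowing X removes roughly 71 percent of the prediction error in Y.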