Statistics and Quantitative Analysis U4320 Lecture 13: Explaining Variation Prof. Sharyn O’Halloran


Page 1: Statistics and Quantitative Analysis U4320

Statistics and Quantitative Analysis U4320

Lecture 13: Explaining Variation
Prof. Sharyn O'Halloran

Page 2: Statistics and Quantitative Analysis U4320

Explaining Variation: R2

- Breaking Down the Distances
  - Let's go back to the basics of regression analysis.
  - How well does the predicted line explain the variation in the dependent variable, money spent?

[Scatter plot: Income (X) vs. Money Spent on Health Care (Y, in thousands of $), with the fitted line Ŷ = a + bX]

Page 3: Statistics and Quantitative Analysis U4320

Explaining Variation: R2

- Total Variation

[Scatter plot: Income (X) vs. Money Spent on Health Care (Y, in thousands of $), showing a point (x, y), the mean Ȳ, and the fitted line Ŷ = a + bX]

Y − Ȳ = total deviation around Ȳ
Ŷ − Ȳ = deviation explained by regression
Y − Ŷ = deviation unexplained by regression

Page 4: Statistics and Quantitative Analysis U4320

Explaining Variation: R2

- Equation for Total Deviation
- Definition
  - The total distance from any point Y to Ȳ is the sum of the distance from Y to the regression line plus the distance from the regression line to Ȳ.

Y − Ȳ = (Y − Ŷ) + (Ŷ − Ȳ)

Total Deviation = Explained Deviation + Unexplained Deviation

Page 5: Statistics and Quantitative Analysis U4320

Explaining Variation: R2

- Sums of Squares
  - We can sum this equation across all the Y's and square both sides to get:

Σ(Y − Ȳ)² = Σ[(Ŷ − Ȳ) + (Y − Ŷ)]²
          = Σ(Ŷ − Ȳ)² + 2Σ(Ŷ − Ȳ)(Y − Ŷ) + Σ(Y − Ŷ)²
          = Σ(Ŷ − Ȳ)² + Σ(Y − Ŷ)²

Total Sum of Squares = Regression Sum of Squares + Error Sum of Squares

  - The cross-product term 2Σ(Ŷ − Ȳ)(Y − Ŷ) falls out (equals 0) because of our independence assumption.

Page 6: Statistics and Quantitative Analysis U4320

Explaining Variation: R2

- Total Sum of Squares (SST)
  - The term on the left-hand side of this equation is the sum of the squared distances from all points to Ȳ.
  - We call this the total variation in the Y's, or the Total Sum of Squares, or SST.
- Regression Sum of Squares (SSR)
  - The first term on the right-hand side is the sum of the squared distances from the regression line to Ȳ.
  - We call this the Regression Sum of Squares, or SSR.

Page 7: Statistics and Quantitative Analysis U4320

Explaining Variation: R2

- Error Sum of Squares
  - Finally, the last term is the sum of the squared distances from the points to the regression line.
  - Remember, this is the quantity that least squares minimizes.
  - We call it the Error Sum of Squares, or SSE.
- We can rewrite the previous equation as:

SST = SSR + SSE
(Total Deviation = Explained Deviation + Unexplained Deviation)

Page 8: Statistics and Quantitative Analysis U4320

Explaining Variation: R2

- Definition of R²
  - We can use these new terms to determine how much variation is explained by the regression line.
  - Variation refers to the amount of vertical deviation in the dependent variable Y.
- Multiple Coefficient of Determination:

R² = Explained Variance / Total Variance

  - Measures how well the model fits the data.

Page 9: Statistics and Quantitative Analysis U4320

Explaining Variation: R2

- Case 1:
  - If the points are perfectly linear, then the Error Sum of Squares is 0.
  - Here, SSR = SST. The variance in the Y's is completely explained by the regression line.

[Scatter plot: Income (X) vs. Money Spent on Health Care (Y, in thousands of $); all points lie exactly on the fitted line Ŷ = a + bX]

Page 10: Statistics and Quantitative Analysis U4320

Explaining Variation: R2

- Case 2:
  - If there is no relation between X and Y, then SSR is 0 and SSE = SST.
  - The regression line explains none of the variance around the mean Ȳ.

[Scatter plot: Income (X) vs. Money Spent on Health Care (Y, in thousands of $); points scattered with no pattern, so the fitted line Ŷ = a + bX lies flat at the mean Ȳ]

Page 11: Statistics and Quantitative Analysis U4320

Explaining Variation: R2

- Formula
  - So we can construct a useful statistic.
  - Take the ratio of the Regression Sum of Squares to the Total Sum of Squares:

R² = SSR / SST

  - We call this statistic R².
  - It represents the percent of the variation in Y explained by the regression.
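A minimal numerical sketch of this ratio in Python, using made-up income and spending values in place of the data behind the scatter plots; it fits the least-squares line, checks SST = SSR + SSE, and forms R² = SSR/SST.

import numpy as np

# Hypothetical data standing in for the income / health-care scatter plot
x = np.array([10, 20, 30, 40, 50, 60, 70], dtype=float)   # income
y = np.array([4.2, 4.8, 5.9, 6.1, 6.8, 7.4, 7.9])         # money spent

b, a = np.polyfit(x, y, 1)            # least-squares slope and intercept
y_hat = a + b * x                     # fitted values (Y-hat)
y_bar = y.mean()                      # mean of Y (Y-bar)

sst = np.sum((y - y_bar) ** 2)        # Total Sum of Squares
ssr = np.sum((y_hat - y_bar) ** 2)    # Regression (explained) Sum of Squares
sse = np.sum((y - y_hat) ** 2)        # Error (unexplained) Sum of Squares

print(np.isclose(sst, ssr + sse))     # True: SST = SSR + SSE
print("R^2 =", ssr / sst)             # share of the variation in Y explained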

Page 12: Statistics and Quantitative Analysis U4320

Explaining Variation: R2

- Properties
  - R² is always between 0 and 1.
  - For a perfectly straight line it's 1, which is perfect correlation.
  - For data with little relation, it's near 0.
  - R² measures the explanatory power of a model.
  - The more of the variance in Y you can explain, the more powerful your model.

Page 13: Statistics and Quantitative Analysis U4320

Explaining Variation: R2

- Example
  - Why do people have confidence in what they see on TV?
- Dependent variable
  - TRUSTTV = 1 if the individual has a lot of confidence; 2 if some confidence; 3 if no confidence.
- Independent variables
  - TUBETIME = number of hours of TV watched per week
  - SKOOL = years of education
  - LIKEJPAN = feelings towards Japan
  - YELOWSTN = attitudes about whether the US should spend more on national parks
  - MYSIGN = the respondent's astrological sign

Page 14: Statistics and Quantitative Analysis U4320

Explaining Variation: R2

- Descriptive Statistics
  - Correlation matrix
- Multicollinearity
  - When variables are highly correlated, it becomes difficult to untangle their separate effects.
  - That is, the variables do not explain any additional variance in the dependent variable.
  - Technically, the equation becomes indeterminate.
  - Usually, if the correlation between two variables is above 0.6, we use only one of the indicators.

Correlation Matrix

            TRUSTTV  TUBETIME  SKOOL   LIKEJPAN  YELOWSTN  MYSIGN
TRUSTTV      1       -0.177     0.112   0.043     0.003    -0.038
TUBETIME    -0.177    1        -0.272   0.08     -0.137     0.053
SKOOL        0.112   -0.272     1      -0.072    -0.016     0.012
LIKEJPAN     0.043    0.08     -0.072   1         0.04     -0.001
YELOWSTN     0.003   -0.137    -0.016   0.04      1        -0.02
MYSIGN      -0.038    0.053     0.012  -0.001    -0.02      1
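As a sketch of how the correlation matrix and the 0.6 screening rule could be produced in code, assuming (hypothetically) that the survey responses sit in a pandas DataFrame df with these six columns:

import pandas as pd

def flag_collinear(df: pd.DataFrame, threshold: float = 0.6):
    """Return pairs of columns whose absolute correlation exceeds threshold."""
    corr = df.corr()                          # pairwise Pearson correlations
    cols = corr.columns
    pairs = []
    for i in range(len(cols)):
        for j in range(i + 1, len(cols)):
            if abs(corr.iloc[i, j]) > threshold:
                pairs.append((cols[i], cols[j], round(corr.iloc[i, j], 3)))
    return pairs

# Usage with the (assumed) survey DataFrame:
# print(df.corr().round(3))        # reproduces the matrix above
# print(flag_collinear(df))        # empty here: no pair exceeds 0.6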

Page 15: Statistics and Quantitative Analysis U4320

Explaining Variation: R2

- Estimated Model
  - TRUSTTV = 2.34 − 0.0539 (TUBETIME)
- How do we calculate the Total Sum of Squares?
  - SST = SSR + SSE = 5.91 + 183.61 = 189.51

Analysis of Variance
             DF      Sum of Squares   Mean Square
Regression     1.00        5.91            5.91
Residual     468.00      183.62            0.39

- Calculate R²:

R² = SSR / SST = 5.91 / 189.51 = .031

Page 16: Statistics and Quantitative Analysis U4320

Explaining Variation: R2

- Each of the four different models has an associated R².
  - Model #2: R² = 6.74 / 189.51 = 0.035
  - Model #3: R² = 7.45 / 189.51 = 0.039
  - Model #4: R² = 7.70 / 189.51 = 0.040

Page 17: Statistics and Quantitative Analysis U4320

Explaining Variation: R2

- Use It, Don't Abuse It
- Using R² in Practice
  - Useful tool
  - Measure of explained variance
  - Not a statistical test
  - Don't obsess about R²
- In the Example…
  - You can always improve R² by adding variables.
  - You'll notice that the R² increases every time.
  - No matter what variables you add, R² never goes down.

Page 18: Statistics and Quantitative Analysis U4320

Explaining Variation: Adjusted R² (cont.)

- Definition of Adjusted R²
  - We'd like a measure like R², but one that takes into account the fact that adding extra variables always increases your explanatory power.
  - The statistic we use for this is called the Adjusted R², and its formula is:

Adjusted R² = 1 − [(n − 1) / (n − k)] (1 − R²)

where n = number of observations, and k = number of independent variables (including the constant).

  - The Adjusted R² can actually fall if the variable you add doesn't explain much of the variance.
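A minimal sketch of this formula in code; the usage line plugs in the numbers from the Equation 2 printout on the following slides (n = 470, k = 3, R² = 0.03555), so it should reproduce the reported Adjusted R Square of about .031.

def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R^2 = 1 - ((n - 1) / (n - k)) * (1 - R^2)."""
    return 1 - ((n - 1) / (n - k)) * (1 - r2)

# n = 470 observations, k = 3 (constant + SKOOL + TUBETIME), R^2 = 0.03555
print(round(adjusted_r2(0.03555, 470, 3), 4))   # about 0.0314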

Page 19: Statistics and Quantitative Analysis U4320

Explaining Variation: Adjusted R2 (cont)

- Back to the Example
  - Comparing Adjusted R²
    - Model 1: .029
    - Model 2: .031
    - Model 3: .033
    - Model 4: .030
- Interpretation
  - You can see that the adjusted R² rises from equation 1 to equation 2, and from equation 2 to equation 3.
  - But then it falls from equation 3 to 4, when we add in the variables for national parks and the zodiac.

Page 20: Statistics and Quantitative Analysis U4320

Explaining Variation: Adjusted R2 (cont)

- Example: Equation 2

Multiple R           0.18856
R Square             0.03555
Adjusted R Square    0.03142
Standard Error       0.62562

Analysis of Variance
             DF    Sum of Squares   Mean Square
Regression     2        6.73848         3.36924
Residual     467      182.785           0.3914

F = 8.60813    Signif F = 0.0002

Variable     B        SE B      Beta        T        Sig T
SKOOL        0.016    0.01064   0.068897    1.459    0.1453
TUBETIME    -0.048    0.01444  -0.157772   -3.341    0.0009
(Constant)   2.128    0.15826              13.446    0.0000

- We calculate:

Adjusted R² = 1 − [(470 − 1) / (470 − 3)] (1 − 0.03555) = .0314

Page 21: Statistics and Quantitative Analysis U4320

Explaining Variation: Adjusted R2 (cont)

- Stepwise Regression
  - One strategy for model building is to add variables only if they increase your adjusted R².
  - This technique is called stepwise regression.
  - However, I don't want to emphasize this approach too strongly.
  - Just as people can fixate on R², they can fixate on adjusted R².
  - If you have a theory that suggests certain variables are important for your analysis, then include them whether or not they increase the adjusted R².
  - Negative findings can be important!

Page 22: Statistics and Quantitative Analysis U4320

Comparing Models: F-Tests

- When to use an F-Test?
  - Say you add a number of variables to a regression model and you want to see if, as a group, they are significant in explaining variation in your dependent variable Y.
  - The F-test tells you whether a group of variables, or even an entire model, is jointly significant.
  - This is in contrast to a t-test, which tells you whether an individual coefficient is significantly different from zero.
  - In short: does the specified model explain a significant proportion of the total variation?

Page 23: Statistics and Quantitative Analysis U4320

Comparing Models: F-Tests (cont.)

- Equations
  - To be precise, say our original equation is:
    Model 1: Y = b0 + b1X1 + b2X2
  - We add two more variables, so the new equation is:
    Model 2: Y = b0 + b1X1 + b2X2 + b3X3 + b4X4
  - We want to test the hypothesis that
    H0: β3 = β4 = 0.
  - We want to test the joint hypothesis that X3 and X4 together are not significant factors in determining Y.

Page 24: Statistics and Quantitative Analysis U4320

Comparing Models: F-Tests (cont.)

- Using Adjusted R² First
  - There's an easy way to tell if these two variables are not significant.
  - First, run the regression without X3 and X4 in it, then run the regression with X3 and X4.
  - Now look at the adjusted R²'s for the two regressions.
  - If the adjusted R² went down, then X3 and X4 are not jointly significant.
- So the adjusted R² can serve as a quick test for insignificance.

Page 25: Statistics and Quantitative Analysis U4320

Comparing Models: F-Tests (cont.)

- Calculating an F-Test
  - If the adjusted R² goes up, then you need to do a more complicated test, the F-test.
- Ratio
  - Let regression 1 be the model without X3 and X4, and let regression 2 include X3 and X4.
  - The basic idea of the F statistic, then, is to compute the ratio:

(SSE1 − SSE2) / SSE2

Page 26: Statistics and Quantitative Analysis U4320

Comparing Models: F-Tests (cont.)

- Correction
  - We have to correct for the number of independent variables we add.
  - So the complete statistic is:

F = [(SSE1 − SSE2) / m] / [SSE2 / (n − k)]

where m = number of restrictions (the number of additional variables added to the model), and k = number of independent variables.

  - Remember: k is the total number of independent variables, including the ones that you are testing and the constant.
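A minimal sketch of this statistic as a function; the commented line plugs in the SSE values from the YELOWSTN/MYSIGN example worked out a few slides ahead.

def nested_f(sse1: float, sse2: float, m: int, n: int, k: int) -> float:
    """F = [(SSE1 - SSE2) / m] / [SSE2 / (n - k)] for nested models."""
    return ((sse1 - sse2) / m) / (sse2 / (n - k))

# Example from the following slides: SSE1 = 182.07, SSE2 = 181.82, m = 2, n = 470, k = 6
# print(round(nested_f(182.07, 181.82, 2, 470, 6), 3))   # about 0.319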

Page 27: Statistics and Quantitative Analysis U4320

Comparing Models: F-Tests (cont.)

- Correction (cont.)
  - This equation defines an F-statistic with m and n − k degrees of freedom.
  - We write it like this: F(m, n − k)
  - To get critical values for the F statistic, we use a set of tables, just like for the normal and t-statistics.
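Software can return the same critical values as the printed tables; a small sketch using scipy.stats.f.ppf (this assumes SciPy is available; the degrees of freedom are those of the example that follows).

from scipy.stats import f

m, n, k = 2, 470, 6                 # restrictions, observations, regressors
critical = f.ppf(0.95, m, n - k)    # 5% test -> 95th percentile of F(m, n - k)
print(round(critical, 2))           # roughly 3.0, matching the table value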

Page 28: Statistics and Quantitative Analysis U4320

Comparing Models: F-Tests (cont.)

- Example
  - Adding Extra Variables: Is a group of variables jointly significant?
  - Are the variables YELOWSTN and MYSIGN jointly significant?

Model 1: TRUSTTV = b0 + b1 LIKEJPAN + b2 SKOOL + b3 TUBETIME
Model 2: TRUSTTV = b0 + b1 LIKEJPAN + b2 SKOOL + b3 TUBETIME + b4 MYSIGN + b5 YELOWSTN

Page 29: Statistics and Quantitative Analysis U4320

Comparing Models: F-Tests (cont.)

- Adding Extra Variables (cont.)
- State the null hypothesis:

H0: β4 = β5 = 0

- Calculate the F-statistic
  - Our formula for the F-statistic is:

F = [(SSE1 − SSE2) / m] / [SSE2 / (n − k)]

Page 30: Statistics and Quantitative Analysis U4320

Comparing Models: F-Tests (cont.)

- What is SSE1?
  - The sum of squared errors in the first regression.
- What is SSE2?
  - The sum of squared errors in the second regression.
- m = 2, n = 470, k = 6
- The formula is:

F = [(182.07 − 181.82) / 2] / [181.82 / (470 − 6)] = 0.319

Page 31: Statistics and Quantitative Analysis U4320

Comparing Models: F-Tests (cont.)

- Reject or fail to reject the null hypothesis?
  - The critical value of F(2, 470 − 6) at the 5% level, from the table, is 3.00.
  - Is the F-statistic greater than the critical value?
  - If yes, then we reject the null hypothesis that the variables are not significantly different from zero; otherwise we fail to reject.
  - We fail to reject the null hypothesis because 0.319 < 3.00: YELOWSTN and MYSIGN are not jointly significant.

Page 32: Statistics and Quantitative Analysis U4320

Comparing Models: F-Tests (cont.)

- Testing All Variables: Is the Model Significant?
- Equation 2: Impact of schooling and TV watched (Dependent Variable: TRUSTTV)

Multiple R           0.18856
R Square             0.0356
Adjusted R Square    0.0314
Standard Error       0.6256

             DF    Sum of Squares   Mean Square
Regression     2        6.74            3.37
Residual     467      182.78            0.39

F Statistic = 8.61    Signif F = 2.000E-04

Variable     B        SE B     Beta      T        Sig T
SKOOL        0.016    0.011    0.069     1.459    0.145
TUBETIME    -0.048    0.014   -0.158    -3.341    0.001
(Constant)   2.128    0.158             13.446    0.000

Page 33: Statistics and Quantitative Analysis U4320

Comparing Models: F-Tests (cont.)

- Hypothesis Testing:
- State the hypothesis:

H0: β1 = β2 = 0

- Calculate the test statistic
  - Again, we start with our formula:

F = [(SSE1 − SSE2) / m] / [SSE2 / (n − k)]

Page 34: Statistics and Quantitative Analysis U4320

Comparing Models: F-Tests (cont.)

- Calculate the F-statistic
  - SSE2 = 182.78
  - SSE1 is the sum of squared errors when there are no explanatory variables at all.
  - If there are no explanatory variables, then SSR must be 0. In this case, SSE = SST.
  - So we can substitute SST for SSE1 in our formula: SST = SSR + SSE = 6.738 + 182.78 = 189.52

F = [(189.52 − 182.78) / 2] / [182.78 / (470 − 3)] = 8.61

  - This is the number reported in your printout under the F statistic.
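A quick numerical check of this substitution, using the figures from the Equation 2 printout:

sst  = 6.738 + 182.78        # SSR + SSE, so SSE1 = SST for the no-variables model
sse2 = 182.78                # error sum of squares of the full model
m, n, k = 2, 470, 3          # 2 slopes tested; constant + 2 variables in the model

f_stat = ((sst - sse2) / m) / (sse2 / (n - k))
print(round(f_stat, 2))      # about 8.61, matching the printed F statistic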

Page 35: Statistics and Quantitative Analysis U4320

Comparing Models: F-Tests (cont.)

- Reject or fail to reject the null hypothesis?
  - The critical value of F(2, 470 − 3) at the 5% level, from your table, is 3.00.
  - Since 8.61 > 3.00, this time we can reject the null hypothesis that β1 = β2 = 0.
- Interpretation?
  - The model explains a significant amount of the total variation in how much people trust what is said on TV.

Page 36: Statistics and Quantitative Analysis U4320

Comparing Models: Example

- Study of 78 seventh-grade students in a mid-western school.
- Path Diagram

[Path diagram: IQ → GPA (+) and Gender → GPA (+)]

Page 37: Statistics and Quantitative Analysis U4320

Comparing Models: Example (cont.)

- Variables
  - IQ = student's score on a standard IQ test
  - GPA = student's grade point average
  - Gender = student's gender (1 for male; 0 for female)
- Descriptive Statistics

                   GPA       IQ        Gender
Mean               7.45     108.92      0.60
Standard Error     0.24       1.49      0.06
Mode               9.17     111.00      1.00
Sample Variance    4.41     173.47      0.24
Kurtosis           1.10       0.64     -1.87
Minimum            0.53      72.00      0.00
Sum              580.83    8496.00     47.00

Page 38: Statistics and Quantitative Analysis U4320

Comparing Models: Example (cont.)

- Graphs

[Scatter plot: Relation between IQ and GPA, all students]
[Scatter plot: Relation between IQ and GPA, for women only]
[Scatter plot: Relation between IQ and GPA, for men only]
[Bar chart: Average of GPA and Average of IQ by Gender]

Gender         Average of GPA   Average of IQ
0 (female)     7.696548387      105.8387097
1 (male)       7.281638298      110.9574468
Grand Total    7.446538462      108.9230769

Page 39: Statistics and Quantitative Analysis U4320

Comparing Models: Example (cont.)

- Hypothesis Testing:
- Hypotheses concerning coefficients

H0: β1 = 0    Ha: β1 ≠ 0    (and similarly for β2)

- Hypotheses concerning models
  - We want to know if IQ and Gender explain a significant amount of the variation in GPA.

H0: β1 = β2 = 0    Ha: at least one of β1, β2 ≠ 0

Page 40: Statistics and Quantitative Analysis U4320

Comparing Models: Example (cont.)

- Estimation
- Model I:  GPA = b0 + b1 IQ

SUMMARY OUTPUT    Dependent Variable: GPA

Regression Statistics
Multiple R           0.63
R Square             0.40
Adjusted R Square    0.39
Standard Error       1.63
Observations        78.00

ANOVA
             df      SS       MS       F       Significance F
Regression    1    136.32   136.32   51.01     4.7373E-10
Residual     76    203.11     2.67
Total        77    339.43

             Coefficients   Standard Error   t Stat   P-value
Intercept       -3.56           1.552        -2.29    0.024658962
IQ               0.10           0.014         7.14    4.7373E-10

Estimated equation:  GPA = −3.56 + 0.10 IQ

[Scatter plot: Relation between IQ and GPA with fitted line y = 0.101x − 3.5571]

Page 41: Statistics and Quantitative Analysis U4320

Comparing Models: Example (cont.)

- Model II:  GPA = b0 + b1 IQ + b2 Gender

SUMMARY OUTPUT    Dependent Variable: GPA

Regression Statistics
Multiple R           0.67
R Square             0.45
Adjusted R Square    0.44
Standard Error       1.58
Observations        78.00

ANOVA
             df      SS       MS       F       Significance F
Regression    2    153.16    76.58    30.84    0.00
Residual     75    186.27     2.48
Total        77    339.43

             Coefficients   Standard Error   t Stat   P-value
Intercept       -3.73           1.50         -2.49    0.01
IQ               0.11           0.01          7.77    0.00
Gender          -0.97           0.37         -2.60    0.01

Estimated equation:  GPA = −3.73 + 0.11 IQ − 0.97 Gender

[Scatter plot: Relation between IQ and GPA showing the Model I line (GPA = −3.5571 + 0.101 IQ) and the Model II line (GPA = −3.73 + 0.11 IQ − 0.97 Gender)]

Page 42: Statistics and Quantitative Analysis U4320

Comparing Models: Example (cont.)

- F-test: Does Model II perform significantly better than Model I?

F = [(SSE1 − SSE2) / m] / [SSE2 / (n − k)]
  = [(203.11 − 186.27) / 1] / [186.27 / (78 − 3)]
  ≈ 6.8

- The 5% critical value of F(1, 78 − 3) is about 4.0, and 6.8 > 4.0.
- Yes, it does: adding Gender significantly improves the model.
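For reference, a sketch of running the same nested-model F-test in software, assuming (hypothetically) that the 78 student records are in a pandas DataFrame df with columns GPA, IQ, and Gender:

import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

# df: hypothetical DataFrame with columns GPA, IQ, Gender (see lead-in above)
model_1 = smf.ols("GPA ~ IQ", data=df).fit()             # Model I
model_2 = smf.ols("GPA ~ IQ + Gender", data=df).fit()    # Model II

# anova_lm on two nested fits reports the F statistic for the added variable
print(anova_lm(model_1, model_2))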

Page 43: Statistics and Quantitative Analysis U4320

Comparing Models: Example (cont.)

- Interactive Terms

Regression Statistics
Multiple R           0.675
R Square             0.456
Adjusted R Square    0.433
Standard Error       1.585
Observations        77.000

ANOVA
             df      SS        MS       F        Significance F
Regression    3    153.526    51.175   20.363    0.000
Residual     73    183.456     2.513
Total        76    336.982

             Coefficients   Standard Error   t Stat   P-value
Intercept      -2.196          2.169         -1.013   0.315
IQ              0.093          0.020          4.583   0.000
Gender         -3.899          3.055         -1.276   0.206
IQ*Gender       0.027          0.028          0.974   0.333

- The interactive term is not statistically significant: a high or low IQ has the same effect on GPA independent of gender.
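A matching sketch for the interaction specification, under the same hypothetical DataFrame df:

import statsmodels.formula.api as smf

# "IQ * Gender" expands to IQ + Gender + IQ:Gender, i.e. the main effects plus
# the interactive term whose t statistic is examined above.
interaction_model = smf.ols("GPA ~ IQ * Gender", data=df).fit()   # df: hypothetical data
print(interaction_model.summary())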

Page 44: Statistics and Quantitative Analysis U4320

Comparing Models: Example (cont.)

- Interpretation
- Coefficients
  - Both IQ and Gender matter.
  - Each additional IQ point increases GPA by .11 points, holding Gender constant.
  - Being male (Gender = 1) decreases GPA by .97 points, holding IQ constant.
- Models
  - The F-statistic shows that the model that includes Gender performs significantly better in explaining variation than does the model with only IQ.
  - We are therefore able to reject, at the 5% significance level, the null hypothesis that the two models explain the same amount of variation.

Page 45: Statistics and Quantitative Analysis U4320

Final Paper

- Clearly state your hypothesis.
  - Use a path diagram to present the causal relation.
  - Use the correlations to help you determine what causes what.
  - State the alternative hypothesis.
- Present descriptive statistics.
  - This includes a correlation matrix and a histogram or scatter plot.
- Estimate your model.
  - You can do simple regression, include interactive terms, do path analysis, or use dummy variables; whatever is appropriate to your hypothesis.
- Present your results.
- Interpret your results.
- Draw out the policy implications of your analysis.
- The paper should begin with a brief summary that states the basic project and your main findings.