Upload: allyson-charles

Post on 20-Jan-2018

Stata – be the master

Stata

“After I have run my standard commands, what can I do to make my model better (and understand better what is going on)?”

Using dummies with interval variables can help improve fit

- Create two extra dummies: one for here and one for here
- Or (typically when you have a lot of data points): create dummies per group
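A minimal sketch of both options, assuming Stata's bundled auto dataset. The cut-offs 15 and 30 and the names mpg_low, mpg_high and mpg_grp are illustrative assumptions, not values from the slides:

```stata
sysuse auto, clear

* Baseline: the interval variable enters linearly
regress price mpg

* Option 1: two extra dummies for ranges where the line fits badly
* (cut-offs 15 and 30 are hypothetical)
gen byte mpg_low  = (mpg < 15) if !missing(mpg)
gen byte mpg_high = (mpg > 30) if !missing(mpg)
regress price mpg mpg_low mpg_high

* Option 2: dummies per group (here four roughly equal-sized groups)
egen mpg_grp = cut(mpg), group(4)
regress price i.mpg_grp
```

The i. prefix needs Stata 11 or later; on older versions, tabulate mpg_grp, generate(mpg_g) creates the dummies explicitly.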

Variables need not be normally distributed … but it is often nice if they are

(and gladder price will give you a graphical representation as well)
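A short sketch of how this might look in practice, again assuming the bundled auto dataset. ladder reports a normality test for each transformation on the ladder of powers, and gladder draws the matching histograms. The log transform at the end is only an example of acting on that output, not a recommendation from the slides:

```stata
sysuse auto, clear

* Normality test for each ladder-of-powers transformation of price
ladder price

* The same transformations as a panel of histograms
gladder price

* Example only: keep a log-transformed copy (ln_price is a hypothetical name)
gen ln_price = ln(price)
```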

interact.ado
• A command to generate interaction effects
• Centers interval variables automatically (and that’s important)

interact var1 var2, gen(var1_X_var2)
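The internals of interact.ado are not shown on the slides. As a rough sketch of what centering before interacting means, here is the same idea with built-in commands only, assuming the auto dataset and hypothetical variable names:

```stata
sysuse auto, clear

* Center each interval variable at its sample mean
summarize mpg, meanonly
gen mpg_c = mpg - r(mean)
summarize weight, meanonly
gen weight_c = weight - r(mean)

* The interaction is the product of the centered variables
gen mpg_X_weight = mpg_c * weight_c

regress price mpg_c weight_c mpg_X_weight
```

With centered variables, the coefficient on mpg_c is the effect of mpg at average weight instead of at weight zero.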

Installation:
+ Download diagfiles.zip online
+ Put files in some folder
+ Add that folder to adopath (adopath + "/folderpath")
(+ Add this adopath statement to “profile.do”)

Interpreting interactions: when you have interactions,

“there are no main effects any more”
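One way to see this: in y = b0 + b1*x1 + b2*x2 + b3*x1*x2, the effect of x1 is b1 + b3*x2, so b1 on its own is only the effect at x2 = 0. A sketch using built-in factor-variable notation and margins (Stata 11 or later), assuming the auto dataset; the three weight values are arbitrary illustration points:

```stata
sysuse auto, clear

* ## adds both variables and their product
regress price c.mpg##c.weight

* The effect of mpg is not one number: evaluate it at several weights
margins, dydx(mpg) at(weight = (2000 3000 4000))
```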

Potential transformations - fracpoly

… and there are several options, for instance to decide on the space of searched transformations

fracplot shows the estimated shape
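A hedged sketch of a call, assuming the auto dataset and the prefix syntax of Stata 11/12 (older releases put the regression command directly after fracpoly, and newer releases document fp as the replacement). degree() is one of the options that limit the search:

```stata
sysuse auto, clear

* Search fractional polynomials in mpg with at most two power terms
fracpoly, degree(2): regress price mpg

* Plot the fitted shape together with the data
fracplot
```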

Finding outliers - diag2.ado

(but only possible after regress, and you have to keep thinking yourself!)
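The output of diag2.ado is not reproduced in the transcript. As a stand-in, the built-in diagnostics after regress cover similar ground; this sketch assumes the auto dataset, and the cut-off of 2 is a common rule of thumb, not a rule from the slides:

```stata
sysuse auto, clear
regress price mpg weight

* Studentized residuals, Cook's distance, and leverage
predict rstud, rstudent
predict cook, cooksd
predict lev, leverage

* Candidates to look at (and then keep thinking yourself)
list make price rstud cook lev if abs(rstud) > 2 & !missing(rstud)

* Leverage against squared residuals, labelled by car
lvr2plot, mlabel(make)
```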

The better way to find outliers in logit: ldfbeta (“findit ldfbeta”)

Note: actually not completely correct. Better (but more tedious) is to standardize the X-variables first.
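ldfbeta is user-written and its syntax is not shown here, so the sketch below swaps in the built-in dbeta statistic (Pregibon's influence measure) after logit. It assumes the auto dataset and hypothetical names, and standardizes the X-variables first as the note suggests:

```stata
sysuse auto, clear

* Standardize the X-variables (mean 0, sd 1)
egen z_mpg    = std(mpg)
egen z_weight = std(weight)

logit foreign z_mpg z_weight

* Influence of each observation on the coefficient vector
predict db, dbeta

* Inspect the most influential observations
gsort -db
list make foreign db in 1/5
```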

Other possibilities …

• Try to find a subset of your data for which your model works better / differently (typically easier when you know something substantive about the topic)

• Consider sequences of models, instead of focusing on “the best model”:

Sequences of models (easiest when you do not have that many variables)
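A sketch of one way to lay out such a sequence side by side, assuming the auto dataset; the choice and order of variables are illustrative only:

```stata
sysuse auto, clear

regress price mpg
estimates store m1

regress price mpg weight
estimates store m2

regress price mpg weight foreign
estimates store m3

* Coefficients and standard errors of all three models in one table
estimates table m1 m2 m3, b(%9.2f) se stats(N r2_a)
```

Watching how a coefficient moves as variables are added often says more than the final model alone.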

Handy bits of coding

global VARS var1 var2 var3 …
reg y $VARS

forvalues i = 1/10 {
    gen var`i' = (varindata == `i')
}
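A runnable version of both snippets, assuming the auto dataset (the variable choices are illustrative). The if qualifier is an addition: without it, observations with a missing value would get 0 on every dummy instead of missing:

```stata
sysuse auto, clear

* A global macro holding the variable list
global VARS mpg weight length
regress price $VARS

* One dummy per value of rep78 (1 to 5); missing stays missing
forvalues i = 1/5 {
    gen byte rep`i' = (rep78 == `i') if !missing(rep78)
}
```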

Granddad talking:

More buttons get rid of determination …

zebra

squeeze, but be honest

To Do

• Back to your logistic regression assignment.

• Compare what others have done with the dataset that you had.

• Improve, squeeze, and deliver one assignment (make that a do-file) per data set.