
Count Models 1

Sociology 8811 Lecture 12

Copyright © 2007 by Evan Schofer
Do not copy or distribute without permission

Count Variables

• Many dependent variables are counts: Non-negative integers
• # crimes a person has committed in lifetime
• # children living in a household
• # new companies founded in a year (in an industry)
• # of social protests per month in a city
– Can you think of others?

Count Variables

• Count variables can be modeled with OLS regression… but:
– 1. Linear models can yield negative predicted values… whereas counts are never negative
• Similar to the problem of the Linear Probability Model
– 2. Count variables are often highly skewed
• Ex: # crimes committed this year… most people are zero or very low; a few people are very high
• Extreme skew violates the normality assumption of OLS regression.

Count Models

• Two most common count models:
• Poisson Regression Model
• Negative Binomial Regression Model
• Both based on the Poisson distribution:

$$P(y \mid \mu) = \frac{e^{-\mu}\,\mu^{y}}{y!}$$

• μ = expected count (and variance)
– Called lambda (λ) in some texts; I rely on Long & Freese 2006
• y = observed count
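As a quick numerical check on the Poisson formula, the sketch below evaluates P(y | μ) = e^(-μ) μ^y / y! using only Python's standard library. The function name `poisson_pmf` and the value μ = 2.0 are illustrative assumptions, not part of the lecture.

```python
import math

def poisson_pmf(y: int, mu: float) -> float:
    """Probability of observing count y when the expected count is mu."""
    return math.exp(-mu) * mu ** y / math.factorial(y)

mu = 2.0  # hypothetical expected count
probs = [poisson_pmf(y, mu) for y in range(50)]

# Mean and variance implied by the probabilities; both should be close to mu.
mean = sum(y * p for y, p in enumerate(probs))
var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
print(round(mean, 6), round(var, 6))
```

The mean and variance come out equal, which is the property the Poisson regression model later relies on.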

Poisson Regression

• Strategy: Model log of μ as a function of Xs
• Quite similar to modeling log odds in logit
• Again, the log form avoids negative values

$$\ln \mu_i = \sum_{j=1}^{K} \beta_j X_{ij}$$

• Which can be written as:

$$\mu_i = e^{\sum_{j=1}^{K} \beta_j X_{ij}}$$
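To make the log link concrete, here is a minimal sketch that turns a linear predictor into an expected count. The coefficients and X values are hypothetical, and treating the first X as a constant term equal to 1 is an assumption made for the illustration.

```python
import math

# Hypothetical coefficients and one case's X values (illustration only).
betas = [0.5, -1.2, 0.03]
x_i = [1.0, 4.0, 10.0]   # first entry plays the role of a constant term

# Linear predictor: this is ln(mu_i), and it is free to be negative.
linear_predictor = sum(b * x for b, x in zip(betas, x_i))

# Exponentiating gives the expected count, which is always positive.
mu_i = math.exp(linear_predictor)
print(linear_predictor, mu_i)
```

Here the linear predictor is negative, yet the expected count stays positive, which is the point of modeling the log of μ rather than μ itself.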

Poisson Regression: Example

• Hours per week spent on web

[Figure: distribution of www hours per week; x-axis: www hours per week (0 to 50); y-axis: Density]

Poisson Regression: Web Use

• Output = similar to logistic regression

. poisson wwwhr male age educ lowincome babies

Poisson regression                                Number of obs   =       1552
                                                  LR chi2(5)      =     525.66
                                                  Prob > chi2     =     0.0000
Log likelihood = -8598.488                        Pseudo R2       =     0.0297

------------------------------------------------------------------------------
       wwwhr |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        male |   .3595968   .0210578    17.08   0.000     .3183242    .4008694
         age |  -.0097401   .0007891   -12.34   0.000    -.0112867   -.0081934
        educ |   .0205217    .004046     5.07   0.000     .0125917    .0284516
   lowincome |  -.1168778   .0236503    -4.94   0.000    -.1632316   -.0705241
      babies |  -.1436266   .0224814    -6.39   0.000    -.1876892   -.0995639
       _cons |   1.806489   .0641575    28.16   0.000     1.680743    1.932236
------------------------------------------------------------------------------

Men spend more time on the web than women

Number of young children in household reduces web use

Poisson Regression: Stata Output

• Stata output yields familiar statistics:
– Standard errors, z/t-values, and p-values for coefficient hypothesis tests
– Pseudo R-square for model fit
• Not a great measure… but gives a crude explained variance
– MLE log likelihood
– Likelihood ratio test: Chi-square and p-value
• Comparing to null model (constant only)
• Tests can also be conducted on nested models with Stata command “lrtest”.
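The fit statistics in the web-use output are tied together by simple arithmetic. The sketch below works backward from the reported log likelihood and LR chi-square to the implied constant-only log likelihood, then recomputes the pseudo R-square as 1 minus the ratio of the two log likelihoods (McFadden's measure). The variable names are illustrative; the two input numbers are copied from the output.

```python
ll_model = -8598.488   # log likelihood of the fitted Poisson model
lr_chi2 = 525.66       # LR chi2(5) versus the constant-only model

# LR chi2 = 2 * (ll_model - ll_null), so the null log likelihood is implied.
ll_null = ll_model - lr_chi2 / 2

# Pseudo R-square: 1 - ll_model / ll_null.
pseudo_r2 = 1 - ll_model / ll_null
print(round(ll_null, 3), round(pseudo_r2, 4))
```

After rounding, the result agrees with the Pseudo R2 of 0.0297 shown in the output.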

Interpreting Coefficients

• In Poisson Regression, Y is typically conceptualized as a rate…

• Positive coefficients indicate higher rate; negative = lower rate

• Like logit, Poisson models are non-linear
• Coefficients don’t have a simple linear interpretation
• Like logit, model has a log form; exponentiation aids interpretation
• Exponentiated coefficients are multiplicative
• Analogous to odds ratios… but called “incidence rate ratios”.

Interpreting Coefficients

• Exponentiated coefficients: indicate effect of unit change of X on rate
• In Stata: “incidence rate ratios”: “poisson … , irr”
• e^b = 2.0 indicates that the rate doubles for each unit change in X
• e^b = .5 indicates that the rate drops by half for each unit change in X
• Recall: Exponentiated coefs are multiplicative
• If e^b = 5.0, the multiplier for a 2-point change in X isn’t 10; it is 5 * 5 = 25
– Also: you must invert to see opposite effects
• If e^b = 5.0, the multiplier for a 1-point decrease in X isn’t -5; it is 1/5 = .2
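The multiplicative logic can be checked in a few lines. This sketch uses a hypothetical coefficient chosen so that e^b = 5.0, matching the slide's example.

```python
import math

b = math.log(5.0)        # hypothetical coefficient chosen so that e^b = 5.0
irr = math.exp(b)        # rate ratio for a 1-unit increase in X

two_unit_increase = irr ** 2   # same as math.exp(2 * b): multipliers compound
one_unit_decrease = 1 / irr    # same as math.exp(-b): invert for the opposite effect
print(two_unit_increase, one_unit_decrease)
```

In general, a k-unit change in X multiplies the rate by e^(b·k), which is why a decrease (negative k) gives the reciprocal.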

Interpreting Coefficients

• Again, exponentiated coefficients (rate ratios) can be converted to % change
• Formula: (e^b - 1) * 100%
• Ex: if e^b = .5, then (.5 - 1) * 100% = -50%, a 50% decrease in rate.

Interpreting Coefficients

• Exponentiated coefficients yield multiplier:

. poisson wwwhr male age educ lowincome babies

Poisson regression                                Number of obs   =       1552
                                                  LR chi2(5)      =     525.66
                                                  Prob > chi2     =     0.0000
Log likelihood = -8598.488                        Pseudo R2       =     0.0297

------------------------------------------------------------------------------
       wwwhr |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        male |   .3595968   .0210578    17.08   0.000     .3183242    .4008694
         age |  -.0097401   .0007891   -12.34   0.000    -.0112867   -.0081934
        educ |   .0205217    .004046     5.07   0.000     .0125917    .0284516
   lowincome |  -.1168778   .0236503    -4.94   0.000    -.1632316   -.0705241
      babies |  -.1436266   .0224814    -6.39   0.000    -.1876892   -.0995639
       _cons |   1.806489   .0641575    28.16   0.000     1.680743    1.932236
------------------------------------------------------------------------------

Exponentiation of .359 = 1.43; Rate is 1.43 times higher for men

(1.43-1) * 100 = 43% more

Exp(-.14) = .87. Each baby reduces rate by factor of .87

(.87-1) * 100 = 13% less
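The same conversion can be applied to every coefficient at once. The sketch below copies the Poisson coefficients from the output above and computes each incidence rate ratio and the corresponding percent change with (e^b - 1) * 100; the dictionary names are illustrative.

```python
import math

# Poisson coefficients copied from the web-use output.
coefs = {"male": 0.3595968, "age": -0.0097401, "educ": 0.0205217,
         "lowincome": -0.1168778, "babies": -0.1436266}

irr_by_name = {name: math.exp(b) for name, b in coefs.items()}

for name, irr in irr_by_name.items():
    pct_change = (irr - 1) * 100   # percent change in rate per unit of X
    print(f"{name:10s} IRR = {irr:.2f}  change = {pct_change:+.1f}%")
```

The values for male and babies round to the 1.43 and .87 quoted on the slide.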

Predicted Counts

• Stata “predict varname, n” computes predicted value for each case

. predict predwww if e(sample), n

. list wwwhr predwww if e(sample)

     +------------------+
     | wwwhr    predwww |
     |------------------|
  1. |     1   5.659943 |
  2. |     3   7.090338 |
  3. |     2   5.281404 |
 12. |     5    6.09473 |
 13. |     4   6.968055 |
 15. |     3   5.815624 |
 16. |     0   5.539187 |
 19. |     0   7.207257 |
 20. |     8    8.03906 |
 21. |     5   4.400002 |
 23. |     1    6.77004 |
 24. |     1   4.806245 |
 25. |     8   5.710855 |
 27. |    12   3.687142 |
 33. |    40   4.997193 |
     +------------------+

Some of the predictions are close to the observed values…

Many of the predictions are quite bad…

Recall that the model fit was VERY poor!

Predicted Probabilities

• Stata extension “prcount” can compute probabilities for each possible count outcome
• For all cases, or for particular groups
• It plugs values (m), Xs, & bs into formula:

$$\Pr(y = m \mid X) = \frac{e^{-\mu}\,\mu^{m}}{m!}, \qquad \mu = e^{\sum_{j=1}^{K} \beta_j X_{j}}$$

Rate:        5.7446   [ 5.6238,  5.8655]
Pr(y=0|x):   0.0032   [ 0.0028,  0.0036]
Pr(y=1|x):   0.0184   [ 0.0165,  0.0202]
Pr(y=2|x):   0.0528   [ 0.0486,  0.0570]
Pr(y=3|x):   0.1011   [ 0.0953,  0.1069]
Pr(y=4|x):   0.1452   [ 0.1399,  0.1505]
Pr(y=5|x):   0.1668   [ 0.1642,  0.1694]
Pr(y=6|x):   0.1597   [ 0.1589,  0.1606]
Pr(y=7|x):   0.1311   [ 0.1276,  0.1345]
Pr(y=8|x):   0.0941   [ 0.0897,  0.0986]
Pr(y=9|x):   0.0601   [ 0.0560,  0.0642]

         male        age       educ  lowincome     babies
x=   .4503866  40.992912  14.345361   .7371134  .20296392
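The rate and probabilities above can be reproduced by hand from the Poisson coefficients and the x values listed under the output. The sketch below is only a check of the arithmetic, not a replacement for prcount (it does not compute the confidence intervals); the variable names are illustrative.

```python
import math

# Coefficients from the Poisson output and the x values listed above.
betas = {"male": 0.3595968, "age": -0.0097401, "educ": 0.0205217,
         "lowincome": -0.1168778, "babies": -0.1436266}
cons = 1.806489
x = {"male": 0.4503866, "age": 40.992912, "educ": 14.345361,
     "lowincome": 0.7371134, "babies": 0.20296392}

# Expected rate: mu = exp(constant + sum of b * x).
mu = math.exp(cons + sum(betas[k] * x[k] for k in betas))

# Poisson probability of each count m = 0..9 at that rate.
probs = [math.exp(-mu) * mu ** m / math.factorial(m) for m in range(10)]

print(f"Rate: {mu:.4f}")
for m, p in enumerate(probs):
    print(f"Pr(y={m}|x): {p:.4f}")
```

Up to rounding in the reported coefficients, the rate and the probabilities agree with the listing.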

Issue: Exposure

• Poisson outcome variables are typically conceptualized as rates

• Web hours per week
• Number of crimes committed in past year
• Issue: Cases may vary in exposure to “risk” of a given outcome
• To properly model rates, we must account for the fact that some cases have greater exposure than others
• Ex: # crimes committed in lifetime
– Older people have greater opportunity to have higher counts
• Alternately, exposure may vary due to research design
– Ex: Some cases followed for longer time than others…

Issue: Exposure

• Poisson (and other count models) can address varying exposure:

$$\mu_i t_i = e^{\sum_{j=1}^{K} \beta_j X_{ij} + \ln(t_i)}$$

• Where t_i = exposure time for case i
• It is easy to incorporate into Stata, too:
• Ex: poisson NumCrimes SES income, exposure(age)
• Note: Also works with other “count” models.
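Adding ln(t_i) to the linear predictor is the same as multiplying the per-unit rate by the exposure time. A minimal sketch, using a hypothetical linear predictor and hypothetical exposure times:

```python
import math

xb = -2.0                   # hypothetical linear predictor: ln(rate per unit of exposure)
exposures = [20.0, 40.0]    # hypothetical exposure times t_i (e.g., years at risk)

# Offset form: ln(t_i) enters the exponent with its coefficient fixed at 1.
expected = [math.exp(xb + math.log(t)) for t in exposures]

# Equivalent form: rate multiplied by exposure.
rate_times_t = [t * math.exp(xb) for t in exposures]
print(expected, rate_times_t)
```

With the same covariates, doubling the exposure doubles the expected count, which is why the coefficient on ln(t_i) is constrained to 1 rather than estimated.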

Poisson Model Assumptions

• Poisson regression makes a big assumption: That the variance of y equals μ (“equidispersion”)
• In other words, the mean and variance are the same
• This assumption is often not met in real data
• Dispersion is often greater than μ: overdispersion
– Consequence of overdispersion: Standard errors will be underestimated

• Potential for overconfidence in results; rejecting H0 when you shouldn’t!

• Note: overdispersion doesn’t necessarily affect predicted counts (compared to alternative models).

Poisson Model Assumptions

• Overdispersion is most often caused by highly skewed dependent variables
– Often due to variables with high numbers of zeros
• Ex: Number of traffic tickets per year
• Most people have zero, some can have 50!
• Mean of variable is low, but SD is high
– Other examples of skewed outcomes
• # of scholarly publications
• # cigarettes smoked per day
• # riots per year (for sample of cities in US).
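A rough first look at overdispersion is to compare the mean and variance of the outcome. The sketch below uses made-up ticket counts (not data from the lecture). Note that it checks the raw, unconditional distribution, whereas the Poisson regression assumption concerns the variance given the Xs, so treat it as a warning sign rather than a formal test.

```python
import statistics

# Hypothetical traffic tickets per year: mostly zeros, a few large values.
tickets = [0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 2, 0, 0, 0, 1, 0, 0, 12, 0, 30]

mean = statistics.mean(tickets)
variance = statistics.variance(tickets)   # sample variance
ratio = variance / mean                   # roughly 1 under equidispersion
print(mean, round(variance, 2), round(ratio, 1))
```

A variance many times the mean, as here, is the pattern the slide describes: a low mean combined with a high SD.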

Negative Binomial Regression

• Strategy: Modify the Poisson model to address overdispersion

• Add an “error” term to the basic model:

$$\tilde{\mu}_i = e^{\sum_{j=1}^{K} \beta_j X_{ij} + \varepsilon_i}$$

• Additional model assumptions:
• Expected value of exponentiated error = 1 (E(e^ε) = 1)
• Exponentiated error is Gamma distributed
• We hope that these assumptions are more plausible than the equidispersion assumption!

Negative Binomial Regression

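• Full negative binomial model:

$$\Pr(y \mid X) = \frac{\Gamma(y + \alpha^{-1})}{y!\,\Gamma(\alpha^{-1})} \left(\frac{\alpha^{-1}}{\alpha^{-1} + \mu}\right)^{\alpha^{-1}} \left(\frac{\mu}{\alpha^{-1} + \mu}\right)^{y}$$

• Note that the model incorporates a new parameter: α
• Alpha represents the extent of overdispersion
• If α = 0 the model reduces to simple Poisson regression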
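The role of alpha can be seen numerically. The sketch below implements the negative binomial probability in the mean/dispersion form (gamma functions, α⁻¹ terms) and compares it with the Poisson probability at the same mean. The function names and the values μ = 3.0, α = 1.0, and α = 1e-6 are illustrative assumptions; the formula is undefined at exactly α = 0, so a very small α stands in for the limit.

```python
import math

def negbin_pmf(y: int, mu: float, alpha: float) -> float:
    """Negative binomial probability of count y with mean mu and dispersion alpha > 0."""
    inv = 1.0 / alpha
    # Work on the log scale so the gamma functions do not overflow.
    log_p = (math.lgamma(y + inv) - math.lgamma(y + 1) - math.lgamma(inv)
             + inv * math.log(inv / (inv + mu))
             + y * math.log(mu / (inv + mu)))
    return math.exp(log_p)

def poisson_pmf(y: int, mu: float) -> float:
    """Poisson probability of count y with mean mu."""
    return math.exp(-mu) * mu ** y / math.factorial(y)

mu = 3.0  # hypothetical expected count
nb_overdispersed = [negbin_pmf(y, mu, alpha=1.0) for y in range(6)]
nb_near_poisson = [negbin_pmf(y, mu, alpha=1e-6) for y in range(6)]
poisson = [poisson_pmf(y, mu) for y in range(6)]
```

With α = 1.0 the model puts far more probability on zero than Poisson does at the same mean; as α shrinks toward 0 the two sets of probabilities become practically indistinguishable.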

Negative Binomial Regression

• Question: Is alpha (α) = 0?
• If so, we can use Poisson regression
• If not, overdispersion is present; Poisson is inadequate
• Strategy: conduct a statistical test of the hypothesis: H0: α = 0; H1: α > 0
• Stata provides this information when you run a negative binomial model:
• Likelihood ratio test (G²) for alpha
• P-value < .05 indicates that overdispersion is present; negative binomial is preferred
• If P > .05, just use Poisson regression
– So you don’t have to make assumptions about gamma dist….

Negative Binomial Regression

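• Interpreting coefficients: Identical to Poisson regression
• Predicted probabilities: Can be done. You must use big Neg Binomial formula
• Plugging in observed Xs, estimates of α, βs…

$$\widehat{\Pr}(y \mid X) = \frac{\Gamma(y + \hat{\alpha}^{-1})}{y!\,\Gamma(\hat{\alpha}^{-1})} \left(\frac{\hat{\alpha}^{-1}}{\hat{\alpha}^{-1} + \hat{\mu}}\right)^{\hat{\alpha}^{-1}} \left(\frac{\hat{\mu}}{\hat{\alpha}^{-1} + \hat{\mu}}\right)^{y}$$

• Probably best to get Stata to do this one…
• Long & Freese created command: prvalue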

Negative Binomial Example: Web Use

• Note: Bs are similar but SEs change a lot!

Negative binomial regression                      Number of obs   =       1552
                                                  LR chi2(5)      =      57.80
                                                  Prob > chi2     =     0.0000
Log likelihood = -4368.6846                       Pseudo R2       =     0.0066

------------------------------------------------------------------------------
       wwwhr |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        male |   .3617049   .0634391     5.70   0.000     .2373666    .4860433
         age |  -.0109788   .0024167    -4.54   0.000    -.0157155    -.006242
        educ |   .0171875   .0120853     1.42   0.155    -.0064992    .0408742
   lowincome |  -.0916297   .0724074    -1.27   0.206    -.2335457    .0502862
      babies |  -.1238295   .0624742    -1.98   0.047    -.2462767   -.0013824
       _cons |   1.881168   .1966654     9.57   0.000     1.495711    2.266625
-------------+----------------------------------------------------------------
    /lnalpha |   .2979718   .0408267                       .217953    .3779907
-------------+----------------------------------------------------------------
       alpha |   1.347124   .0549986                      1.243529    1.459349
------------------------------------------------------------------------------
Likelihood-ratio test of alpha=0:  chibar2(01) = 8459.61 Prob>=chibar2 = 0.000

Note: Standard Error for education increased from .004 to .012! Effect is no longer statistically significant.
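To see why the larger standard error matters, the sketch below recomputes the z statistic (coefficient divided by standard error) and a two-sided p-value for educ under both models, using the numbers reported in the two outputs. The helper name `z_and_p` is illustrative, and the p-value uses a standard normal reference via `math.erfc`.

```python
import math

def z_and_p(coef: float, se: float):
    """Return the z statistic and its two-sided p-value under a standard normal."""
    z = coef / se
    p = math.erfc(abs(z) / math.sqrt(2))   # equals 2 * (1 - Phi(|z|))
    return z, p

z_pois, p_pois = z_and_p(0.0205217, 0.004046)    # educ, Poisson output
z_nb, p_nb = z_and_p(0.0171875, 0.0120853)       # educ, negative binomial output
print(round(z_pois, 2), round(z_nb, 2), round(p_nb, 3))
```

The coefficient barely moves between the two models, but tripling the standard error cuts z from about 5 to about 1.4, which is what pushes the p-value above .05.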

Negative Binomial Example: Web Use

• Note: Info on overdispersion is provided

Negative binomial regression                      Number of obs   =       1552
                                                  LR chi2(5)      =      57.80
                                                  Prob > chi2     =     0.0000
Log likelihood = -4368.6846                       Pseudo R2       =     0.0066

------------------------------------------------------------------------------
       wwwhr |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        male |   .3617049   .0634391     5.70   0.000     .2373666    .4860433
         age |  -.0109788   .0024167    -4.54   0.000    -.0157155    -.006242
        educ |   .0171875   .0120853     1.42   0.155    -.0064992    .0408742
   lowincome |  -.0916297   .0724074    -1.27   0.206    -.2335457    .0502862
      babies |  -.1238295   .0624742    -1.98   0.047    -.2462767   -.0013824
       _cons |   1.881168   .1966654     9.57   0.000     1.495711    2.266625
-------------+----------------------------------------------------------------
    /lnalpha |   .2979718   .0408267                       .217953    .3779907
-------------+----------------------------------------------------------------
       alpha |   1.347124   .0549986                      1.243529    1.459349
------------------------------------------------------------------------------
Likelihood-ratio test of alpha=0:  chibar2(01) = 8459.61 Prob>=chibar2 = 0.000

Alpha is clearly > 0! Overdispersion is evident; LR test p<.05

You should not use Poisson Regression in this case
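The overdispersion statistics in the output can be recovered from numbers already shown. The sketch below computes the likelihood-ratio statistic for alpha = 0 as twice the difference between the negative binomial and Poisson log likelihoods, and recovers alpha by exponentiating the reported /lnalpha. Variable names are illustrative; all inputs are copied from the two outputs.

```python
import math

ll_poisson = -8598.488     # log likelihood from the Poisson output
ll_negbin = -4368.6846     # log likelihood from the negative binomial output

# LR statistic for H0: alpha = 0 is twice the gain in log likelihood.
lr_alpha = 2 * (ll_negbin - ll_poisson)

# Stata estimates ln(alpha); exponentiating recovers alpha itself.
alpha = math.exp(0.2979718)
print(round(lr_alpha, 2), round(alpha, 6))
```

Both values line up with the chibar2(01) and alpha rows of the output.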

General Remarks

• Poisson & Negative binomial models suffer from all the same basic issues as “normal” regression
• Model specification / omitted variable bias
• Multicollinearity
• Outliers/influential cases
– Also, they use Maximum Likelihood estimation
• N > 500 = fine; N < 100 can be worrisome
– Results aren’t necessarily wrong if N < 100;
– But it is a possibility, and it is hard to know when problems crop up
• Plus ~10 cases per independent variable.

General Remarks

• It is often useful to try both Poisson and Negative Binomial models

• The latter allows you to test for overdispersion
• Use the LR test on alpha (α) to guide model choice
– If you don’t suspect overdispersion and alpha appears to be zero, use Poisson Regression
• It makes fewer assumptions
– Such as gamma-distributed error.

Example: Labor Militancy
Isaac & Christiansen 2002

Note: Results are presented as % change