Empirical Methods for Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013 William Greene Department of Economics Stern School of Business


Post on 25-Feb-2016



Page 1: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013

Empirical Methods for Microeconomic Applications

University of Lugano, Switzerland
May 27-31, 2013

William Greene
Department of Economics
Stern School of Business

Page 2: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013

2B. Heterogeneity: Latent Class and Mixed Models

Page 3: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013
Page 4: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013

Agenda for 2B
• Latent Class and Finite Mixtures
• Random Parameters
• Multilevel Models

Page 5: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013

Latent Classes
• A population contains a mixture of individuals of different types (classes)
• Common form of the data generating mechanism within the classes
• Observed outcome y is governed by the common process $F(y \mid x, \theta_j)$
• Classes are distinguished by the parameters, $\theta_j$.

Page 6: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013

How Finite Mixture Models Work

Density? Note significant mass below zero. Not a gamma or lognormal or any other familiar density.

Page 7: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013

Find the ‘Best’ Fitting Mixture of Two Normal Densities

$$\text{LogL} = \sum_{i=1}^{1000} \log \sum_{j=1}^{2} \pi_j \frac{1}{\sigma_j}\, \phi\!\left(\frac{y_i - \mu_j}{\sigma_j}\right)$$

Maximum Likelihood Estimates
            Class 1                   Class 2
      Estimate   Std. Error     Estimate   Std. Error
μ     7.05737    .77151         3.25966    .09824
σ     3.79628    .25395         1.81941    .10858
π      .28547    .05953          .71453    .05953

$$\hat{F}(y) = .28547\,\frac{1}{3.79628}\,\phi\!\left(\frac{y - 7.05737}{3.79628}\right) + .71453\,\frac{1}{1.81941}\,\phi\!\left(\frac{y - 3.25966}{1.81941}\right)$$
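A minimal sketch of how a two-component normal mixture like the one above could be fitted. The slides do not say which optimizer produced the estimates; this sketch uses the EM algorithm as one convenient way to climb the mixture log likelihood. The function names, the starting values, and the iteration count are all illustrative assumptions.

```python
import math

def normal_pdf(y, mu, sigma):
    """(1/sigma) * phi((y - mu)/sigma)."""
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def mixture_loglik(ys, pi1, mu, sigma):
    """LogL = sum_i log sum_j pi_j (1/sigma_j) phi((y_i - mu_j)/sigma_j)."""
    return sum(math.log(pi1 * normal_pdf(y, mu[0], sigma[0])
                        + (1.0 - pi1) * normal_pdf(y, mu[1], sigma[1]))
               for y in ys)

def fit_two_normals(ys, iters=200):
    """EM for a two-component normal mixture (assumed starting values)."""
    n = len(ys)
    srt = sorted(ys)
    mu = [srt[n // 4], srt[(3 * n) // 4]]      # start at the quartiles
    mean = sum(ys) / n
    sd = math.sqrt(sum((y - mean) ** 2 for y in ys) / n)
    sigma = [sd, sd]
    pi1 = 0.5
    for _ in range(iters):
        # E step: posterior probability that each y belongs to component 1
        w = []
        for y in ys:
            a = pi1 * normal_pdf(y, mu[0], sigma[0])
            b = (1.0 - pi1) * normal_pdf(y, mu[1], sigma[1])
            w.append(a / (a + b))
        # M step: weighted means, standard deviations, and mixing share
        n1 = sum(w)
        n2 = n - n1
        pi1 = n1 / n
        mu = [sum(wi * y for wi, y in zip(w, ys)) / n1,
              sum((1.0 - wi) * y for wi, y in zip(w, ys)) / n2]
        sigma = [math.sqrt(sum(wi * (y - mu[0]) ** 2 for wi, y in zip(w, ys)) / n1),
                 math.sqrt(sum((1.0 - wi) * (y - mu[1]) ** 2 for wi, y in zip(w, ys)) / n2)]
    return pi1, mu, sigma
```

Calling `fit_two_normals(ys)` returns the mixing probability of the first component together with the two means and the two standard deviations; `mixture_loglik` evaluates the log likelihood at any candidate values.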

Page 8: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013

Mixing probabilities .715 and .285

Page 9: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013

[Figure: the fitted approximation plotted against the actual distribution]

Page 10: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013

A Practical Distinction
• Finite Mixture (Discrete Mixture):
  • Functional form strategy
  • Component densities have no meaning
  • Mixing probabilities have no meaning
  • There is no question of “class membership”
  • The number of classes is uninteresting – enough to get a good fit
• Latent Class:
  • Mixture of subpopulations
  • Component densities are believed to be definable “groups” (Low Users and High Users in Bago d’Uva and Jones application)
  • The classification problem is interesting – who is in which class?
  • Posterior probabilities, P(class|y,x) have meaning
  • Question of the number of classes has content in the context of the analysis

Page 11: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013

The Latent Class Model

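(1) There are Q classes, unobservable to the analyst
(2) Class specific model: $f(y_{it} \mid \mathbf{x}_{it}, \text{class} = q) = g(y_{it}, \mathbf{x}_{it}, \boldsymbol{\beta}_q)$
(3) Conditional class probabilities $\pi_q$. Common multinomial logit form for prior class probabilities:

$$P(\text{class} = q \mid \boldsymbol{\delta}) = \pi_q = \frac{\exp(\delta_q)}{\sum_{q=1}^{Q} \exp(\delta_q)}, \quad \delta_Q = 0$$

$$\delta_q = \log(\pi_q / \pi_Q).$$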
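The multinomial logit form for the prior class probabilities can be sketched in a few lines. This is only an illustration of the normalization $\delta_Q = 0$; the function names are assumptions, not part of the slides.

```python
import math

def class_probs(delta_free):
    """Prior class probabilities pi_q = exp(delta_q) / sum_q exp(delta_q).

    delta_free holds delta_1, ..., delta_{Q-1}; delta_Q is fixed at 0.
    """
    delta = list(delta_free) + [0.0]
    top = max(delta)                      # subtract the max for numerical stability
    expd = [math.exp(d - top) for d in delta]
    total = sum(expd)
    return [e / total for e in expd]

def deltas_from_probs(pi):
    """Invert the mapping: delta_q = log(pi_q / pi_Q) for q = 1, ..., Q-1."""
    return [math.log(p / pi[-1]) for p in pi[:-1]]
```

Because the probabilities are built from unrestricted $\delta$ values, an optimizer can search over the $\delta_q$ freely while every $\pi_q$ stays in (0,1) and the $\pi_q$ sum to one.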

Page 12: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013

Log Likelihood for an LC Model

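Conditional density for each observation is

$$P(y_{i,t} \mid \mathbf{x}_{i,t}, \text{class} = q) = f(y_{it} \mid \mathbf{x}_{i,t}, \boldsymbol{\beta}_q)$$

Joint conditional density for $T_i$ observations is

$$f(y_{i1}, y_{i2}, \ldots, y_{i,T_i} \mid \mathbf{X}_i, \boldsymbol{\beta}_q) = \prod_{t=1}^{T_i} f(y_{it} \mid \mathbf{x}_{i,t}, \boldsymbol{\beta}_q)$$

($T_i$ may be 1. This is not only a 'panel data' model.)

Maximize this for each class if the classes are known. They aren't. Unconditional density for individual i is

$$f(y_{i1}, y_{i2}, \ldots, y_{i,T_i} \mid \mathbf{X}_i) = \sum_{q=1}^{Q} \pi_q \prod_{t=1}^{T_i} f(y_{it} \mid \mathbf{x}_{i,t}, \boldsymbol{\beta}_q)$$

Log Likelihood

$$\text{LogL}(\boldsymbol{\beta}_1, \ldots, \boldsymbol{\beta}_Q, \delta_1, \ldots, \delta_Q) = \sum_{i=1}^{N} \log \sum_{q=1}^{Q} \pi_q \prod_{t=1}^{T_i} f(y_{it} \mid \mathbf{x}_{i,t}, \boldsymbol{\beta}_q)$$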

Page 13: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013

Estimating Which Class

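Prior class probability: Prob[class = q] = $\pi_q$

Joint conditional density for $T_i$ observations is

$$P(y_{i1}, y_{i2}, \ldots, y_{i,T_i} \mid \mathbf{X}_i, \text{class} = q) = \prod_{t=1}^{T_i} f(y_{it} \mid \mathbf{x}_{i,t}, \boldsymbol{\beta}_q)$$

Joint density for data and class membership is the product

$$P(y_{i1}, y_{i2}, \ldots, y_{i,T_i}, \text{class} = q \mid \mathbf{X}_i) = \pi_q \prod_{t=1}^{T_i} f(y_{it} \mid \mathbf{x}_{i,t}, \boldsymbol{\beta}_q)$$

Posterior probability for class, given the data

$$P(\text{class} = q \mid y_{i1}, y_{i2}, \ldots, y_{i,T_i}, \mathbf{X}_i) = \frac{P(\mathbf{y}_i, \text{class} = q \mid \mathbf{X}_i)}{P(y_{i1}, y_{i2}, \ldots, y_{i,T_i} \mid \mathbf{X}_i)} = \frac{P(\mathbf{y}_i, \text{class} = q \mid \mathbf{X}_i)}{\sum_{q=1}^{Q} P(\mathbf{y}_i, \text{class} = q \mid \mathbf{X}_i)}$$

Use Bayes Theorem to compute the posterior (conditional) probability

$$w(q \mid \mathbf{y}_i, \mathbf{X}_i) = P(\text{class} = q \mid \mathbf{y}_i, \mathbf{X}_i) = \frac{\pi_q \prod_{t=1}^{T_i} f(y_{it} \mid \mathbf{x}_{i,t}, \boldsymbol{\beta}_q)}{\sum_{q=1}^{Q} \pi_q \prod_{t=1}^{T_i} f(y_{it} \mid \mathbf{x}_{i,t}, \boldsymbol{\beta}_q)} = w_{iq}$$

Best guess = the class with the largest posterior probability.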
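The Bayes theorem calculation for one individual can be sketched as follows. The Poisson class-specific density with a constant and one regressor is purely an illustrative assumption, as are the function names; any $f(y_{it} \mid \mathbf{x}_{it}, \boldsymbol{\beta}_q)$ could be passed in its place. The computation is done in logs to avoid underflow when $T_i$ is large.

```python
import math

def posterior_class_probs(log_f, ys, xs, betas, priors):
    """Posterior P(class = q | y_i, X_i) for one individual.

    log_f(y, x, beta) returns log f(y_it | x_it, beta_q).
    Returns (posterior weights w_iq, individual log likelihood contribution).
    """
    # log of pi_q * prod_t f(y_it | x_it, beta_q), one entry per class
    log_joint = [math.log(p) + sum(log_f(y, x, b) for y, x in zip(ys, xs))
                 for b, p in zip(betas, priors)]
    top = max(log_joint)
    # log of the unconditional density: log sum_q pi_q prod_t f(...)
    log_lik = top + math.log(sum(math.exp(v - top) for v in log_joint))
    weights = [math.exp(v - log_lik) for v in log_joint]
    return weights, log_lik

def poisson_logpmf(y, x, beta):
    """Illustrative class density: Poisson with lambda = exp(b0 + b1 * x)."""
    lam = math.exp(beta[0] + beta[1] * x)
    return y * math.log(lam) - lam - math.lgamma(y + 1)
```

The second return value is individual i's contribution to the latent class log likelihood, so summing it over individuals gives LogL, and the first is the vector of $w_{iq}$ used for classification.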

Page 14: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013

‘Estimating’ βi

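(1) Use $\hat{\boldsymbol{\beta}}_q$ from the class with the largest estimated probability
(2) Probabilistic - in the same spirit as the 'posterior mean'

$$\hat{\boldsymbol{\beta}}_i = \sum_{q=1}^{Q} \text{Posterior Prob}[\text{class} = q \mid \text{data}]\, \hat{\boldsymbol{\beta}}_q = \sum_{q=1}^{Q} \hat{w}_{iq}\, \hat{\boldsymbol{\beta}}_q$$

Note: This estimates $E[\boldsymbol{\beta}_i \mid \mathbf{y}_i, \mathbf{X}_i]$, not $\boldsymbol{\beta}_i$ itself.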

Page 15: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013

How Many Classes?
(1) Q is not a 'parameter' - can't 'estimate' Q along with $\boldsymbol{\beta}$ and the class probability parameters.
(2) Can't 'test' down or 'up' to Q by comparing log likelihoods. Degrees of freedom for Q+1 vs. Q classes is not well defined.
(3) Use AKAIKE IC; AIC = -2 logL + 2#Parameters.
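The AIC rule in (3) can be illustrated with a short sketch. The helper names are assumptions, and any log likelihood values and parameter counts passed in would come from fitting the model separately for each candidate Q.

```python
def aic(loglik, n_params):
    """AIC = -2 logL + 2 * (number of parameters)."""
    return -2.0 * loglik + 2.0 * n_params

def best_q(fits):
    """Pick the number of classes with the smallest AIC.

    fits maps Q -> (maximized log likelihood, number of parameters).
    """
    return min(fits, key=lambda q: aic(*fits[q]))
```

Adding a class always raises the maximized log likelihood, so the penalty term is what keeps this rule from always choosing the largest Q considered.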

Page 16: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013

The Extended Latent Class Model

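Class probabilities relate to observable variables (usually demographic factors such as age and sex).
(1) There are Q classes, unobservable to the analyst
(2) Class specific model: $f(y_{it} \mid \mathbf{x}_{it}, \text{class} = q) = g(y_{it}, \mathbf{x}_{it}, \boldsymbol{\beta}_q)$
(3) Conditional class probabilities given some information, $\mathbf{z}_i$. Common multinomial logit form for prior class probabilities:

$$P(\text{class} = q \mid \mathbf{z}_i, \boldsymbol{\delta}) = \pi_{iq} = \frac{\exp(\mathbf{z}_i'\boldsymbol{\delta}_q)}{\sum_{q=1}^{Q} \exp(\mathbf{z}_i'\boldsymbol{\delta}_q)}, \quad \boldsymbol{\delta}_Q = \mathbf{0}$$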

Page 17: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013

Unfortunately, this argument is incorrect.

Page 18: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013

Zero Inflation?

Page 19: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013

Zero Inflation – ZIP Models
• Two regimes: (Recreation site visits)
  • Zero (with probability 1). (Never visit site)
  • Poisson with Pr(0) = exp[-λi], where λi = exp(β′xi). (Number of visits, including zero visits this season.)
• Unconditional:
  • Pr[0] = P(regime 0) + P(regime 1)*Pr[0|regime 1]
  • Pr[j] = P(regime 1)*Pr[j|regime 1] for j > 0
• This is a “latent class model”
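The two unconditional probabilities above can be written as a small function. This is a sketch under the assumption that the regime probability and the Poisson mean are supplied directly; in an estimated ZIP model both would typically depend on covariates. The names are illustrative.

```python
import math

def zip_pmf(j, p_regime1, lam):
    """Unconditional Pr[j] in a zero inflated Poisson model.

    p_regime1 is P(regime 1), the Poisson regime; regime 0 always yields zero.
    lam is the Poisson mean in regime 1.
    """
    poisson = math.exp(-lam) * lam ** j / math.factorial(j)
    if j == 0:
        # a zero can come from either regime
        return (1.0 - p_regime1) + p_regime1 * poisson
    return p_regime1 * poisson
```

The zero outcome mixes two sources, which is why the model is a two-class latent class model: an observed zero does not reveal which regime generated it.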

Page 20: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013
Page 21: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013
Page 22: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013

A Latent Class Hurdle NB2 Model
• Analysis of ECHP panel data (1994-2001)
• Two class Latent Class Model
  • Typical in health economics applications
• Hurdle model for physician visits
  • Poisson hurdle for participation and negative binomial intensity given participation
  • Contrast to a negative binomial model

Page 23: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013
Page 24: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013

LC Poisson Regression for Doctor Visits

Page 25: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013

Heckman and Singer’s RE Model
• Random Effects Model
• Random Constants with Discrete Distribution

(1) There are Q classes, unobservable to the analyst
(2) Class specific model: $f(y_{it} \mid \mathbf{x}_{it}, \text{class} = q) = g(y_{it}, \mathbf{x}_{it}, \alpha_q, \boldsymbol{\beta})$
(3) Conditional class probabilities $\pi_q$. Common multinomial logit form for prior class probabilities to constrain all probabilities to (0,1) and ensure $\sum_{q=1}^{Q} \pi_q = 1$; multinomial logit form for class probabilities:

$$P(\text{class} = q \mid \boldsymbol{\delta}) = \pi_q = \frac{\exp(\delta_q)}{\sum_{j=1}^{Q} \exp(\delta_j)}, \quad \delta_Q = 0$$

Page 26: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013

3 Class Heckman-Singer Form

Page 27: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013

Modeling Obesity with a Latent Class Model

Mark Harris, Department of Economics, Curtin University

Bruce Hollingsworth, Department of Economics, Lancaster University

William Greene, Stern School of Business, New York University

Pushkar Maitra, Department of Economics, Monash University

Page 28: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013

Two Latent Classes: Approximately Half of European Individuals

Page 29: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013

An Ordered Probit Approach
A Latent Regression Model for “True BMI”

BMI* = β′x + ε, ε ~ N[0,σ2], σ2 = 1
“True BMI” = a proxy for weight is unobserved

Observation Mechanism for Weight Type
WT = 0 if BMI* < 0 (Normal)
     1 if 0 < BMI* < μ (Overweight)
     2 if μ < BMI* (Obese)
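The observation mechanism implies three cell probabilities once the index β′x and the threshold μ are given. A minimal sketch, assuming the normalization σ = 1 stated on the slide; the function name is illustrative.

```python
from statistics import NormalDist

Phi = NormalDist().cdf   # standard normal CDF

def weight_type_probs(xb, mu):
    """Ordered probit probabilities for WT = 0, 1, 2.

    xb is the index beta'x and mu > 0 is the upper threshold (sigma = 1).
    """
    p_normal = Phi(-xb)                       # P(BMI* < 0)
    p_overweight = Phi(mu - xb) - Phi(-xb)    # P(0 < BMI* < mu)
    p_obese = 1.0 - Phi(mu - xb)              # P(BMI* > mu)
    return p_normal, p_overweight, p_obese
```

The log likelihood for the ordered probit is the sum over individuals of the log of whichever of these three probabilities matches the observed weight type.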

Page 30: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013

Latent Class Modeling
• Several ‘types’ or ‘classes’. Obesity may be due to genetic reasons (the FTO gene) or lifestyle factors
• Distinct sets of individuals may have differing reactions to various policy tools and/or characteristics
• The observer does not know from the data which class an individual is in.
• Suggests a latent class approach for health outcomes (Deb and Trivedi, 2002, and Bago d’Uva, 2005)

Page 31: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013

Latent Class Application
• Two class model (considering FTO gene):
  • More classes make class interpretations much more difficult
  • Parametric models proliferate parameters
• Endogenous class membership: Two classes allow us to correlate the equations driving class membership and observed weight outcomes via unobservables.
• Theory for more than two classes not yet developed.

Page 32: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013

Endogeneity of Class Membership

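Class Membership: $C^* = \boldsymbol{\alpha}'\mathbf{z}_i + u_i$, $C = 1[C^* > 0]$ (Probit)

BMI | Class = 0,1: $BMI^* = \boldsymbol{\beta}_c'\mathbf{x}_i + \varepsilon_{c,i}$, BMI group = OP[$BMI^*$, $\mu_c(\mathbf{w}_i)$]

Endogeneity: $\begin{pmatrix} u_i \\ \varepsilon_{c,i} \end{pmatrix} \sim N\left[\begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 & \rho_c \\ \rho_c & 1 \end{pmatrix}\right]$

Bivariate Ordered Probit (one variable is binary).

Full information maximum likelihood.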

Page 33: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013

Outcome Probabilities
• Class 0 dominated by normal and overweight probabilities: the ‘normal weight’ class
• Class 1 dominated by probabilities at top end of the scale: the ‘non-normal weight’ class
• Unobservables for weight class membership are negatively correlated with those determining weight levels.

Page 34: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013

Classification (Latent Probit) Model

Page 35: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013

Inflated Responses in Self-Assessed Health

Mark Harris, Department of Economics, Curtin University

Bruce Hollingsworth, Department of Economics, Lancaster University

William Greene, Stern School of Business, New York University

Page 36: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013

SAH vs. Objective Health Measures
Favorable SAH categories seem artificially high.

• 60% of Australians are either overweight or obese (Dunstan et al., 2001)
• 1 in 4 Australians has either diabetes or a condition of impaired glucose metabolism
• Over 50% of the population has elevated cholesterol
• Over 50% has at least 1 of the “deadly quartet” of health conditions (diabetes, obesity, high blood pressure, high cholesterol)
• Nearly 4 out of 5 Australians have 1 or more long term health conditions (National Health Survey, Australian Bureau of Statistics 2006)
• Australia ranked #1 in terms of obesity rates

Similar results appear to hold for other countries

Page 37: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013

A Two Class Latent Class Model

True Reporter Misreporter

Page 38: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013

• Mis-reporters choose either good or very good
• The response is determined by a probit model

$$m^* = \boldsymbol{\beta}_m'\mathbf{x}_m + \varepsilon_m$$

Y=3

Y=2

Page 39: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013

Y=4

Y=3

Y=2

Y=1

Y=0

Page 40: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013

Observed Mixture of Two Classes

Page 41: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013

Pr(true,y) = Pr(true) * Pr(y | true)

Page 42: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013

Pr(y) = Pr(true) * Pr(y | true) + Pr(misreporter) * Pr(y | misreporter)

Page 43: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013
Page 44: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013

General Result

[Figure: bar chart over the SAH categories Poor, Fair, Good, Very Good, Excellent; vertical axis from 0 to 0.4; series: Sample, Predicted, Mis-Reporting]

Page 45: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013
Page 46: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013
Page 47: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013
Page 48: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013
Page 49: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013
Page 50: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013

RANDOM PARAMETER MODELS

Page 51: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013

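A Recast Random Effects Model

$$U_{it} = \alpha + \boldsymbol{\beta}'\mathbf{x}_{it} + u_i + \varepsilon_{it}, \quad u_i \sim N[0, \sigma_u^2]$$

$T_i$ = observations on individual i

For each period, $y_{it} = 1[U_{it} > 0]$ (given $u_i$)

Joint probability for $T_i$ observations is

$$\text{Prob}(y_{i1}, y_{i2}, \ldots \mid u_i) = \prod_{t=1}^{T_i} F(y_{it}, \alpha + \boldsymbol{\beta}'\mathbf{x}_{it} + u_i)$$

For convenience, write $u_i = \sigma_u v_i$, $v_i \sim N[0,1]$,

$$\log L \mid v_1, \ldots, v_N = \sum_{i=1}^{N} \log \prod_{t=1}^{T_i} F(y_{it}, (\alpha + \sigma_u v_i) + \boldsymbol{\beta}'\mathbf{x}_{it})$$

It is not possible to maximize $\log L \mid v_1, \ldots, v_N$ because of the unobserved random effects $v_i$ embedded in it.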

Page 52: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013

A Computable Log Likelihood

$$\log L = \sum_{i=1}^{N} \log \int \prod_{t=1}^{T_i} F(y_{it}, \alpha + \boldsymbol{\beta}'\mathbf{x}_{it} + \sigma_u v_i)\, f(v_i)\, dv_i$$

Maximize this function with respect to $\alpha$, $\boldsymbol{\beta}$, $\sigma_u$. (The unobserved heterogeneity is averaged out.)

How to compute the integral?
(1) Analytically? No, no formula exists.
(2) Approximately, using Gauss-Hermite quadrature
(3) Approximately using Monte Carlo simulation
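Option (2) can be sketched for the random effects probit. The code below approximates one individual's likelihood contribution with a 5-point Gauss-Hermite rule; the node and weight values are hard-coded standard constants for the weight function exp(-z²), and the change of variable v = √2·z converts the rule to an expectation over N[0,1]. The function name and the argument `xbs` (the precomputed β′x_it values) are assumptions, and 5 points is chosen only for brevity; production software typically uses more.

```python
import math
from statistics import NormalDist

Phi = NormalDist().cdf   # standard normal CDF

# 5-point Gauss-Hermite rule for integrals of exp(-z^2) * g(z)
GH_NODES = [-2.0201828704560856, -0.9585724646138185, 0.0,
            0.9585724646138185, 2.0201828704560856]
GH_WEIGHTS = [0.019953242059045913, 0.3936193231522412, 0.9453087204829419,
              0.3936193231522412, 0.019953242059045913]

def re_probit_lik(ys, xbs, alpha, sigma_u):
    """Approximate integral over v ~ N[0,1] of prod_t Phi(q_it * (alpha + xb_it + sigma_u * v)).

    ys are 0/1 outcomes, xbs the indexes beta'x_it, and q_it = 2*y_it - 1.
    """
    total = 0.0
    for z, w in zip(GH_NODES, GH_WEIGHTS):
        v = math.sqrt(2.0) * z            # change of variable v = sqrt(2) * z
        prod = 1.0
        for y, xb in zip(ys, xbs):
            prod *= Phi((2 * y - 1) * (alpha + xb + sigma_u * v))
        total += w * prod
    return total / math.sqrt(math.pi)     # weights sum to sqrt(pi)
```

Summing the log of `re_probit_lik` over individuals gives the computable log likelihood, which can then be handed to any numerical optimizer.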

Page 53: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013

Simulation

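$$\log L = \sum_{i=1}^{N} \log \int \left[\prod_{t=1}^{T_i} F(y_{it}, \alpha + \boldsymbol{\beta}'\mathbf{x}_{it} + \sigma_u v_i)\right] \frac{1}{\sqrt{2\pi}} \exp\!\left(-\frac{v_i^2}{2}\right) dv_i$$

$$= \sum_{i=1}^{N} \log \int g(v_i)\, \frac{1}{\sqrt{2\pi}} \exp\!\left(-\frac{v_i^2}{2}\right) dv_i$$

This equals $\sum_{i=1}^{N} \log E[g(v_i)]$. The expected value of the function of $v_i$ can be approximated by drawing R random draws $v_{ir}$ from the population N[0,1] and averaging the R functions of $v_{ir}$. We maximize

$$\log L_S = \sum_{i=1}^{N} \log \frac{1}{R} \sum_{r=1}^{R} \prod_{t=1}^{T_i} F(y_{it}, (\alpha + \sigma_u v_{ir}) + \boldsymbol{\beta}'\mathbf{x}_{it})$$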
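The simulated log likelihood can be sketched for the probit case, where F is the normal CDF evaluated at (2y − 1) times the index. The draws are passed in so that the same draws are reused at every trial parameter value, which keeps the simulated function smooth in the parameters. The function name, the data layout, and the use of whatever draws the caller supplies (pseudo-random rather than the Halton draws reported in the estimation output) are assumptions of this sketch.

```python
import math
from statistics import NormalDist

Phi = NormalDist().cdf   # standard normal CDF

def simulated_loglik(data, alpha, sigma_u, draws):
    """logL_S = sum_i log (1/R) sum_r prod_t Phi(q_it * ((alpha + sigma_u * v_ir) + xb_it)).

    data  : list of (ys, xbs) pairs, one per individual, xbs holding beta'x_it
    draws : list of lists, draws[i] holding the R standard normal draws v_ir
    """
    loglik = 0.0
    for (ys, xbs), v_i in zip(data, draws):
        avg = 0.0
        for v in v_i:
            prod = 1.0
            for y, xb in zip(ys, xbs):
                prod *= Phi((2 * y - 1) * ((alpha + sigma_u * v) + xb))
            avg += prod
        loglik += math.log(avg / len(v_i))
    return loglik
```

A caller would generate `draws` once, for example with `random.gauss(0.0, 1.0)` under a fixed seed, and then maximize `simulated_loglik` over α, β (through `xbs`), and σ_u.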

Page 54: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013

Random Effects Model: Simulation

Random Coefficients Probit Model
Dependent variable                 DOCTOR   (Quadrature Based)
Log likelihood function      -16296.68110   (-16290.72192)
Restricted log likelihood    -17701.08500
Chi squared [ 1 d.f.]          2808.80780
Simulation based on 50 Halton draws

Variable | Coefficient  Standard Error  b/St.Er.  P[|Z|>z]  (Quadrature Based)
         | Nonrandom parameters
AGE      |  .02226***   .00081          27.365    .0000     ( .02232)
EDUC     | -.03285***   .00391          -8.407    .0000     (-.03307)
HHNINC   |  .00673      .05105            .132    .8952     ( .00660)
         | Means for random parameters
Constant | -.11873**    .05950          -1.995    .0460     (-.11819)
         | Scale parameters for dists. of random parameters
Constant |  .90453***   .01128          80.180    .0000

Implied ρ from these estimates is .90453²/(1+.90453²) = .449998.

Page 55: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013

Recast the Entire Parameter Vector

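$$U_{it} = \boldsymbol{\beta}_i'\mathbf{x}_{it} + \varepsilon_{it}, \quad \boldsymbol{\beta}_i = \boldsymbol{\beta} + \mathbf{u}_i, \quad \mathbf{u}_i \sim N[\mathbf{0}, \text{diag}(\sigma_1^2, \ldots, \sigma_K^2)]$$

Joint probability for $T_i$ observations is

$$\text{Prob}(y_{i1}, y_{i2}, \ldots \mid \mathbf{u}_i) = \prod_{t=1}^{T_i} F(y_{it}, \boldsymbol{\beta}_i'\mathbf{x}_{it})$$

For convenience, write $u_{ik} = \sigma_k v_{ik}$, $v_{ik} \sim N[0,1]$,

$$\log L \mid \mathbf{v}_1, \ldots, \mathbf{v}_N = \sum_{i=1}^{N} \log \prod_{t=1}^{T_i} F(y_{it}, \boldsymbol{\beta}_i'\mathbf{x}_{it})$$

It is not possible to maximize $\log L \mid \mathbf{v}_1, \ldots, \mathbf{v}_N$ because of the unobserved random effects embedded in $\boldsymbol{\beta}_i$.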

Page 56: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013
Page 57: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013

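$\mathbf{u}_i \sim N[\mathbf{0}, \boldsymbol{\Sigma}]$

Cholesky Decomposition: $\boldsymbol{\Sigma} = \mathbf{L}\mathbf{L}'$ where $\mathbf{L}$ is lower triangular

$\mathbf{u}_i = \mathbf{L}\mathbf{v}_i$ with $\mathbf{v}_i \sim N[\mathbf{0}, \mathbf{I}]$

Convenient Refinement: $\mathbf{L}\mathbf{L}' = (\mathbf{M}\mathbf{S})(\mathbf{M}\mathbf{S})'$ where the diagonal elements of $\mathbf{M}$ = 1, and $\mathbf{S}$ is diagonal with free positive elements. (Cholesky values)

$\mathbf{u}_i = \mathbf{M}\mathbf{S}\mathbf{v}_i$

$\mathbf{M} = \mathbf{I}$ returns the original uncorrelated case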
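The factorization Σ = LL′ = (MS)(MS)′ can be sketched without any external libraries. The Cholesky routine below is the textbook algorithm for a lower triangular factor; the function names are illustrative assumptions.

```python
import math

def cholesky(sigma):
    """Lower triangular L with sigma = L L' (sigma symmetric positive definite)."""
    n = len(sigma)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(sigma[i][i] - s)
            else:
                low[i][j] = (sigma[i][j] - s) / low[j][j]
    return low

def split_ms(low):
    """Write L = M S with unit diagonal M and diagonal S (returned as a list)."""
    n = len(low)
    s = [low[k][k] for k in range(n)]
    m = [[low[i][j] / s[j] for j in range(n)] for i in range(n)]
    return m, s

def correlated_draw(m, s, v):
    """u = M S v for a vector v of independent N[0,1] draws."""
    n = len(v)
    return [sum(m[i][j] * s[j] * v[j] for j in range(n)) for i in range(n)]
```

Setting the below-diagonal elements of M to zero gives M = I, in which case `correlated_draw` simply scales each element of v by its own standard deviation, the uncorrelated case.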

Page 58: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013

S

M

Page 59: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013

MSSM

Page 60: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013

Modeling Parameter Heterogeneity

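Individual heterogeneity in the means of the parameters

$$\boldsymbol{\beta}_i = \boldsymbol{\beta} + \boldsymbol{\Delta}\mathbf{z}_i + \mathbf{u}_i, \quad E[\mathbf{u}_i \mid \mathbf{X}_i, \mathbf{z}_i] = \mathbf{0}$$

Heterogeneity in the variances of the parameters

$$\text{Var}[u_{i,k} \mid \text{data}] = \sigma_k^2 \exp(\mathbf{h}_i'\boldsymbol{\delta}_k)$$

Estimation by maximum simulated likelihood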

Page 61: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013

Hierarchical Probit Model

Uit = β1i + β2i Ageit + β3i Educit + β4i Incomeit + εit.

β1i = β1 + δ11 Femalei + δ12 Marriedi + u1i
β2i = β2 + δ21 Femalei + δ22 Marriedi + u2i
β3i = β3 + δ31 Femalei + δ32 Marriedi + u3i
β4i = β4 + δ41 Femalei + δ42 Marriedi + u4i

Yit = 1[Uit > 0]

All random variables normally distributed.

Page 62: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013
Page 63: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013

Simulating Conditional Means for Individual Parameters

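$$\hat{E}(\boldsymbol{\beta}_i \mid \mathbf{y}_i, \mathbf{X}_i) = \frac{\dfrac{1}{R} \sum_{r=1}^{R} (\hat{\boldsymbol{\beta}} + \hat{\mathbf{L}}\mathbf{w}_{i,r}) \prod_{t=1}^{T_i} \Phi\!\left[(2y_{it} - 1)(\hat{\boldsymbol{\beta}} + \hat{\mathbf{L}}\mathbf{w}_{i,r})'\mathbf{x}_{it}\right]}{\dfrac{1}{R} \sum_{r=1}^{R} \prod_{t=1}^{T_i} \Phi\!\left[(2y_{it} - 1)(\hat{\boldsymbol{\beta}} + \hat{\mathbf{L}}\mathbf{w}_{i,r})'\mathbf{x}_{it}\right]}$$

$$= \frac{1}{R} \sum_{r=1}^{R} \widehat{\text{Weight}}_{ir}\, \hat{\boldsymbol{\beta}}_{ir}$$

Posterior estimates of E[parameters(i) | Data(i)]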
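The simulated conditional mean for one individual in the random parameters probit can be sketched as a likelihood-weighted average of simulated parameter vectors. The function name and the data layout are assumptions; `beta_hat` and `l_hat` stand for the estimated mean vector and Cholesky factor, and `draws` is a list of R vectors of standard normal draws.

```python
from statistics import NormalDist

Phi = NormalDist().cdf   # standard normal CDF

def conditional_mean_beta(ys, xs, beta_hat, l_hat, draws):
    """Simulated E[beta_i | y_i, X_i] for one individual.

    ys : 0/1 outcomes; xs : list of regressor vectors x_it
    Each draw w_r gives beta_ir = beta_hat + L_hat w_r, weighted by its likelihood.
    """
    k = len(beta_hat)
    numer = [0.0] * k
    denom = 0.0
    for w in draws:
        beta_r = [beta_hat[a] + sum(l_hat[a][b] * w[b] for b in range(k))
                  for a in range(k)]
        lik = 1.0
        for y, x in zip(ys, xs):
            index = sum(b * xv for b, xv in zip(beta_r, x))
            lik *= Phi((2 * y - 1) * index)
        denom += lik
        for a in range(k):
            numer[a] += lik * beta_r[a]
    return [v / denom for v in numer]
```

The 1/R factors in the numerator and denominator cancel, so the code works with plain sums. Draws that make the observed outcomes more likely receive more weight, which pulls the estimate away from the population mean toward values consistent with that individual's data.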

Page 64: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013

Probit

Page 65: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013
Page 66: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013
Page 67: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013

“Individual Coefficients”

Page 68: Empirical Methods for  Microeconomic Applications University of Lugano, Switzerland May 27-31, 2013

Mixed Model Estimation

• WinBUGS:
  • MCMC
  • User specifies the model – constructs the Gibbs Sampler/Metropolis Hastings
• MLwiN:
  • Linear and some nonlinear – logit, Poisson, etc.
  • Uses MCMC for MLE (noninformative priors)
• SAS: Proc Mixed.
  • Classical
  • Uses primarily a kind of GLS/GMM (method of moments algorithm for loglinear models)
• Stata: Classical
  • Several loglinear models – GLLAMM. Mixing done by quadrature.
  • Maximum simulated likelihood for multinomial choice (Arne Hole, user provided)
• LIMDEP/NLOGIT
  • Classical
  • Mixing done by Monte Carlo integration – maximum simulated likelihood
  • Numerous linear, nonlinear, loglinear models
• Ken Train’s Gauss Code
  • Monte Carlo integration
  • Mixed Logit (mixed multinomial logit) model only (but free!)
• Biogeme
  • Multinomial choice models
  • Many experimental models (developer’s hobby)

Programs differ on the models fitted, the algorithms, the paradigm, and the extensions provided to the simplest RPM, βi = β + wi.