
Error Component models

Ric Scarpa

Prepared for the Choice Modelling Workshop

1st and 2nd of May

Brisbane Powerhouse, New Farm, Brisbane

Presentation structure

• The basic MNL model

• Types of heteroskedasticity in logit models

• Structure of error components

• Estimation

• Applications in env. economics
– Flexible substitution patterns
– Choice modeling

• Future perspectives (debate)

ML – RUM Specification

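• The utility from individual i choosing alternative j is given by:

$$U_{ij} = V(x_{ij}, \beta) + \varepsilon_{ij}, \quad \text{assume scale } \lambda = 1 \text{ and linearity}$$

$$U_{ij} = \beta' x_{ij} + \varepsilon_{ij}, \quad j = 1, \ldots, J$$

Assume the error is Gumbel:

$$\varepsilon_{ij} \sim \text{iid extreme value}$$

i.e., $\varepsilon_{ij}$ has pdf and cdf, respectively, of

$$f(\varepsilon_{ij}) = \exp(-\varepsilon_{ij}) \exp[-\exp(-\varepsilon_{ij})]$$

$$F(\varepsilon_{ij}) = \exp[-\exp(-\varepsilon_{ij})]$$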

ML Choice Probabilities

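• Given the distributional assumptions and representative agent specification, then defining

$$y_{ij} = \begin{cases} 1 & \text{if } U_{ij} > U_{ik} \ \forall k \neq j \\ 0 & \text{otherwise} \end{cases}$$

we have that:

$$P_{ij} = \Pr(y_{ij} = 1 \mid x, \beta) = \Pr(U_{ij} > U_{ik} \ \forall k \neq j \mid x, \beta)$$

$$= \Pr(\varepsilon_{ik} < \varepsilon_{ij} + V_{ij} - V_{ik} \ \forall k \neq j \mid x, \beta)$$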

ML Choice Probabilities (cont’d)

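$$P_{ij} \mid \varepsilon_{ij} = \Pr(\varepsilon_{ik} < \varepsilon_{ij} + V_{ij} - V_{ik} \ \forall k \neq j \mid x, \beta, \varepsilon_{ij})$$

Thus, we have the conditional choice probability:

$$P_{ij} \mid \varepsilon_{ij} = \prod_{k \neq j} \exp\{-\exp[-(\varepsilon_{ij} + V_{ij} - V_{ik})]\}$$

Taking the expectation of this with respect to $\varepsilon_{ij}$ yields the unconditional choice probability:

$$P_{ij} = \int_{-\infty}^{\infty} (P_{ij} \mid s) f(s) \, ds = \int_{-\infty}^{\infty} \prod_{k \neq j} \exp\{-\exp[-(s + V_{ij} - V_{ik})]\} \exp(-s) \exp[-\exp(-s)] \, ds$$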

ML Choice Probabilities (cont’d)

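$$P_{ij} = \int_{-\infty}^{\infty} \prod_{k \neq j} \exp\{-\exp[-(s + V_{ij} - V_{ik})]\} \exp(-s) \exp[-\exp(-s)] \, ds$$

Since the term for k = j equals $\exp[-\exp(-s)]$, it can be absorbed into the product, which then runs over all k:

$$= \int_{-\infty}^{\infty} \exp\Big\{-\sum_{k} \exp[-(s + V_{ij} - V_{ik})]\Big\} \exp(-s) \, ds$$

$$= \int_{-\infty}^{\infty} \exp\Big\{-\exp(-s) \sum_{k} \exp[-(V_{ij} - V_{ik})]\Big\} \exp(-s) \, ds$$

Consider a change of variables

$$t = \exp(-s) \ \Rightarrow \ dt = -\exp(-s) \, ds$$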

ML Choice Probabilities (cont’d)

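$$P_{ij} = \int_{-\infty}^{\infty} \exp\Big\{-\exp(-s) \sum_{k} \exp[-(V_{ij} - V_{ik})]\Big\} \exp(-s) \, ds$$

$$= \int_{0}^{\infty} \exp\Big\{-t \sum_{k} \exp[-(V_{ij} - V_{ik})]\Big\} \, dt$$

$$= \left. \frac{\exp\big\{-t \sum_{k} \exp[-(V_{ij} - V_{ik})]\big\}}{-\sum_{k} \exp[-(V_{ij} - V_{ik})]} \right|_{0}^{\infty}$$

$$= \frac{1}{\sum_{k} \exp[-(V_{ij} - V_{ik})]}$$

$$= \frac{\exp(V_{ij})}{\sum_{k} \exp(V_{ik})}$$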
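The closed form above can be checked numerically. The following is a minimal sketch, assuming three alternatives with made-up representative utilities (the values in `v` are hypothetical): it draws iid standard Gumbel errors, picks the utility-maximising alternative in each draw, and compares the simulated choice shares with $\exp(V_{ij}) / \sum_k \exp(V_{ik})$.

```python
import numpy as np

def mnl_probabilities(v):
    """Closed-form logit probabilities exp(V_j) / sum_k exp(V_k)."""
    v = np.asarray(v, dtype=float)
    e = np.exp(v - v.max())  # subtracting the max avoids overflow
    return e / e.sum()

rng = np.random.default_rng(0)
v = np.array([0.5, 0.0, -1.0])           # hypothetical representative utilities
draws = 200_000
eps = rng.gumbel(size=(draws, v.size))   # iid standard Gumbel errors
choices = np.argmax(v + eps, axis=1)     # utility-maximising alternative per draw
simulated = np.bincount(choices, minlength=v.size) / draws

closed_form = mnl_probabilities(v)
print("closed form:", closed_form)
print("simulated:  ", simulated)
```

With this many draws the two sets of shares should agree to about two decimal places.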

Merits of ML Specification

• The log-likelihood function is globally concave in its parameters (McFadden, 1973)

• Choice probabilities lie strictly within the unit interval and sum to one

• The log-likelihood function has a relatively simple form

$$L(\beta; y, x) = \sum_{i=1}^{n} \sum_{j=1}^{J} y_{ij} \ln P_{ij}$$

$$= -\sum_{i=1}^{n} \sum_{j=1}^{J} y_{ij} \ln \Big\{ \sum_{k} \exp[-(V_{ij} - V_{ik})] \Big\}$$
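As a rough illustration of how simple this log-likelihood is to evaluate, here is a sketch assuming linear utilities $V_{ij} = \beta' x_{ij}$, attributes stored in an array of shape (n, J, K), and choices stored as one-hot rows. The function name, array layout, and data values are assumptions made for the example.

```python
import numpy as np

def mnl_log_likelihood(beta, x, y):
    """Log-likelihood sum_i sum_j y_ij * ln P_ij for a linear-in-parameters logit.

    x: (n, J, K) attributes, y: (n, J) one-hot choices, beta: (K,) coefficients.
    """
    v = x @ beta                                   # (n, J) representative utilities
    v = v - v.max(axis=1, keepdims=True)           # stabilise the exponentials
    log_p = v - np.log(np.exp(v).sum(axis=1, keepdims=True))
    return float((y * log_p).sum())

# Two hypothetical individuals, three alternatives, two attributes
x = np.array([[[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]],
              [[0.5, 1.0], [1.0, 0.5], [0.0, 0.0]]])
y = np.array([[1, 0, 0], [0, 1, 0]])

ll_zero = mnl_log_likelihood(np.zeros(2), x, y)    # all probabilities equal 1/3
ll_other = mnl_log_likelihood(np.array([1.0, -1.0]), x, y)
print(ll_zero, ll_other)
```

At $\beta = 0$ every alternative has probability 1/3, so the value should equal $2 \ln(1/3)$, which gives a quick sanity check before passing the function to an optimiser.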

Utility Variance in ML Specifications

• Assumes that the unobserved sources of heterogeneity are independently and identically distributed across individuals and alternatives; i.e.,

$$\operatorname{Var}(U_i \mid X, \beta, \lambda) = \operatorname{Var}(\varepsilon_i \mid X, \beta, \lambda) = \sigma^2 I$$

where $U_i = (U_{i1}, \ldots, U_{iJ})'$ and $\sigma^2 = \dfrac{\pi^2}{6 \lambda^2}$

• Dependent on $\lambda$, but basically homoskedastic in most applications

• This is a problem, as it leads to biased estimates if the variance of utilities actually varies in real life, which is a likely phenomenon

• Because the effect is multiplicative, the bias is likely to be large

Scale heteroskedasticity

…or Gumbel error heteroskedasticity

• SP/RP joint response analysis allowed for minimal heteroskedasticity (variance switch from SP to RP): $\lambda_i = \exp(\gamma \times 1_i(\text{RP}))$

• Choice complexity work introduced $\lambda_i = \exp(\gamma' z_i)$, where $z_i$ is a measure of the complexity of choice context i

• Respondent cognitive effort: $\lambda_n = \exp(\gamma' s_n)$, where $s_n$ is a measure of the cognitive ability of respondent n
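The SP/RP variance switch can be sketched in a few lines. This assumes the scale multiplies representative utility inside the logit formula, so that the error variance is $\pi^2 / (6 \lambda^2)$; the coefficient `gamma` and the utilities in `v` are hypothetical values chosen only for illustration.

```python
import numpy as np

def scaled_logit(v, lam):
    """Logit probabilities with scale lam: exp(lam*V_j) / sum_k exp(lam*V_k)."""
    z = lam * np.asarray(v, dtype=float)
    e = np.exp(z - z.max())
    return e / e.sum()

gamma = 0.7                      # hypothetical scale coefficient
v = np.array([1.0, 0.0, -0.5])   # hypothetical representative utilities

p_sp = scaled_logit(v, np.exp(gamma * 0))   # SP data: indicator 0, scale 1
p_rp = scaled_logit(v, np.exp(gamma * 1))   # RP data: indicator 1, scale exp(gamma)

for label, lam, p in (("SP", 1.0, p_sp), ("RP", np.exp(gamma), p_rp)):
    error_variance = np.pi**2 / (6 * lam**2)
    print(label, round(error_variance, 3), p)
```

A larger scale means a smaller error variance, so the same utilities produce choice probabilities that are more concentrated on the best alternative.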

Scale Het. limitations

• While scale heteroskedasticity allows the treatment of heteroskedasticity in the choice-respondent context, it does not allow heteroskedasticity across utilities in the same choice context

• People may inherently associate more utility variance with less familiar alternatives (e.g. unknown destinations, hypothetical alternatives) than with better known ones (e.g. frequently attended sites, status quo option)

Mixed logit

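• The mixed logit model is defined as any model whose choice probabilities can be expressed as

$$P_{ij} = \int L_{ij}(\beta) f(\beta \mid \theta) \, d\beta$$

where $L_{ij}(\beta)$ is a logit choice probability; i.e.,

$$L_{ij}(\beta) = \frac{\exp[V_{ij}(\beta)]}{\sum_{k=1}^{J} \exp[V_{ik}(\beta)]}$$

and $f(\beta \mid \theta)$ is the density function for $\beta$, with underlying parameters $\theta$

$V_{ij}(\beta)$ denotes the representative utility function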

Special Cases

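• Case #1: MNL results if the density function $f(\beta \mid \theta)$ is degenerate; i.e.,

$$f(\beta \mid \theta) = \begin{cases} 1 & \beta = b \\ 0 & \beta \neq b \end{cases}$$

$$P_{ij} = L_{ij}(b) = \frac{\exp[V_{ij}(b)]}{\sum_{k=1}^{J} \exp[V_{ik}(b)]}$$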

Special Cases

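• Case #2: Finite mixture logit model results if the density function $f(\beta \mid \theta)$ is discrete; i.e.,

$$f(\beta \mid \theta) = \begin{cases} s_m & \beta = b_m; \ m = 1, \ldots, M \\ 0 & \text{otherwise} \end{cases}$$

$$P_{ij} = \sum_{m=1}^{M} s_m L_{ij}(b_m) = \sum_{m=1}^{M} s_m \frac{\exp[V_{ij}(b_m)]}{\sum_{k=1}^{J} \exp[V_{ik}(b_m)]}$$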
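The finite mixture case is easy to compute directly. The sketch below assumes linear utilities, two latent classes, and made-up attribute values, class coefficients, and class shares; all of them are hypothetical.

```python
import numpy as np

def logit(v):
    """Logit probabilities for one vector of representative utilities."""
    e = np.exp(v - v.max())
    return e / e.sum()

x = np.array([[1.0, 2.0], [2.0, 1.0], [0.0, 0.0]])   # (J, K) hypothetical attributes
b = np.array([[1.0, -0.5], [-0.2, 0.8]])              # (M, K) class coefficients b_m
s = np.array([0.6, 0.4])                               # class shares s_m, summing to one

class_probs = np.array([logit(x @ b_m) for b_m in b])  # (M, J): L_ij(b_m)
p = s @ class_probs                                    # P_ij = sum_m s_m * L_ij(b_m)
print(class_probs)
print(p)
```

Each row of `class_probs` is an ordinary logit, and the mixture simply weights the rows by the class shares.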

Notes on Mixed Logit (MXL)

• Train emphasizes two interpretations of the MXL model
– Random parameters (variation of taste intensities)
– Error components (heteroskedastic utilities)

• Mixed logit probabilities are simply a weighted average of logit probabilities, with weights given by $f(\beta \mid \theta)$

• The goal of the research is to estimate the underlying parameter vector $\theta$

Simulation Estimation

• Simulation methods are typically used to estimate mixed logit models

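• Recall that the choice probabilities are given by

$$P_{ij} = \int L_{ij}(\beta) f(\beta \mid \theta) \, d\beta$$

where

$$L_{ij}(\beta) = \frac{\exp[V_{ij}(\beta)]}{\sum_{k=1}^{J} \exp[V_{ik}(\beta)]}$$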

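Simulation Estimation (cont’d)

• For any given value of $\theta$, one can generate draws $\beta_i^r, \ r = 1, \ldots, R$, from $f(\beta \mid \theta)$

which can then be used to compute

$$P_{ij}^{R} = \frac{1}{R} \sum_{r=1}^{R} L_{ij}(\beta_i^r) = \frac{1}{R} \sum_{r=1}^{R} \frac{\exp[V_{ij}(\beta_i^r)]}{\sum_{k=1}^{J} \exp[V_{ik}(\beta_i^r)]}$$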
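A minimal sketch of this simulator follows, assuming linear utilities and an independent normal mixing density $\beta \sim N(\text{mean}, \operatorname{diag}(\text{sd}^2))$. The normal density, the function name, and the numbers are assumptions made for the example, not something the slides prescribe.

```python
import numpy as np

def simulated_probabilities(x, mean, sd, R=1000, seed=0):
    """Average logit probabilities over R draws of beta.

    x: (J, K) attributes; mean, sd: (K,) parameters of the assumed normal density.
    """
    rng = np.random.default_rng(seed)
    beta = mean + sd * rng.standard_normal((R, len(mean)))   # (R, K) draws beta^r
    v = beta @ x.T                                           # (R, J) utilities per draw
    v -= v.max(axis=1, keepdims=True)
    L = np.exp(v)
    L /= L.sum(axis=1, keepdims=True)                        # L_ij(beta^r) for each draw
    return L.mean(axis=0)                                    # (1/R) * sum_r L_ij(beta^r)

x = np.array([[1.0, 2.0], [2.0, 1.0], [0.0, 0.0]])           # hypothetical attributes
mean = np.array([0.5, -0.3])
p_mixed = simulated_probabilities(x, mean, np.array([0.4, 0.2]))
p_fixed = simulated_probabilities(x, mean, np.zeros(2))      # degenerate density
print(p_mixed, p_fixed)
```

Setting the standard deviations to zero collapses the density onto its mean, so `p_fixed` reproduces the plain logit probabilities, in line with Case #1.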

Simulation Estimation

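$$L(\theta) = \sum_{i=1}^{N} \ln P_{ij}^{R}$$

where j denotes the alternative chosen by individual i

• The simulated log-likelihood for the panel of T choices becomes:

$$L(\theta) = \sum_{i=1}^{N} \ln \left[ \frac{1}{R} \sum_{r=1}^{R} \prod_{t=1}^{T} \frac{\exp[V_{ijt}(\beta_i^r)]}{\sum_{k=1}^{J} \exp[V_{ikt}(\beta_i^r)]} \right]$$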
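The panel version can be sketched as follows, again assuming linear utilities and an independent normal mixing density. The array layout (N individuals, T choice occasions, J alternatives, K attributes) and all names are assumptions for the example. The key point is that each draw $\beta_i^r$ is held fixed across the T choices of the same individual, so the product over t is taken inside the average over draws.

```python
import numpy as np

def panel_simulated_loglik(x, chosen, mean, sd, R=500, seed=0):
    """Simulated log-likelihood for panel data.

    x: (N, T, J, K) attributes; chosen: (N, T) integer index of the chosen alternative.
    """
    rng = np.random.default_rng(seed)
    N, T, J, K = x.shape
    beta = mean + sd * rng.standard_normal((N, R, K))       # one set of draws per person
    v = np.einsum('ntjk,nrk->nrtj', x, beta)                # (N, R, T, J) utilities
    v -= v.max(axis=3, keepdims=True)
    log_L = v - np.log(np.exp(v).sum(axis=3, keepdims=True))
    idx = np.broadcast_to(chosen[:, None, :, None], (N, R, T, 1))
    log_chosen = np.take_along_axis(log_L, idx, axis=3)[..., 0]   # (N, R, T)
    seq_prob = np.exp(log_chosen.sum(axis=2))               # product over t, per draw
    return float(np.log(seq_prob.mean(axis=1)).sum())       # log of average, summed over i

rng = np.random.default_rng(1)
x = rng.normal(size=(2, 3, 4, 2))                           # hypothetical data
chosen = np.zeros((2, 3), dtype=int)
ll_null = panel_simulated_loglik(x, chosen, np.zeros(2), np.zeros(2))
ll_mixed = panel_simulated_loglik(x, chosen, np.array([0.5, -0.3]), np.array([0.4, 0.2]))
print(ll_null, ll_mixed)
```

With all coefficients fixed at zero every choice has probability 1/4, so `ll_null` should equal $2 \times 3 \times \ln(1/4)$.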

Error Components Interpretation

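• The mixed logit model is generated in the RUM model by assuming that

$$U_{ij} = V(x_{ij}, \beta) + \eta_{ij}$$

where

$$\eta_{ij} = \mu_i' z_{ij} + \varepsilon_{ij}$$

with $x_{ij}$ and $z_{ij}$ both observed,

$$\varepsilon_{ij} \sim \text{iid EV}$$

and $E(\mu_i) = 0$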

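Error Components Interpretation (cont’d)

• The error components perspective views the additional random terms as tools for inducing specific patterns of correlation across alternatives.

$$\operatorname{Cov}(U_{ij}, U_{ik}) = E\big[(\mu_i' z_{ij} + \varepsilon_{ij})(\mu_i' z_{ik} + \varepsilon_{ik})\big] = z_{ij}' \, \Omega \, z_{ik}, \quad j \neq k$$

where

$$\Omega = \operatorname{Var}(\mu_i)$$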

Example – Mimicking NL

• Consider a nesting structure

Stay at home (j=0)

Take a trip
– Nest A: alternatives 1, 2
– Nest B: alternatives 3, 4

Example (cont’d)

The corresponding correlation structure among error components (and utilities) is given by

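$$\begin{pmatrix} a & 0 & 0 & 0 & 0 \\ 0 & b & c & f & f \\ 0 & c & b & f & f \\ 0 & f & f & d & e \\ 0 & f & f & e & d \end{pmatrix}$$

where $c \geq f$ and $e \geq f$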

Example (cont’d)

• We can build up this covariance structure using error components

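$$U_{ij} = \begin{cases} \beta' x_{ij} + \varepsilon_{ij} & j = 0 \\ \beta' x_{ij} + \mu_i + \mu_i^{12} + \varepsilon_{ij} & j = 1, 2 \\ \beta' x_{ij} + \mu_i + \mu_i^{34} + \varepsilon_{ij} & j = 3, 4 \end{cases}$$

with

$$\varepsilon_{ij} \sim \text{iid EV}, \quad \mu_i \sim N(0, \sigma^2), \quad \mu_i^{12} \sim N(0, \sigma_{1,2}^2), \quad \mu_i^{34} \sim N(0, \sigma_{3,4}^2)$$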

Example (cont’d)

• The resulting covariance structure becomes

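$$\operatorname{Var}(U_i) = \begin{pmatrix} \frac{\pi^2}{6} & 0 & 0 & 0 & 0 \\ 0 & \sigma^2 + \sigma_{1,2}^2 + \frac{\pi^2}{6} & \sigma^2 + \sigma_{1,2}^2 & \sigma^2 & \sigma^2 \\ 0 & \sigma^2 + \sigma_{1,2}^2 & \sigma^2 + \sigma_{1,2}^2 + \frac{\pi^2}{6} & \sigma^2 & \sigma^2 \\ 0 & \sigma^2 & \sigma^2 & \sigma^2 + \sigma_{3,4}^2 + \frac{\pi^2}{6} & \sigma^2 + \sigma_{3,4}^2 \\ 0 & \sigma^2 & \sigma^2 & \sigma^2 + \sigma_{3,4}^2 & \sigma^2 + \sigma_{3,4}^2 + \frac{\pi^2}{6} \end{pmatrix}$$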
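The covariance structure of the nesting example can be built mechanically from a loading matrix. The sketch below assumes the three error components are mutually independent, that the Gumbel errors have unit scale (variance $\pi^2/6$), and uses made-up standard deviations; the matrix name `Z` and the numbers are assumptions for illustration.

```python
import numpy as np

sigma, sigma_12, sigma_34 = 1.0, 0.5, 0.8     # hypothetical standard deviations

# Rows: alternatives j = 0..4; columns: components (mu, mu_12, mu_34).
# A 1 means the component enters the utility of that alternative.
Z = np.array([[0, 0, 0],
              [1, 1, 0],
              [1, 1, 0],
              [1, 0, 1],
              [1, 0, 1]], dtype=float)

Omega = np.diag([sigma**2, sigma_12**2, sigma_34**2])    # independent components
cov_U = Z @ Omega @ Z.T + (np.pi**2 / 6) * np.eye(5)     # add the iid Gumbel variance
print(np.round(cov_U, 3))
```

Alternatives in the same nest share both the trip component and the nest component, so their covariance is larger than the covariance between alternatives in different nests, which comes from the trip component alone.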

Example (cont’d)

• One limitation of the NL model is that one has to fix the nesting structure

• MXL can be used to create overlapping nests

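$$U_{ij} = \begin{cases} \beta' x_{i0} + \varepsilon_{i0} & j = 0 \\ \beta' x_{i1} + \mu_i + \mu_i^{12} + \mu_i^{13} + \mu_i^{14} + \varepsilon_{i1} & j = 1 \\ \beta' x_{i2} + \mu_i + \mu_i^{12} + \mu_i^{23} + \mu_i^{24} + \varepsilon_{i2} & j = 2 \\ \beta' x_{i3} + \mu_i + \mu_i^{34} + \mu_i^{13} + \mu_i^{23} + \varepsilon_{i3} & j = 3 \\ \beta' x_{i4} + \mu_i + \mu_i^{34} + \mu_i^{14} + \mu_i^{24} + \varepsilon_{i4} & j = 4 \end{cases}$$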
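The overlapping-nest specification amounts to one common trip component plus one component for every pair of trip alternatives. As a sketch, the loading matrix can be generated programmatically; the unit standard deviations below are hypothetical, and the components are assumed independent.

```python
import numpy as np
from itertools import combinations

trip_alts = [1, 2, 3, 4]
pairs = list(combinations(trip_alts, 2))      # (1,2), (1,3), (1,4), (2,3), (2,4), (3,4)

Z = np.zeros((5, 1 + len(pairs)))             # rows: j = 0..4; columns: components
Z[1:, 0] = 1.0                                # common trip component mu_i
for col, (a, b) in enumerate(pairs, start=1):
    Z[a, col] = 1.0                           # pair component enters alternative a
    Z[b, col] = 1.0                           # and alternative b

sd = np.ones(Z.shape[1])                      # hypothetical standard deviations
cov_components = Z @ np.diag(sd**2) @ Z.T     # covariance induced by the components
print(pairs)
print(cov_components)
```

Every pair of trip alternatives now has its own covariance parameter, which is what lets the data decide on the nesting pattern instead of fixing it in advance.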

Herriges and Phaneuf (2002) Covariance Pattern

             1.88    2.14
(1,2)        --      1.30
(1,3)        --      0.64
(1,4)        1.61    1.09
(1,5)        --      0.93
(2,3)        --      0.58
(2,3,5)      1.61
(2,4)        --      1.72
(2,5)        --      0.08
(3,4)        --      0.46
(3,5)        --      0.35
(4,5)        --      0.56

Implications for Elasticity Patterns

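• In general, elasticities are given by

$$\eta^{s}_{jk}(x_{ij}) = \frac{\partial P_j(x_{ij})}{\partial x^{s}_{ik}} \cdot \frac{x^{s}_{ik}}{P_j(x_{ij})}$$

$$= \int \frac{\partial L_{ij}(x_{ij}, \beta)}{\partial x^{s}_{ik}} f(\beta \mid \theta) \, d\beta \cdot \frac{x^{s}_{ik}}{P_j(x_{ij})}$$

$$= \int \frac{\partial L_{ij}(x_{ij}, \beta)}{\partial x^{s}_{ik}} \cdot \frac{x^{s}_{ik}}{P_j(x_{ij})} f(\beta \mid \theta) \, d\beta$$

$$= \int w_j(x_{ij}, \beta) \, \eta^{L,s}_{jk}(x_{ij}, \beta) \, f(\beta \mid \theta) \, d\beta$$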

Implications for Elasticity Patterns (cont’d)

where

$$\eta^{L,s}_{jk}(x_{ij}, \beta) = \big[\delta_{jk} - L_{ik}(\beta)\big] \beta_s x^{s}_{ik}$$

denotes the standard logit response elasticity (i.e., without nesting) conditional on a specific draw of the vector $\beta$

and

$$w_j(x_{ij}, \beta) = \frac{L_{ij}(x_{ij}, \beta)}{P_j(x_{ij})}$$

denotes the relative odds that alternative j is selected (i.e., conditional versus unconditional odds)
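The weighted-average form of the elasticity can be verified numerically. The sketch below assumes linear utilities, normally distributed coefficients approximated by a fixed set of draws, and made-up attribute values (all hypothetical). It computes the elasticity as the average of the weights times the conditional logit elasticities, then compares it with a finite-difference elasticity of the simulated probability.

```python
import numpy as np

def logit_rows(v):
    """Row-wise logit probabilities."""
    e = np.exp(v - v.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
x = np.array([[1.0, 2.0], [2.0, 1.0], [1.5, 1.5]])        # (J, K) hypothetical attributes
beta = np.array([0.5, -0.3]) + 0.4 * rng.standard_normal((2000, 2))   # fixed draws

def mixed_prob(x_mat):
    """Simulated mixed logit probabilities using the fixed draws."""
    return logit_rows(beta @ x_mat.T).mean(axis=0)

j, k, s = 0, 1, 0                   # elasticity of P_j with respect to attribute s of alternative k
L = logit_rows(beta @ x.T)          # (R, J) conditional logit probabilities
P = L.mean(axis=0)                  # unconditional probabilities
w = L[:, j] / P[j]                  # relative odds w_j for each draw
eta_logit = (float(j == k) - L[:, k]) * beta[:, s] * x[k, s]   # conditional logit elasticity
eta_mixed = (w * eta_logit).mean()  # weighted average over draws

h = 1e-6                            # finite-difference check on the same draws
x_up = x.copy()
x_up[k, s] += h
eta_fd = (mixed_prob(x_up)[j] - P[j]) / h * x[k, s] / P[j]
print(eta_mixed, eta_fd)
```

Because the weights differ across draws, the mixed logit cross-elasticities are not restricted to the proportional substitution pattern of a single logit.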

Illustration – Choice Probabilities

                    $\mu_i = -2$   $\mu_i = 0$   $\mu_i = 2$   $P_j = \int L_j f(\mu) \, d\mu$
                                                               $\sigma = 0.1$   $\sigma = 10$
$L_{i0}$            0.65           0.20          0.03          0.20             0.45
$L_{ij}; \ j \neq 0$   0.09        0.20          0.24          0.20             0.14
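The figures in this illustration can be reproduced under a few assumptions: five alternatives with equal representative utility, one normal error component $\mu_i$ shared by the four alternatives other than j = 0, and 0.1 and 10 read as the standard deviation of that component. The sketch below computes the conditional logit probabilities at given values of $\mu_i$ and integrates over the normal density with a simple Riemann sum; function names and the grid are choices made for the example.

```python
import numpy as np

def probs_given_mu(mu, n_trip=4):
    """Logit probabilities when all V are equal and mu shifts the n_trip alternatives."""
    denom = 1.0 + n_trip * np.exp(mu)
    return 1.0 / denom, np.exp(mu) / denom        # (L_i0, L_ij for j != 0)

def mixed_probs(sigma, n_trip=4, grid=200_001):
    """Integrate the conditional probabilities over mu ~ N(0, sigma^2)."""
    z = np.linspace(-8.0, 8.0, grid)              # grid for a standard normal variable
    pdf = np.exp(-0.5 * z**2) / np.sqrt(2 * np.pi)
    L0, _ = probs_given_mu(sigma * z, n_trip)
    P0 = np.sum(L0 * pdf) * (z[1] - z[0])         # Riemann-sum approximation
    return P0, (1.0 - P0) / n_trip

for mu in (-2.0, 0.0, 2.0):
    print(mu, [round(float(p), 2) for p in probs_given_mu(mu)])
for sigma in (0.1, 10.0):
    print(sigma, [round(float(p), 2) for p in mixed_probs(sigma)])
```

With a small standard deviation the mixed probabilities stay close to the equal-share logit, while a large one shifts probability toward the alternative without the error component.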

Choice modeling

• Error component in hypothetical alternatives, yet absent in the SQ or no alternative

The induced variance structure across utilities is:
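One way to write such a structure, as a sketch that assumes two hypothetical alternatives sharing a single normal error component $\mu_i \sim N(0, \sigma^2)$, a status quo alternative (listed last) without it, and iid Gumbel errors with unit scale:

$$\operatorname{Var}(U_i) = \begin{pmatrix} \sigma^2 + \frac{\pi^2}{6} & \sigma^2 & 0 \\ \sigma^2 & \sigma^2 + \frac{\pi^2}{6} & 0 \\ 0 & 0 & \frac{\pi^2}{6} \end{pmatrix}$$

Under these assumptions the hypothetical alternatives have a larger utility variance than the status quo and are positively correlated with each other, while remaining uncorrelated with the status quo.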

Effect

• Fairly general result that it improves fit while requiring few additional parameters (only the standard deviation of the error component)

• It can be decomposed by socio-economic covariates (e.g. spread of error varies across segments of respondents)

Adoption and state of practice

• Error component estimators have now been incorporated in commercial software (e.g. Nlogit 4)

• Given their properties and the flexibility they afford they are likely to be increasingly used in practice

