multinomial models for discrete outcomes

8/22/2019 Multinomial Models for Discrete Outcomes

1/8


people.tamu.edu/~b-wood/Maximum Likelihood/STATA7.htm

Multinomial Models for Discrete Outcomes

/* This file estimates models based on the multinomial distribution. A common mistake is to estimate

naturally ordered data with MNL or MNP. First, we estimate a multinomial logit (MNL)

for data on brand of a product chosen by consumers. Brand 1 is coded 1, brand 2 is coded 2, and brand 3 is

coded 3.*/

/*Using this file requires installation of dmlogit2 and the Hausman and Small-Hsiao IIA modules. It also uses

SPOST which you should have already installed. Run a net search from within STATA on the following

commands and install the proper files:

dmlogit2

iia

smhsiao

*/

/*Let's load the data*/clear

set more off

use "c:\users\wood\documents\My Teaching\Maximum Likelihood\Data\brand.dta", clear

summarize

/*Multinomial Logit */

mlogit brand female age, base(1)

/* The output from the above has two parts, labeled with the categories of the outcome variable brand. Theycorrespond to

two equations:

log(P(brand=2)/P(brand=1)) = b_10 + b_11*female + b_12*age

log(P(brand=3)/P(brand=1)) = b_20 + b_21*female + b_22*age,

with the b's being the raw regression coefficients from the output.

For example, we can say that for one unit change in the variable age, the log of the ratio of the two probabilities,

P(brand=2)/P(brand=1), will be increased by 0.368, and the log of the ratio of the two probabilities

P(brand=3)/P(brand=1)

will be increased by 0.686. Therefore, we can say that, in general, the older a person is, the more he/she will

prefer

brand 2 or 3 over brand 1. */

/*The option "base(1)" sets brand 1 as the comparison group */

/* Note that MNL produces J-1 sets of coefficient estimates for J choices. Typically, these are then used to


2/8



compute

probabilities and marginal effects. */

/*Stata does not automatically report the predicted probabilities. However, we can retrieve them in the followin

manner*/

predict p1 p2 p3, p /*Calculate p(y=1) for each y */

predict z1, xb outcome(1) /*Calculate z for each y */predict z2, xb outcome(2) /*Calculate z for each y */

predict z3, xb outcome(3) /*Calculate z for each y */

list brand z1 p1 z2 p2 z3 p3 /*List y, predicted probabilities, and z */

/* We can interpret the regression result graphically. For example, we can use the three variables p1, p2 and p3

for the

predicted probabilities and plot them against a predictor variable. In the example below, we plot p1 against age

separated

by the variable female. Be patient. The graph may take some time the first time you run it.*/

sort age

line p1 age if female ==0 || line p1 age if female==1, legend(order(1 "male" 2 "female")) ytitle(P(Ford))

/*The ratio of the probability of choosing one outcome category over the probability of choosing the reference

category

is often referred as relative risk (and it is also sometimes referred as odds). So another way of interpreting the

regression results is in terms of relative risk. We can say that for one unit change in the variable age,

we expect the relative risk of choosing brand 2 over 1 to increase by exp(.3682) = 1.45. So we can saythat the relative risk is higher for older people. For a dichotomous predictor variable such as female, we can say

that the ratio of the relative risks of choosing brand 2 over 1 for female and male is exp(.5238) = 1.69.

We can use the rrr option for mlogit command to display the regression results in the language of risk.*/

mlogit, rrr

/*We can also obtain other helpful statistics*/

fitstat /*Obtain various fit statistics on the logit */

listcoef, help /*List the coefficients and risk ratios/factor changes.*/

prtab female /* Calculate the probabilities of females choosing a

particular brand.*/

prvalue, x(female=0) rest(mean) /*Compute p(brand) when female=0, and rest at mean. Should be same

as above. */

prvalue, x(female=1) rest(mean) /*Compute p(brand) when female=1, and rest at mean. Should be same

as above. */


3/8



prchange, help /*Compute various first differences */

dmlogit2 brand female age, base(1) /*The model in derivatives with all variables at means*/

/* Reestimate the model for an alternative procedure to calculate derivatives */


mfx compute, predict(outcome(2)) /*The model in derivatives switching female from zero to one*/

mfx compute, predict(outcome(3))

/*Test for Independence of Irrelevant Alternatives in MNL model with SPOST and mlogtest. By the way, this

procedure does a variety

tests, including whether categories can be combined or not. You can ask for the tests individually, but the option

all gives you

everything. Use "search mlogtest" to get a listing of the tests.*/


mlogtest, all

/* Note that omitting the category, Toyota, finds that this model violates the IIA assumption for MNL using the

Hausman test,

but not for the Small-Hsiao test. What to do? */

/*Alternatively, there are other user defined procedures for doing the Hausman and Small-Hsiao tests. */


iia

/* Notice that this user defined procedure breaks down. */

smhsiao brand female age, elim(2)

/* This user defined procedure again finds no violation when omitting Toyota.*/

/*Now let's drop the predicted probabilities*/

drop z1 p1 z2 p2 z3 p3

/* STATA also offers two versions of multinomial probit. The first one below assumes no correlation between

errors

and homoskedasticity. */

mprobit brand female age, base(1)

/* STATA also offers an ability to parameterize the sigma matrix using asmprobit. However, the data need to be

arranged as for conditional logit below. */


4/8



/* MNL is inefficient when the dependent variable is naturally ordered. Let's use Ray Fairs data on

marriage satisfaction and extramarital affairs.*/

/* Now estimate the ordered probit */

use "/users/bdanwood/documents/my teaching/maximum likelihood//Data/affairs.dta", clear

replace affairs = affairs>0summ

oprobit ratingmarriage age yrsmarried religiousness affairs

predict p1 p2 p3 p4 p5, p /*Calculate p(y=1) for each y */

predict z, xb /*Calculate z for each y */

list ratingmarriage z p1 p2 p3 p4 p5 /*List y, predicted probabilities, and z */


listcoef, help /*List the coefficients and standardized coefficients*/

prvalue, x(affairs=0) rest(mean) /*Compute p(ratingmarriage|j) when affairs=0, and rest at mean */


prchange /*Compute first differences for all variables

mfx compute, predict(outcome(1)) /*The model in derivatives for each outcome*/





/* Test the Parallel Regressions Assumption */

omodel probit ratingmarriage age yrsmarried religiousness affairs

/* Test the parallel regressions assumption of ordered probit. */

omodel probit ratingmarriage age yrsmarried religiousness affairs

/*Again, let's drop the predicted probabilities*/

drop z p1 p2 p3 p4 p5

/* It is also possible to obtain ordered logit estimates. These are rarely seen in the literature. However, the

following illustrates the

ordered logit procedure. */

ologit ratingmarriage age yrsmarried religiousness affairs

/* Note that with ordered probit there is only one set of coefficient estimates. This is more appealing, but

CAUTION is still


5/8



required in interpreting the coefficients. Ordered probit coefficients can have opposite signs from the

marginal effects. This is because increasing X, while holding the coefficient and threshold estimates constant

actually shifts

the distribution to the right. This may decrease the probability associated with particular outcomes. See Greene,

pp. 736-40 for

an example. */

/*We can also obtain predicted probabilities similar to the MNL above*/

predict p1 p2 p3 p4 p5, p /*Calculate p(y=1) for each y */

predict z, xb /*Calculate z for each y */

list ratingmarriage z p1 p2 p3 p4 p5 /*List y, predicted probabilities, and z */





prchange /*Compute first differences for all variables

mfx compute, predict(outcome(1)) /*The model in derivatives for each outcome*/





/* Test the parallel regressions assumption of ordered probit. */

omodel logit ratingmarriage age yrsmarried religiousness affairs

brant, detail

/*Again, let's drop the predicted probabilities*/

drop z p1 p2 p3 p4 p5

/*Now let's take a look at conditional logit. To do this, we will need to use a new data set. Note the

arrangement of the

data where there are j choices for each individual reflected by a different line of data.*/

use "c:\users\wood\documents\My Teaching\Maximum Likelihood\Data\clogit.dta", clear

set more off

/* This file illustrates an application of STATA's DISCRETE CHOICE procedure, conditional logit. The model

is one of choice of

mode for transportation, for a sample of individuals who travel between Sydney and Melbourne, Australia. The

four choices are


6/8



Air, Train, Bus, and Car.

Note the special set-up for the data for DISCRETE CHOICE models. The data consist of observations on

choices, rather than

on individuals. For each individual there are j possible choices, so that there are j*n observations in the data set.

Basically, the

data setup is similar to that for panel data. See the STATA help system or manual for more on setting up the data

for DISCRETECHOICE models.

Conditional Logit is commonly used when the decisionmaker chooses primarily on the basis of attributes of the

choices, rather

than attributes of the individual. This approach is illustrated in the program below.*/

/* Below we estimate a conditional logit model for choice of mode of transport using choice specific constants

and a generalized

measure of perceived cost of the method of transportation, household income, and terminal waiting time. Note

that unlike LIMDEP

we must specify the group ourselves, in this case "group(id)"*/

clogit mode aasc tasc basc gc ttme hinca, group(id)

predict prob, p /*Calculate p(y=1) for each y */

predict zi, xb /*Calculate z for each y */

list mode prob zi /*List y, predicted probabilities, and z */



/* With conditional logit as opposed to MNL, we get a single set of coefficient estimates. This is an appealing

attribute of the model.

The preceding estimates say simply that people choose their transportation mode based on perceived cost and a

set of unmeasured

attributes associated with the choices. Note that perceived cost is an attribute of the choice, rather than an

attribute of the individual.

However, it might be reasonable to assert that the decisionmaker's income is also relevant to the choice. The

decisionmaker's incomedoes not vary across the alternatives. As a result, the discrete choice probabilities are homogeneous of degree

zero in the parameters.

That bit of jargon means that if there are any attributes that are the same for all outcomes for every individual,

they drop out of the

probability model.

Thus, for example, any individual characteristic such as age, or income, will cause this problem. The only way

such variables can be


7/8



brought into the conditional logit model is by interacting them with the choice specific constants. This is similar to

analysis of

covariance in regression.*/

/*Test for IIA using conditional logit setup */

clogit mode aasc tasc basc gc ttme hinca, group(id)

est store allclogit mode aasc tasc basc gc ttme hinca if choice ~=1, group(id)

est store partial

hausman partial all, alleqs constant

/* If there is a violation of IIA, as shown here, then you can use STATA's asmprobit procedure to do

multinomial probit. This

procedure allows you to take into account different correlation structures and variances for the random utility

equations.

For example, the following sets all of the correlations equal and scales two of the variances to one.*/

asmprobit mode gc ttme, casevars(hinc) case(id) alternatives(choice) correlation(exchangeable)

/*Now lets do a Nested Logit which can avoid the problem of IIA, as well as allows modeling a tree-like

probability

model where choices are conditioned by prior choices.*/

/* First, an example of Nested Logit in which our primary interest is in estimating the determinants of the

branches. */

clear

webuse restaurant

nlogitgen type = restaurant(fast: Freebirds | MamasPizza, family: CafeEccell | LosNortenos | WingsNmore,

fancy: Christophers | MadCows)

nlogittree restaurant type, choice(chosen)

nlogit chosen cost distance rating || type: income kids, base(family) || restaurant:, noconst case(family_id)

/* Now an example using the earlier data where we are interested in both the branches and twigs. */

use "c:\users\wood\documents\My Teaching\Maximum Likelihood\Data\clogit.dta", clear

set more off

sort id

by id: gen travel=_n-1

nlogitgen type=choice(fly:Air, land: Car|Train|Bus)

nlogittree choice type, choice(mode)

nlogit mode gc ttme || type: hinc invt, base(1) || choice: psize , base(0) noconst case(id)


8/8



clear

multinomial models for discrete outcomes

Documents