multinomial models for discrete outcomes

Upload: prakash-kumar-paul

Post on 08-Aug-2018

212 views

Category:

Documents


0 download

TRANSCRIPT

  • 8/22/2019 Multinomial Models for Discrete Outcomes

    1/8

    7/29/13 Multinomial Models for Discrete Outcomes

    people.tamu.edu/~b-wood/Maximum Likelihood/STATA7.htm

    Multinomial Models for Discrete Outcomes

    /* This file estimates models based on the multinomial distribution. A common mistake is to estimate

    naturally ordered data with MNL or MNP. First, we estimate a multinomial logit (MNL)

    for data on brand of a product chosen by consumers. Brand 1 is coded 1, brand 2 is coded 2, and brand 3 is

    coded 3.*/

    /*Using this file requires installation of dmlogit2 and the Hausman and Small-Hsiao IIA modules. It also uses

    SPOST which you should have already installed. Run a net search from within STATA on the following

    commands and install the proper files:

    dmlogit2

    iia

    smhsiao

    */

    /*Let's load the data*/clear

    set more off

    use "c:\users\wood\documents\My Teaching\Maximum Likelihood\Data\brand.dta", clear

    summarize

    /*Multinomial Logit */

    mlogit brand female age, base(1)

    /* The output from the above has two parts, labeled with the categories of the outcome variable brand. Theycorrespond to

    two equations:

    log(P(brand=2)/P(brand=1)) = b_10 + b_11*female + b_12*age

    log(P(brand=3)/P(brand=1)) = b_20 + b_21*female + b_22*age,

    with the b's being the raw regression coefficients from the output.

    For example, we can say that for one unit change in the variable age, the log of the ratio of the two probabilities,

    P(brand=2)/P(brand=1), will be increased by 0.368, and the log of the ratio of the two probabilities

    P(brand=3)/P(brand=1)

    will be increased by 0.686. Therefore, we can say that, in general, the older a person is, the more he/she will

    prefer

    brand 2 or 3 over brand 1. */

    /*The option "base(1)" sets brand 1 as the comparison group */

    /* Note that MNL produces J-1 sets of coefficient estimates for J choices. Typically, these are then used to

  • 8/22/2019 Multinomial Models for Discrete Outcomes

    2/8

    7/29/13 Multinomial Models for Discrete Outcomes

    people.tamu.edu/~b-wood/Maximum Likelihood/STATA7.htm

    compute

    probabilities and marginal effects. */

    /*Stata does not automatically report the predicted probabilities. However, we can retrieve them in the followin

    manner*/

    predict p1 p2 p3, p /*Calculate p(y=1) for each y */

    predict z1, xb outcome(1) /*Calculate z for each y */predict z2, xb outcome(2) /*Calculate z for each y */

    predict z3, xb outcome(3) /*Calculate z for each y */

    list brand z1 p1 z2 p2 z3 p3 /*List y, predicted probabilities, and z */

    /* We can interpret the regression result graphically. For example, we can use the three variables p1, p2 and p3

    for the

    predicted probabilities and plot them against a predictor variable. In the example below, we plot p1 against age

    separated

    by the variable female. Be patient. The graph may take some time the first time you run it.*/

    sort age

    line p1 age if female ==0 || line p1 age if female==1, legend(order(1 "male" 2 "female")) ytitle(P(Ford))

    /*The ratio of the probability of choosing one outcome category over the probability of choosing the reference

    category

    is often referred as relative risk (and it is also sometimes referred as odds). So another way of interpreting the

    regression results is in terms of relative risk. We can say that for one unit change in the variable age,

    we expect the relative risk of choosing brand 2 over 1 to increase by exp(.3682) = 1.45. So we can saythat the relative risk is higher for older people. For a dichotomous predictor variable such as female, we can say

    that the ratio of the relative risks of choosing brand 2 over 1 for female and male is exp(.5238) = 1.69.

    We can use the rrr option for mlogit command to display the regression results in the language of risk.*/

    mlogit, rrr

    /*We can also obtain other helpful statistics*/

    fitstat /*Obtain various fit statistics on the logit */

    listcoef, help /*List the coefficients and risk ratios/factor changes.*/

    prtab female /* Calculate the probabilities of females choosing a

    particular brand.*/

    prvalue, x(female=0) rest(mean) /*Compute p(brand) when female=0, and rest at mean. Should be same

    as above. */

    prvalue, x(female=1) rest(mean) /*Compute p(brand) when female=1, and rest at mean. Should be same

    as above. */

  • 8/22/2019 Multinomial Models for Discrete Outcomes

    3/8

    7/29/13 Multinomial Models for Discrete Outcomes

    people.tamu.edu/~b-wood/Maximum Likelihood/STATA7.htm

    prchange, help /*Compute various first differences */

    dmlogit2 brand female age, base(1) /*The model in derivatives with all variables at means*/

    /* Reestimate the model for an alternative procedure to calculate derivatives */

    mlogit brand female age, base(1)

    mfx compute, predict(outcome(2)) /*The model in derivatives switching female from zero to one*/

    mfx compute, predict(outcome(3))

    /*Test for Independence of Irrelevant Alternatives in MNL model with SPOST and mlogtest. By the way, this

    procedure does a variety

    tests, including whether categories can be combined or not. You can ask for the tests individually, but the option

    all gives you

    everything. Use "search mlogtest" to get a listing of the tests.*/

    mlogit brand female age, base(1)

    mlogtest, all

    /* Note that omitting the category, Toyota, finds that this model violates the IIA assumption for MNL using the

    Hausman test,

    but not for the Small-Hsiao test. What to do? */

    /*Alternatively, there are other user defined procedures for doing the Hausman and Small-Hsiao tests. */

    mlogit brand female age, base(1)

    iia

    /* Notice that this user defined procedure breaks down. */

    smhsiao brand female age, elim(2)

    /* This user defined procedure again finds no violation when omitting Toyota.*/

    /*Now let's drop the predicted probabilities*/

    drop z1 p1 z2 p2 z3 p3

    /* STATA also offers two versions of multinomial probit. The first one below assumes no correlation between

    errors

    and homoskedasticity. */

    mprobit brand female age, base(1)

    /* STATA also offers an ability to parameterize the sigma matrix using asmprobit. However, the data need to be

    arranged as for conditional logit below. */

  • 8/22/2019 Multinomial Models for Discrete Outcomes

    4/8

    7/29/13 Multinomial Models for Discrete Outcomes

    people.tamu.edu/~b-wood/Maximum Likelihood/STATA7.htm

    /* MNL is inefficient when the dependent variable is naturally ordered. Let's use Ray Fairs data on

    marriage satisfaction and extramarital affairs.*/

    /* Now estimate the ordered probit */

    use "/users/bdanwood/documents/my teaching/maximum likelihood//Data/affairs.dta", clear

    replace affairs = affairs>0summ

    oprobit ratingmarriage age yrsmarried religiousness affairs

    predict p1 p2 p3 p4 p5, p /*Calculate p(y=1) for each y */

    predict z, xb /*Calculate z for each y */

    list ratingmarriage z p1 p2 p3 p4 p5 /*List y, predicted probabilities, and z */

    fitstat /*Obtain various fit statistics on the logit */

    listcoef, help /*List the coefficients and standardized coefficients*/

    prvalue, x(affairs=0) rest(mean) /*Compute p(ratingmarriage|j) when affairs=0, and rest at mean */

    prvalue, x(affairs=1) rest(mean) /*Compute p(ratingmarriage|j) when affairs=1, and rest at mean */

    prchange /*Compute first differences for all variables

    mfx compute, predict(outcome(1)) /*The model in derivatives for each outcome*/

    mfx compute, predict(outcome(2))

    mfx compute, predict(outcome(3))

    mfx compute, predict(outcome(4))

    mfx compute, predict(outcome(5))

    /* Test the Parallel Regressions Assumption */

    omodel probit ratingmarriage age yrsmarried religiousness affairs

    /* Test the parallel regressions assumption of ordered probit. */

    omodel probit ratingmarriage age yrsmarried religiousness affairs

    /*Again, let's drop the predicted probabilities*/

    drop z p1 p2 p3 p4 p5

    /* It is also possible to obtain ordered logit estimates. These are rarely seen in the literature. However, the

    following illustrates the

    ordered logit procedure. */

    ologit ratingmarriage age yrsmarried religiousness affairs

    /* Note that with ordered probit there is only one set of coefficient estimates. This is more appealing, but

    CAUTION is still

  • 8/22/2019 Multinomial Models for Discrete Outcomes

    5/8

    7/29/13 Multinomial Models for Discrete Outcomes

    people.tamu.edu/~b-wood/Maximum Likelihood/STATA7.htm

    required in interpreting the coefficients. Ordered probit coefficients can have opposite signs from the

    marginal effects. This is because increasing X, while holding the coefficient and threshold estimates constant

    actually shifts

    the distribution to the right. This may decrease the probability associated with particular outcomes. See Greene,

    pp. 736-40 for

    an example. */

    /*We can also obtain predicted probabilities similar to the MNL above*/

    predict p1 p2 p3 p4 p5, p /*Calculate p(y=1) for each y */

    predict z, xb /*Calculate z for each y */

    list ratingmarriage z p1 p2 p3 p4 p5 /*List y, predicted probabilities, and z */

    fitstat /*Obtain various fit statistics on the logit */

    listcoef, help /*List the coefficients and standardized coefficients*/

    prvalue, x(affairs=0) rest(mean) /*Compute p(ratingmarriage|j) when affairs=0, and rest at mean */

    prvalue, x(affairs=1) rest(mean) /*Compute p(ratingmarriage|j) when affairs=1, and rest at mean */

    prchange /*Compute first differences for all variables

    mfx compute, predict(outcome(1)) /*The model in derivatives for each outcome*/

    mfx compute, predict(outcome(2))

    mfx compute, predict(outcome(3))

    mfx compute, predict(outcome(4))

    mfx compute, predict(outcome(5))

    /* Test the parallel regressions assumption of ordered probit. */

    omodel logit ratingmarriage age yrsmarried religiousness affairs

    brant, detail

    /*Again, let's drop the predicted probabilities*/

    drop z p1 p2 p3 p4 p5

    /*Now let's take a look at conditional logit. To do this, we will need to use a new data set. Note the

    arrangement of the

    data where there are j choices for each individual reflected by a different line of data.*/

    use "c:\users\wood\documents\My Teaching\Maximum Likelihood\Data\clogit.dta", clear

    set more off

    /* This file illustrates an application of STATA's DISCRETE CHOICE procedure, conditional logit. The model

    is one of choice of

    mode for transportation, for a sample of individuals who travel between Sydney and Melbourne, Australia. The

    four choices are

  • 8/22/2019 Multinomial Models for Discrete Outcomes

    6/8

    7/29/13 Multinomial Models for Discrete Outcomes

    people.tamu.edu/~b-wood/Maximum Likelihood/STATA7.htm

    Air, Train, Bus, and Car.

    Note the special set-up for the data for DISCRETE CHOICE models. The data consist of observations on

    choices, rather than

    on individuals. For each individual there are j possible choices, so that there are j*n observations in the data set.

    Basically, the

    data setup is similar to that for panel data. See the STATA help system or manual for more on setting up the data

    for DISCRETECHOICE models.

    Conditional Logit is commonly used when the decisionmaker chooses primarily on the basis of attributes of the

    choices, rather

    than attributes of the individual. This approach is illustrated in the program below.*/

    /* Below we estimate a conditional logit model for choice of mode of transport using choice specific constants

    and a generalized

    measure of perceived cost of the method of transportation, household income, and terminal waiting time. Note

    that unlike LIMDEP

    we must specify the group ourselves, in this case "group(id)"*/

    clogit mode aasc tasc basc gc ttme hinca, group(id)

    predict prob, p /*Calculate p(y=1) for each y */

    predict zi, xb /*Calculate z for each y */

    list mode prob zi /*List y, predicted probabilities, and z */

    fitstat /*Obtain various fit statistics on the logit */

    listcoef, help /*List the coefficients and standardized coefficients*/

    /* With conditional logit as opposed to MNL, we get a single set of coefficient estimates. This is an appealing

    attribute of the model.

    The preceding estimates say simply that people choose their transportation mode based on perceived cost and a

    set of unmeasured

    attributes associated with the choices. Note that perceived cost is an attribute of the choice, rather than an

    attribute of the individual.

    However, it might be reasonable to assert that the decisionmaker's income is also relevant to the choice. The

    decisionmaker's incomedoes not vary across the alternatives. As a result, the discrete choice probabilities are homogeneous of degree

    zero in the parameters.

    That bit of jargon means that if there are any attributes that are the same for all outcomes for every individual,

    they drop out of the

    probability model.

    Thus, for example, any individual characteristic such as age, or income, will cause this problem. The only way

    such variables can be

  • 8/22/2019 Multinomial Models for Discrete Outcomes

    7/8

    7/29/13 Multinomial Models for Discrete Outcomes

    people.tamu.edu/~b-wood/Maximum Likelihood/STATA7.htm

    brought into the conditional logit model is by interacting them with the choice specific constants. This is similar to

    analysis of

    covariance in regression.*/

    /*Test for IIA using conditional logit setup */

    clogit mode aasc tasc basc gc ttme hinca, group(id)

    est store allclogit mode aasc tasc basc gc ttme hinca if choice ~=1, group(id)

    est store partial

    hausman partial all, alleqs constant

    /* If there is a violation of IIA, as shown here, then you can use STATA's asmprobit procedure to do

    multinomial probit. This

    procedure allows you to take into account different correlation structures and variances for the random utility

    equations.

    For example, the following sets all of the correlations equal and scales two of the variances to one.*/

    asmprobit mode gc ttme, casevars(hinc) case(id) alternatives(choice) correlation(exchangeable)

    /*Now lets do a Nested Logit which can avoid the problem of IIA, as well as allows modeling a tree-like

    probability

    model where choices are conditioned by prior choices.*/

    /* First, an example of Nested Logit in which our primary interest is in estimating the determinants of the

    branches. */

    clear

    webuse restaurant

    nlogitgen type = restaurant(fast: Freebirds | MamasPizza, family: CafeEccell | LosNortenos | WingsNmore,

    fancy: Christophers | MadCows)

    nlogittree restaurant type, choice(chosen)

    nlogit chosen cost distance rating || type: income kids, base(family) || restaurant:, noconst case(family_id)

    /* Now an example using the earlier data where we are interested in both the branches and twigs. */

    use "c:\users\wood\documents\My Teaching\Maximum Likelihood\Data\clogit.dta", clear

    set more off

    sort id

    by id: gen travel=_n-1

    nlogitgen type=choice(fly:Air, land: Car|Train|Bus)

    nlogittree choice type, choice(mode)

    nlogit mode gc ttme || type: hinc invt, base(1) || choice: psize , base(0) noconst case(id)

  • 8/22/2019 Multinomial Models for Discrete Outcomes

    8/8

    7/29/13 Multinomial Models for Discrete Outcomes

    people.tamu.edu/~b-wood/Maximum Likelihood/STATA7.htm

    clear