5/17/2015 Annotated Stata Output: Multinomial Logistic Regression
http://www.ats.ucla.edu/stat/stata/output/stata_mlogit_output.htm 1/4
Stata Annotated Output: Multinomial Logistic Regression
This page shows an example of a multinomial logistic regression analysis with footnotes explaining the output. The data were collected on 200 high school students and are scores on various tests, including science, math, reading and social studies. The outcome measure in this analysis is socioeconomic status (ses), with levels low, middle and high, and we are going to see what relationships exist with science test scores (science), social studies test scores (socst) and gender (female). Our response variable, ses, is going to be treated as categorical under the assumption that the levels of ses have no natural ordering, and we are going to allow Stata to choose the referent group, middle ses. The first half of this page interprets the coefficients in terms of multinomial log-odds (logits) and the second half interprets the coefficients in terms of relative risk ratios.
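use http://www.ats.ucla.edu/stat/data/hsb2, clear
mlogit ses science socst female

Iteration 0:   log likelihood = -210.58254
Iteration 1:   log likelihood = -194.75041
Iteration 2:   log likelihood = -194.03782
Iteration 3:   log likelihood = -194.03485
Iteration 4:   log likelihood = -194.03485

Multinomial logistic regression                   Number of obs   =        200
                                                  LR chi2(6)      =      33.10
                                                  Prob > chi2     =     0.0000
Log likelihood = -194.03485                       Pseudo R2       =     0.0786

------------------------------------------------------------------------------
         ses |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
low          |
     science |  -.0235647   .0209747    -1.12   0.261    -.0646744     .017545
       socst |  -.0389243   .0195165    -1.99   0.046    -.0771759   -.0006726
      female |   .8166202   .3909813     2.09   0.037      .050311    1.582929
       _cons |   1.912256   1.127256     1.70   0.090    -.2971258    4.121638
-------------+----------------------------------------------------------------
high         |
     science |    .022922   .0208718     1.10   0.272    -.0179861    .0638301
       socst |   .0430036   .0198894     2.16   0.031     .0040211     .081986
      female |   -.032862   .3500153    -0.09   0.925    -.7188793    .6531553
       _cons |  -4.057323   1.222939    -3.32   0.001     -6.45424   -1.660407
------------------------------------------------------------------------------
(ses==middle is the base outcome)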
Iteration Log[a]

Iteration 0:   log likelihood = -210.58254
Iteration 1:   log likelihood = -194.75041
Iteration 2:   log likelihood = -194.03782
Iteration 3:   log likelihood = -194.03485
Iteration 4:   log likelihood = -194.03485
a. This is a listing of the log likelihoods at each iteration. Remember that multinomial logistic regression, like binary and ordered logistic regression, uses maximum likelihood estimation, which is an iterative procedure. The first iteration (called iteration 0) is the log likelihood of the "null" or "empty" model; that is, a model with no predictors. At the next iteration, the predictor(s) are included in the model. At each iteration, the log likelihood increases (it becomes less negative) because the goal is to maximize the log likelihood. When the difference between successive iterations is very small, the model is said to have "converged", the iterating stops, and the results are displayed. For more information on this process for binary outcomes, see Regression Models for Categorical and Limited Dependent Variables by J. Scott Long (pages 52-61).
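The convergence idea in footnote a can be sketched with the five log likelihoods from the iteration log. This is only an illustration in Python: the tolerance below is a hypothetical value picked for the example and is not a statement of the criterion Stata applies internally.

```python
# Log likelihoods copied from the iteration log (iterations 0 through 4).
log_likelihoods = [-210.58254, -194.75041, -194.03782, -194.03485, -194.03485]

# Hypothetical tolerance, chosen for illustration; not Stata's actual rule.
TOLERANCE = 1e-5

# Change in log likelihood between each pair of successive iterations.
changes = [b - a for a, b in zip(log_likelihoods, log_likelihoods[1:])]

for step, change in enumerate(changes, start=1):
    status = "converged" if abs(change) < TOLERANCE else "continue"
    print(f"Iteration {step}: change = {change:.5f} ({status})")
```

Each change is non-negative because the log likelihood rises toward its maximum, and the last change is small enough to fall under the illustrative tolerance.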
Model Summary

Multinomial logistic regression                   Number of obs[c] =        200
                                                  LR chi2(6)[d]    =      33.10
                                                  Prob > chi2[e]   =     0.0000
Log likelihood = -194.03485[b]                    Pseudo R2[f]     =     0.0786
b. Log Likelihood - This is the log likelihood of the fitted model. It is used in the Likelihood Ratio Chi-Square test of whether all predictors' regression coefficients in the model are simultaneously zero and in tests of nested models.
c. Number of obs - This is the number of observations used in the multinomial logistic regression. It may be less than the number of cases in the dataset if there are missing values for some variables in the equation. By default, Stata does a listwise deletion of incomplete cases.
d. LR chi2(6) - This is the Likelihood Ratio (LR) Chi-Square test that, for both equations (low ses relative to middle ses and high ses relative to middle ses), at least one of the predictors' regression coefficients is not equal to zero. The number in the parentheses indicates the degrees of freedom of the Chi-Square distribution used to test the LR Chi-Square statistic and is defined by the number of models estimated (2) times the number of predictors in the model (3). The LR Chi-Square statistic can be calculated by -2*(L(null model) - L(fitted model)) = -2*((-210.583) - (-194.035)) = 33.096, where L(null model) is the log likelihood with just the response variable in the model (Iteration 0) and L(fitted model) is the log likelihood from the final iteration (assuming the model converged) with all the parameters.
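The calculation in footnote d can be checked by hand. The Python sketch below recomputes the statistic and its degrees of freedom from the two log likelihoods, then evaluates the upper-tail chi-square probability that the output reports as Prob > chi2. The helper `chi2_upper_tail` is a name introduced here for illustration; it uses a closed form that holds only for even degrees of freedom, which is enough for df = 6.

```python
import math

ll_null = -210.58254    # Iteration 0: model with no predictors
ll_fitted = -194.03485  # final iteration: fitted model

n_equations = 2   # low vs. middle, high vs. middle
n_predictors = 3  # science, socst, female

lr_chi2 = -2 * (ll_null - ll_fitted)
df = n_equations * n_predictors


def chi2_upper_tail(x, df):
    """Upper-tail chi-square probability; this closed form needs even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2
    terms = (half**i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * sum(terms)


p_value = chi2_upper_tail(lr_chi2, df)
print(f"LR chi2({df}) = {lr_chi2:.2f}")
print(f"Prob > chi2  = {p_value:.4f}")
```

The statistic rounds to the 33.10 shown in the output, and the tail probability is small enough to display as 0.0000 at four decimal places.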
e. Prob > chi2 - This is the probability of getting an LR test statistic as extreme as, or more extreme than, the observed statistic under the null hypothesis; the null hypothesis is that all of the regression coefficients across both models are simultaneously equal to zero. In other words, this is the probability of obtaining this chi-square statistic (33.10), or one more extreme, if there is in fact no effect of the predictor variables. This p-value is compared to a specified alpha level, our willingness to accept a type I error, which is typically set at 0.05 or 0.01. The small p-value from the LR test, [...]
[...] P>|z|. The interpretation of the parameter estimates' significance is limited only to the first equation, low ses relative to middle ses. The interpretation for the second model, high ses relative to middle ses, naturally falls out of the first equation's interpretation.
For low ses relative to middle ses, the z test statistic for the predictor science (-0.024/0.021) is -1.12 with an associated p-value of 0.261. If we set our alpha level to 0.05, we would fail to reject the null hypothesis and conclude that for low ses relative to middle ses, the regression coefficient for science has not been found to be statistically different from zero given socst and female are in the model.
For low ses relative to middle ses, the z test statistic for the predictor socst (-0.039/0.020) is -1.99 with an associated p-value of 0.046. If we again set our alpha level to 0.05, we would reject the null hypothesis and conclude that the regression coefficient for socst has been found to be statistically different from zero for low ses relative to middle ses given that science and female are in the model.
For low ses relative to middle ses, the z test statistic for the predictor female (0.817/0.391) is 2.09 with an associated p-value of 0.037. If we again set our alpha level to 0.05, we would reject the null hypothesis and conclude that the difference between males and females has been found to be statistically different from zero for low ses relative to middle ses given that science and socst are in the model.
For low ses relative to middle ses, the z test statistic for the intercept, _cons (1.912/1.127) is 1.70 with an associated p-value of 0.090. With an alpha level of 0.05, we would fail to reject the null hypothesis and conclude either a) that the multinomial logit for males (the variable female evaluated at zero) with zero science and socst test scores in low ses relative to middle ses is found not to be statistically different from zero, or b) that for males with zero science and socst test scores, you are statistically uncertain whether they are more likely to be classified as low ses or middle ses. We can make the second interpretation when we view the _cons as a specific covariate profile (males with zero science and socst test scores). Based on the direction and significance of the coefficient, the _cons tells whether the profile would have a greater propensity to fall in one of the levels of the dependent variable.
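The z statistics and p-values discussed above follow directly from the coefficient table. As a rough check, the Python sketch below divides each low-versus-middle coefficient by its standard error and converts the result to a two-sided p-value under the standard normal distribution. The function name `z_test` and the dictionary layout are choices made for this example.

```python
import math

# (coefficient, standard error) for low ses relative to middle ses,
# copied from the mlogit output.
low_vs_middle = {
    "science": (-0.0235647, 0.0209747),
    "socst": (-0.0389243, 0.0195165),
    "female": (0.8166202, 0.3909813),
    "_cons": (1.912256, 1.127256),
}


def z_test(coef, std_err):
    """Return the z statistic and its two-sided standard normal p-value."""
    z = coef / std_err
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p


ALPHA = 0.05
results = {name: z_test(*values) for name, values in low_vs_middle.items()}
for name, (z, p) in results.items():
    decision = "reject" if p < ALPHA else "fail to reject"
    print(f"{name:8s} z = {z:6.2f}  P>|z| = {p:.3f}  ({decision} H0)")
```

With alpha set to 0.05, the sketch reaches the same decisions as the text: reject for socst and female, fail to reject for science and _cons.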
l. [95% Conf. Interval] - This is the Confidence Interval (CI) for an individual multinomial logit regression coefficient given the other predictors are in the model for outcome m relative to the referent group. For a given predictor with a level of 95% confidence, we'd say that we are 95% confident that the "true" population multinomial logit regression coefficient lies between the lower and upper limit of the interval for outcome m relative to the referent group. It is calculated as Coef. $\pm z_{\alpha/2}$ * (Std. Err.), where $z_{\alpha/2}$ is a critical value on the standard normal distribution. The CI is equivalent to the z test statistic: if the CI includes zero, we'd fail to reject the null hypothesis that a particular regression coefficient is zero given the other predictors are in the model. An advantage of a CI is that it is illustrative; it provides a range where the "true" parameter may lie.
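The interval formula in footnote l can be tried on the socst coefficient for low ses relative to middle ses. The sketch below is an illustration in Python; `wald_ci` is a name made up for this example, and the critical value comes from the standard library's normal distribution rather than from Stata.

```python
from statistics import NormalDist


def wald_ci(coef, std_err, level=0.95):
    """Coefficient plus or minus the normal critical value times the std. err."""
    alpha = 1 - level
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for 95%
    margin = z_crit * std_err
    return coef - margin, coef + margin


# socst, low ses relative to middle ses (values from the mlogit output).
lower, upper = wald_ci(-0.0389243, 0.0195165)
excludes_zero = lower > 0 or upper < 0

print(f"socst 95% CI: [{lower:.7f}, {upper:.7f}]")
print(f"interval excludes zero: {excludes_zero}")
```

The endpoints agree with the reported interval to the displayed precision, and the interval excludes zero, which matches the rejection of the null hypothesis for socst at the 0.05 level.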
Relative Risk Ratio Interpretation
The following is the interpretation of the multinomial logistic regression in terms of relative risk ratios and can be obtained by mlogit, rrr after running the multinomial logit model or by specifying the rrr option when the full model is specified. This part of the interpretation applies to the output below.
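mlogit ses science socst female, rrr

Iteration 0:   log likelihood = -210.58254
Iteration 1:   log likelihood = -194.75041
Iteration 2:   log likelihood = -194.03782
Iteration 3:   log likelihood = -194.03485
Iteration 4:   log likelihood = -194.03485

Multinomial logistic regression                   Number of obs   =        200
                                                  LR chi2(6)      =      33.10
                                                  Prob > chi2     =     0.0000
Log likelihood = -194.03485                       Pseudo R2       =     0.0786

------------------------------------------------------------------------------
         ses |     RRR[a]   Std. Err.      z    P>|z|  [95% Conf. Interval][b]
-------------+----------------------------------------------------------------
low          |
     science |   .9767108   .0204862    -1.12   0.261     .9373726      1.0177
       socst |   .9618236   .0187714    -1.99   0.046      .925727    .9993276
      female |   2.262839   .8847276     2.09   0.037     1.051598    4.869199
-------------+----------------------------------------------------------------
high         |
     science |   1.023187   .0213558     1.10   0.272     .9821747    1.065911
       socst |   1.043942   .0207633     2.16   0.031     1.004029    1.085441
      female |   .9676721      .3387    -0.09   0.925     .4872981    1.921595
------------------------------------------------------------------------------
(ses==middle is the base outcome)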
a. Relative Risk Ratio - These are the relative risk ratios for the multinomial logit model shown earlier. They can be obtained by exponentiating the multinomial logit coefficients, e^coef., or by specifying the rrr option. Recall that the multinomial logit model estimates k-1 models, where k is the number of levels of the outcome variable and each equation is relative to the referent group. Suppose the model were written out in exponentiated form with the predictor of interest evaluated at x + δ and at x for outcome m relative to the referent group, where δ is the change in the predictor we are interested in (δ is traditionally set to one) while the other variables in the model are held constant. Each exponentiated form is a ratio of two probabilities, the relative risk of outcome m to the referent group, and if we take the ratio of the two forms, it reduces to e^(coef.*δ). In this sense, the exponentiated multinomial logit coefficient provides an estimate of relative risk. However, the exponentiated coefficients are commonly interpreted as odds ratios. The standard interpretation of the relative risk ratios is that, for a unit change in the predictor variable, the relative risk of outcome m relative to the referent group is expected to change by a factor of the respective parameter estimate, given that the other variables in the model are held constant.
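The argument in footnote a can be illustrated numerically. The Python sketch below plugs the fitted coefficients into the multinomial logit probabilities for one covariate profile, raises science by one unit, and compares the ratio of the two relative risks with the exponentiated coefficient. The profile (a female with science = 50 and socst = 50) and the function names are hypothetical choices made for this example; any other profile gives the same ratio.

```python
import math

# Fitted coefficients from the first mlogit output. Middle ses is the base
# outcome, so its linear predictor is fixed at zero.
coefs = {
    "low": {"science": -0.0235647, "socst": -0.0389243,
            "female": 0.8166202, "_cons": 1.912256},
    "high": {"science": 0.022922, "socst": 0.0430036,
             "female": -0.032862, "_cons": -4.057323},
}


def probabilities(science, socst, female):
    """Predicted probability of each ses level for one covariate profile."""
    x = {"science": science, "socst": socst, "female": female, "_cons": 1.0}
    expo = {m: math.exp(sum(b[k] * x[k] for k in x)) for m, b in coefs.items()}
    denom = 1.0 + sum(expo.values())
    probs = {m: e / denom for m, e in expo.items()}
    probs["middle"] = 1.0 / denom
    return probs


def relative_risk(probs, outcome):
    """Probability of the outcome divided by probability of the base outcome."""
    return probs[outcome] / probs["middle"]


# Hypothetical profile, chosen only for illustration.
at_x = probabilities(science=50, socst=50, female=1)
at_x_plus_1 = probabilities(science=51, socst=50, female=1)

ratio = relative_risk(at_x_plus_1, "low") / relative_risk(at_x, "low")
print(f"ratio of relative risks: {ratio:.7f}")
print(f"exp(coefficient):        {math.exp(coefs['low']['science']):.7f}")
```

Both printed values agree with the RRR reported for science in the low ses equation, which is why the exponentiated coefficient can be read as a multiplicative change in relative risk.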
low ses relative to middle ses
science - This is the relative risk ratio for a one-unit increase in science score for low ses relative to middle ses level given that the other variables in the model are held constant. If a subject were to increase her science test score by one unit, the relative risk for low ses relative to middle ses would be expected to decrease by a factor of 0.977 given the other variables in the model are held constant. So, given a one-unit increase in science, the relative risk of being in the low ses group rather than the middle ses group would be multiplied by 0.977 when the other variables in the model are held constant. More generally, we can say that if a subject were to increase their science test score, they'd be expected to be more likely to fall into middle ses as compared to low ses.
socst - This is the relative risk ratio for a one-unit increase in socst score for low ses relative to middle ses level given that the other variables in the model are held constant. If a subject were to increase her socst test score by one unit, the relative risk for low ses relative to middle ses would be expected to decrease by a factor of 0.962 given the other variables in the model are held constant.
female - This is the relative risk ratio comparing females to males for low ses relative to middle ses level given that the other variables in the model are held constant. For females relative to males, the relative risk for low ses relative to middle ses would be expected to increase by a factor of 2.263 given the other variables in the model are held constant.
high ses relative to middle ses
science - This is the relative risk ratio for a one-unit increase in science score for high ses relative to middle ses level given that the other variables in the model are held constant. If a subject were to increase her science test score by one unit, the relative risk for high ses relative to middle ses would be expected to increase by a factor of 1.023 given the other variables in the model are held constant.
socst - This is the relative risk ratio for a one-unit increase in socst score for high ses relative to middle ses level given that the other variables in the model are held constant. If a subject were to increase their socst test score by one unit, the relative risk for high ses relative to middle ses would be expected to increase by a factor of 1.043 given the other variables in the model are held constant.
female - This is the relative risk ratio comparing females to males for high ses relative to middle ses level given that the other variables in the model are held constant. For females relative to males, the relative risk for high ses relative to middle ses would be expected to decrease by a factor of 0.968 given the other variables in the model are held constant.
b. [95% Conf. Interval] - This is the CI for the relative risk ratio given the other predictors are in the model. For a given predictor with a level of 95% confidence, we'd say that we are 95% confident that the "true" population relative risk ratio comparing outcome m to the referent group lies between the lower and upper limit of the interval. An advantage of a CI is that it is illustrative; it provides a range where the "true" relative risk ratio may lie.
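The page does not spell out how the interval for a relative risk ratio is computed. One common approach, assumed here, is to build the interval on the coefficient scale and exponentiate its endpoints, with the standard error approximated by the delta method as RRR times the coefficient's standard error. The Python sketch below applies that assumption to socst for low ses relative to middle ses; `rrr_with_ci` is a name invented for the example.

```python
import math
from statistics import NormalDist


def rrr_with_ci(coef, std_err, level=0.95):
    """Exponentiate a logit coefficient and the endpoints of its interval."""
    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)
    rrr = math.exp(coef)
    lower = math.exp(coef - z_crit * std_err)
    upper = math.exp(coef + z_crit * std_err)
    rrr_std_err = rrr * std_err  # delta-method approximation (an assumption)
    return rrr, rrr_std_err, lower, upper


# socst, low ses relative to middle ses (coefficient and std. err. from the
# first mlogit output).
rrr, rrr_se, lower, upper = rrr_with_ci(-0.0389243, 0.0195165)

print(f"RRR = {rrr:.7f}, Std. Err. = {rrr_se:.7f}")
print(f"95% CI: [{lower:.6f}, {upper:.7f}]")
```

Under these assumptions the results agree with the RRR table to the displayed precision. The interval is not symmetric around the RRR, and the null value for a ratio is one rather than zero.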