Introductory Guide to Using Stata

Upload: willie81

Post on 14-Apr-2018


  • 7/27/2019 Introductory Guide to using Stata


    Norris Introduction to Stata 2009

    STM103 Spring 2009

    INTRODUCTION TO STATA

    1. DOWNLOADING DATA
        Assignment 1 preparation
        Select variables

    2. STARTING A STATA SESSION
        Opening, Saving, and Closing the data file
        Keeping a log record of a Stata Session
        Using Operators

    3. WORKING WITH DATA
        To see what your selected variables contain
        To look at the % distribution in categorical variables
        To look at a table combining two categorical variables
        Creating and Changing Values of Variables
        Labeling Variables and Values
        Using Functions
        Deleting Variables and Observations
        list
        Analysis of continuous (scale) variables
        Correlations
        Estimating Linear Models (OLS and 2-Stage Least Squares)
        Estimating Non-Linear Models (Logit and Probit)
        logit
        probit

    4. MAKING GRAPHS
        Histogram
        Scatterplot
        Bar graphs
        Printing your graph

    5. UTILITIES
        Viewing the Data
        Creating and Submitting a Do-File

    6. CONVERTING DATA FILES (EXCEL TO STATA)

    This provides a brief introduction to using Stata for the QoG dataset analysis. Stata is available on all of the computers in the Kennedy School's computer lab. If you have a home computer you may want to purchase a copy of Stata from the CMO. Stata is available for Windows 98, Windows 2000, Windows ME, Windows XP, Windows NT, Macintosh, and UNIX operating systems. The Stata User's Guide is also available from the CMO.

    The commands outlined below assume that you are using Stata for Windows. Throughout this text, anything appearing in Bold font is a Stata command, whereas anything in red italics is a variable name which you should change for your specific analysis. Menu commands are indicated as, e.g., File|Open, to indicate that you first go to the File menu and then choose the Open option. The Blue Courier is the type of output you should generate. As a shortcut, you can also just copy and paste any of the command lines here directly into your Stata Command window and then run them.


    1. DOWNLOADING DATA

    1. Go to the QoG website: http://www.qog.pol.gu.se/ and go to Data: the QoG Data.

    2. Download the correct version of the dataset. To start work, download the cross-sectional dataset for Stata and save this somewhere clearly labeled on your memory stick, hard drive or shared server space.

    3. Also save the PDF version of the FULL Codebook somewhere safe; it is very long but an invaluable reference document.


    ASSIGNMENT 1 PREPARATION

    The aim is to write a professional report assessing and comparing the problems of democratic governance reform in one world region. Pick your region:

    Latin America and the Caribbean,

    Africa,

    Asia,

    Central and Eastern Europe,

    Middle East,

    Western Europe

    Think about the key problems of democratic governance in the region. From your experience and your reading, what are the priorities for agencies? Can you rank them? Focus on the most important 2-3 issues in the first instance. Then look carefully at the shared dataset codebook. Start by selecting 3-4 indicators which relate to the problems you have decided to focus upon. The shared class dataset provides the following indicators, along with many others:

    1. Freedom House index of political rights and civil liberties

    2. Polity IV Project Democracy and Autocracy scales

    3. Cheibub and Gandhi Democracy-Autocracy classification

    4. Vanhanen Democracy Index

    5. World Values Survey / Global Barometers attitudinal surveys

    6. Kaufmann/Kraay World Bank Institute Good governance indicators

    7. Transparency International Corruption index

    SELECT VARIABLES

    You can use the whole dataset without doing anything further, but there are a LOT of variables in the dataset. To simplify your life and make it less confusing, for this exercise you will find it easier to identify a subset of key variables from the "What It Is" list which you want to use. You can always go back to select more variables at a later stage (none are deleted), but first work out which variables to put into your subset, i.e. what is important as indicators of democratic governance for your region. Note that in Stata it matters whether variable names are in capitals or not.

    Select the first ones listed below and pick about 5-10 to add. Write down details in the list below so that you have this handy. Look in the full codebook for more details about the construction and meaning of each.

    Name          Brief description                                                          Type
    ------------  -------------------------------------------------------------------------  ----------------
    cname         Country name                                                               Nominal
    ht_region     Global region: 10 categories                                               Nominal
    chga_regime   Cheibub and Gandhi: Type of democratic or autocratic regime                Nominal
    fh_status     Freedom House: classification of states into free, semi-free and not free  Ordinal
    p_polity      Combined polity score of democracy and autocracy                           Scale, -10/+10
    fh_cl         Freedom House civil liberties                                              Scale, 7 pt
    fh_pr         Freedom House political rights                                             Scale, 7 pt
    wbgi_vae      World Bank: Voice and accountability estimate                              Scale
    wbgi_pse      World Bank: Political stability estimate                                   Scale
    wbgi_gee      World Bank: Government effectiveness                                       Scale
    rsf_pfi       Reporters without Borders: Press Freedom Index                             Scale, 100 pts
    ti_cpi        Transparency International: Corruption Perception Index                    Scale, 100 pts

    2. STARTING A STATA SESSION

    Start Stata by locating the link on the desktop and clicking it: StataSE 8.lnk

    Stata is a powerful tool for conducting statistical analyses. Here is what a typical session in Stata looks like. The windows can be moved about and resized to suit your preferences. If you do not see any of these on your version, go to Window and add these until they look roughly like the above.


    a. Here the Results window lists the outcome.

    b. The Variables window, on the left, lists the names of all the variables included in the shared dataset.

    c. You can enter commands in two ways. To start learning the program, you can use the drop-down menus, similar to those common in Microsoft programs. This is useful for beginners. Once you become more familiar with the program, you will want to type in commands directly, to save time, using the Command window at the bottom center of the screen. A command tells Stata what to do, e.g. to open a file, to run a regression, to calculate a mean of a variable, etc.

    d. The Review window shows a list of all the commands you have already run. (Here, it shows that I have opened a data file.) If you click on a previously run command in the Review window, it will appear in the Command window and you can edit it or run it again.

    OPENING, SAVING, AND CLOSING THE DATA FILE

    You will then need to open the data file you have saved. You may need to boost the memory allocated:

    set memory 80000

    File|Open

    File|Save As

    It is also always useful to have a backup copy of your data. This way, no matter what you do to change or recode the variables, you always have a copy of the older version. It is also useful practice to save your data file at the end of each session under a new sequential version (e.g. STM103_2, STM103_3), so that you have the old and newest file in case you need to revert.

    File|Exit

    Use this to exit Stata whenever you finish.
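The menu steps above also have typed equivalents. The following is a minimal sketch rather than the course's own procedure: it uses the auto dataset that ships with Stata as a stand-in for the QoG file, and the STM103_1 and STM103_2 file names are hypothetical, following the sequential naming suggested in the text.

```stata
* Sketch only: the built-in auto data stands in for the QoG file,
* and the STM103_* file names are hypothetical.
sysuse auto, clear
* Save a first version in the current working directory
save "STM103_1.dta", replace
* ... recode or create variables here ...
* Save the changed data under the next sequential name; STM103_1 stays untouched
save "STM103_2.dta", replace
* Revert to the older version if something went wrong
use "STM103_1.dta", clear
```

The replace option only overwrites a file of the same name, so each earlier version remains available on disk.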

    KEEPING A LOG RECORD OF A STATA SESSION

    File|Log

    To save a file (log) of your results, you will need to create a log file. Stata gives you two choices of file formats for your log file, .log (text file) and .smcl (formatted log file). The .smcl files will look nicer when printed. You should never ever cut and paste your Stata output directly into your report; always simplify, clean and transfer in a professional and clean format.

    To start a log file interactively, choose File|Log|Begin, select the directory you want to save the log file in, and give it a name (such as job1). Alternatively, you can click on the fourth icon from the left on the icon bar, which looks like a scroll.

    File|Log|Close
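Logging can also be started and stopped from the Command window. This is a hedged sketch: job1 is the example name from the text, the log is written to the current working directory (assumed writable), and the auto dataset is only a stand-in for your own data.

```stata
* Close any log left open from an earlier session (no error if none is open)
capture log close
* Start a plain-text log; use "job1.smcl" without the text option for the formatted type
log using "job1.log", text replace
sysuse auto, clear
summarize price mpg
* Stop logging; everything shown between the two log commands is now in job1.log
log close
```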

    USING OPERATORS

    Stata uses the following arithmetic operators:

    +   add

    -   subtract

    *   multiply

    /   divide

    ^   raise to the power


    For relations, Stata uses:

    ==  equal

    ~=  not equal

    >   greater than

    <   less than

    >=  greater than or equal to

    <=  less than or equal to
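A quick way to get a feel for these operators is the display command, which evaluates an expression and prints the result. The numbers below are arbitrary illustrations; a relational expression evaluates to 1 when true and 0 when false.

```stata
* Arithmetic
display 2 + 3
* shows 5
display 2^3
* shows 8
display (7 - 1) / 3
* shows 2

* Relations: 1 means true, 0 means false
display 5 >= 3
* shows 1
display 5 ~= 5
* shows 0
```

Note the double equals sign for comparison: a single = assigns a value (as in generate), while == tests equality (as in an if condition).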


    3. WORKING WITH DATA

    TO SEE WHAT YOUR SELECTED VARIABLES CONTAIN

    First let's summarize your selected variables. Type:

    sum ht_region p_polity fh_cl fh_pr chga_regime

        Variable |       Obs        Mean    Std. Dev.       Min        Max
    -------------+--------------------------------------------------------
       ht_region |       192    4.526042    2.644633          1         10
        p_polity |       160           1    14.45487        -77         10
           fh_cl |       192    3.385417    1.821166          1          7
           fh_pr |       192    3.364583    2.156785          1          7
     chga_regime |       189    .3968254    .4905386          0          1

    This is very useful for looking at your selected variables to see what they are like, whether nominal, ordinal or scale (continuous). E.g. chga_regime is a binary (2-category) variable. Try this for a couple of your variables and add notes to your selected variables in the list above.

    summarize can be abbreviated to sum. You can also look at more detail, e.g.

    sum ht_region, detail

    You can do the same just for one selected region:

    sum p_polity if ht_region==1, detail

    TO LOOK AT THE % DISTRIBUTION IN CATEGORICAL VARIABLES

    For categorical variables, try the following, which generates some simple frequencies, i.e. look at the number of countries (Freq.) and the percent column.

    tab1 ht_region

    The Region of the Country | Freq. Percent Cum.

    ----------------------------------------+-----------------------------------

    1. Eastern Europe and post Soviet Union | 27 14.06 14.06

    2. Latin America | 20 10.42 24.48

    3. North Africa & the Middle East | 20 10.42 34.90

    4. Sub-Saharan Africa | 48 25.00 59.90

    5. Western Europe and North America | 27 14.06 73.96

    6. East Asia | 6 3.13 77.08

    7. South-East Asia | 11 5.73 82.81

    8. South Asia | 8 4.17 86.98

    9. The Pacific | 12 6.25 93.23

    10. The Caribbean | 13 6.77 100.00

    ----------------------------------------+-----------------------------------

    Total | 192 100.00

    TO LOOK AT A TABLE COMBINING TWO CATEGORICAL VARIABLES

    tab2 ht_region chga_regime, col

    -> tabulation of ht_region by chga_regime

    The Region of the     |    Type of Regime
    Country               | 0. Democr  1. Dictat |     Total
    ----------------------+----------------------+----------
    1. Eastern Europe and |        17         10 |        27
                          |     14.91      13.33 |     14.29
    ----------------------+----------------------+----------
    2. Latin America      |        17          3 |        20
                          |     14.91       4.00 |     10.58
    ----------------------+----------------------+----------
    3. North Africa & the |         3         17 |        20
                          |      2.63      22.67 |     10.58
    ----------------------+----------------------+----------
    4. Sub-Saharan Africa |        20         28 |        48
                          |     17.54      37.33 |     25.40
    ----------------------+----------------------+----------
    5. Western Europe and |        26          0 |        26
                          |     22.81       0.00 |     13.76
    ----------------------+----------------------+----------
    6. East Asia          |         4          2 |         6
                          |      3.51       2.67 |      3.17
    ----------------------+----------------------+----------
    7. South-East Asia    |         3          7 |        10
                          |      2.63       9.33 |      5.29
    ----------------------+----------------------+----------
    8. South Asia         |         3          5 |         8
                          |      2.63       6.67 |      4.23
    ----------------------+----------------------+----------
    9. The Pacific        |         8          3 |        11
                          |      7.02       4.00 |      5.82
    ----------------------+----------------------+----------
    10. The Caribbean     |        13          0 |        13
                          |     11.40       0.00 |      6.88
    ----------------------+----------------------+----------
    Total                 |       114         75 |       189
                          |    100.00     100.00 |    100.00

    CREATING AND CHANGING VALUES OF VARIABLES

    Recode and generate

    This helps to categorize the values of an existing scale variable. For example, take the variable called p_polity that contains the 20-point Polity IV scale for each nation in your study. Suppose you want the scale to be collapsed into two types of regime, democracies and autocracies. To do this, generate a new variable called regime and recode it as follows:

    gen regime = p_polity

    recode regime (-10/-1=0) (0/10=1)

    sum regime
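The summary output earlier reports a minimum of -77 for p_polity, which lies outside the -10 to +10 range of the scale and looks like a special code. The sketch below uses a small made-up set of scores (not QoG data) to show one way of recoding in a single step while sending anything outside the scale to missing. Treating such codes as missing is an assumption to check against the codebook.

```stata
* Toy data for illustration only; these are not values from the QoG dataset
clear
input p_polity
-77
-10
-3
0
6
10
end

* Collapse the scale into 0 and 1 and send everything else to missing.
* generate() writes the result to a new variable, so p_polity is left unchanged.
recode p_polity (-10/-1 = 0) (0/10 = 1) (else = .), generate(regime)

* Check the result, including the missing category
tab1 regime, missing
list p_polity regime
```

Using the generate() option avoids the separate gen step and guards against overwriting the original variable.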

    When recoding your data, be careful not to overwrite your original variable. You can check what you have done with the summarize or tab1 commands. Or you may want to create a new variable, in this case fh_scale, defined as fh_cl plus fh_pr:

    generate fh_scale = (fh_cl + fh_pr)

    You can use any formula to standardize the scale, e.g.

    generate fh_scale100 = 100 - (fh_cl + fh_pr)*7.1

    To generate a new binary variable for your region, e.g. Africa (coded 4):


    generate africa = .

    replace africa = 1 if (ht_region==4)

    replace africa = 0 if (ht_region~=4)
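One detail of the commands above is worth knowing: a missing ht_region also satisfies ht_region~=4, so a country with no region recorded would be coded 0. The sketch below, on made-up region codes, shows a one-line alternative that leaves such cases missing. It is an illustration under that assumption, not a required change.

```stata
* Toy data for illustration only: five region codes, one of them missing
clear
input ht_region
1
4
4
.
9
end

* (ht_region == 4) evaluates to 1 or 0; the if qualifier keeps africa
* missing wherever ht_region itself is missing
generate africa = (ht_region == 4) if !missing(ht_region)

list ht_region africa
```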

    You can abbreviate the generate command as gen. Note that Stata will tell you if any missing values were generated by attempting to perform a calculation with missing information. For example, if a variable yrhrs were calculated from a variable hours, and one of the observations was missing information on hours, Stata would set yrhrs equal to missing for this observation. (See further notes about missing data later in this section.)

    Once a variable with a particular name has been generated, you can't generate another with the same name. Instead, you must replace the old one.

    LABELING VARIABLES AND VALUES

    Labeling variables and values helps you keep track of how you coded your variables and what they represent. It takes just a couple of seconds to add labels, and it can save you lots of time later when you can't remember what a code of 4 means in your GDP category variable, for example, or how the variable demo1 differs from demo2.

    To attach a label to a variable and its values:

    label variable africa "World region"

    label define africalabel 0 "Rest of the world" 1 "Sub-Saharan Africa"

    label values africa africalabel
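The effect of labels is easiest to see in a frequency table. This is a small self-contained sketch on made-up data: the variable and label names follow the text, and the three observations are invented for illustration.

```stata
* Toy data for illustration only
clear
input africa
0
1
1
end

* Describe the variable itself
label variable africa "World region"
* Define a set of value labels, then attach it to the variable
label define africalabel 0 "Rest of the world" 1 "Sub-Saharan Africa"
label values africa africalabel

* The table now shows the label text instead of the codes 0 and 1
tab1 africa
```

Text that contains spaces needs double quotes, otherwise Stata reads each word as a separate argument.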

    USING FUNCTIONS

    Functions are special calculations used with other commands, such as generate or replace. Stata has the capability to calculate many functions. Here are some examples of the most commonly used ones.

    ln(x)

    Calculates the natural log of x, where x may be a constant or a variable such as mad_gdppc.

    In a command, you might use the log function like this:

    gen logGDP2006 = ln(mad_gdppc)

    DELETING VARIABLES AND OBSERVATIONS

    drop

    The drop command can delete either variables or observations. Deleting a variable removes an entire variable (column) from the data set, whereas deleting an observation removes an entire observation (row) from the data set. Be careful when doing this: the variables and observations are permanently deleted once you save the data file! It is far better to retain the whole dataset but to filter for the selected region.

    To eliminate a variable, in this case mad_gdppc:

    drop mad_gdppc

    To eliminate observations, in this case Eastern Europe as a region:

    drop if ht_region==1

    Alternatively you could just keep a subset of data:

    keep if africa==1
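Two ways of working with a subset without losing the rest of the data are sketched below. The auto dataset that ships with Stata stands in for the QoG file here, with foreign playing the role of a region indicator such as africa.

```stata
sysuse auto, clear

* Option 1: filter with if, which leaves the data in memory untouched
summarize price if foreign == 1

* Option 2: take a temporary snapshot, subset, then bring everything back
preserve
keep if foreign == 1
summarize price
restore

* All observations are available again
count
```

The first option matches the advice in the text to filter rather than delete. The second is convenient when several commands in a row should run on the same subset.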

    LIST

    Prints all variables and observations to the screen. You'll probably never want to do this since your data sets will be too large.

    list

    You can print a limited set of variables:

    list fh_status

    You could also print a limited set of observations according to another criterion, in this case ht_region being equal to 3 (North Africa and the Middle East):

    list cname fh_status fh_cl fh_pr if ht_region==3

    codebook

    Provides even more information (mean, standard deviation, range, percentiles, labels, number of missing values, etc.) about a variable:

    codebook fh_status fh_pr fh_cl

    ANALYSIS OF CONTINUOUS (SCALE) VARIABLES

    EXAMINING MEANS BY CATEGORY

    In this case the category is ht_region and the mean is calculated for p_polity. You can do this with various commands, e.g.

    table ht_region, contents(mean p_polity)

    tabstat p_polity fh_status fh_cl fh_pr, by(ht_region) columns(variables)

    mean p_polity, over(ht_region)
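Another common route to group-wise statistics is the by prefix, which repeats a command separately for each category. The sketch below uses the auto dataset shipped with Stata as a stand-in, with foreign as the grouping variable in place of ht_region.

```stata
sysuse auto, clear

* Run summarize once per group; bysort sorts the data by the grouping variable first
bysort foreign: summarize mpg

* One compact table with several statistics per group
tabstat mpg price, by(foreign) statistics(mean sd n)
```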

    CORRELATIONS

    corr fh_cl fh_pr p_polity

    With stars printed next to all coefficients significant at the .05 level or better:

    pwcorr fh_cl fh_pr p_polity, star(.05)


    ESTIMATING LINEAR MODELS (OLS AND 2-STAGE LEAST SQUARES)

    regress

    Calculates an ordinary least squares (OLS) regression, in this case for a regression of the dependent variable p_polity on the independent variables mad_gdppc (GDP) and al_ethnic. Note that the dependent variable is the first variable listed.

    regress p_polity mad_gdppc al_ethnic

    If you wish to only include observations with africa equal to 1 in the regression:

    regress p_polity mad_gdppc al_ethnic if africa==1

    To run a regression with robust standard errors:

    regress p_polity mad_gdppc al_ethnic, robust

    To run two-stage least squares where GDP is endogenous and z1 is an exogenous instrumental variable (z1 is a placeholder name):

    ivreg p_polity al_ethnic (mad_gdppc = z1)

    In Stata 10 and later the equivalent command is ivregress 2sls p_polity al_ethnic (mad_gdppc = z1).

    Note: If you run a regression containing more than 40 variables, Stata will return an error code saying:

    matsize too small

    To overcome this problem, reset the maximum number of variables Stata will estimate using the set matsize command; the number should be greater than or equal to the total number of variables in the regression.

    set matsize 150

    predict

    Calculates the predicted value for each observation using the coefficients from the last regression estimated and saves these as a variable called yhat:

    predict yhat

    To calculate the residual for each observation using the most recently estimated regression model and save these as a variable called ehat:

    predict ehat, residual

    test

    Calculates an F-test of a joint hypothesis concerning the coefficients in the most recently estimated linear regression model, in this case with the null hypothesis H0: al_ethnic = mad_gdppc = 0:

    test al_ethnic mad_gdppc
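The regression commands above are usually run as a sequence, because predict and test both refer to the most recently estimated model. The following sketch shows that order on the auto dataset shipped with Stata. Here price, mpg and weight are stand-ins for p_polity, mad_gdppc and al_ethnic, so the coefficients themselves mean nothing for the assignment.

```stata
sysuse auto, clear

* Estimate the model; the dependent variable comes first
regress price mpg weight

* Both predict commands use the regression just estimated
predict yhat
predict ehat, residuals

* Joint F-test that the coefficients on mpg and weight are both zero
test mpg weight

* Same model with robust standard errors; the coefficients are unchanged
regress price mpg weight, robust

* Compare actual, fitted and residual values
summarize price yhat ehat
```

Running predict a second time with the same variable name would fail, since yhat already exists; drop it first or pick a new name.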

    ESTIMATING NON-LINEAR MODELS (LOGIT AND PROBIT)

    LOGIT

    Estimates a model suitable for a dichotomous dependent variable. In this case, the variable chga_regime equals 0 for democracy and 1 for dictatorship, as in the cross-tabulation above.

    logit chga_regime al_ethnic mad_gdppc


    If you wish to find a predicted probability for each observation based on the most recent model run and save these as a variable called phat:

    predict phat

    PROBIT

    Estimates a model suitable for a dichotomous dependent variable. In this case, the variable chga_regime equals 0 for democracy and 1 for dictatorship. If you wish to estimate the probability of chga_regime conditional upon al_ethnic and mad_gdppc:

    probit chga_regime al_ethnic mad_gdppc

    If you wish to find a predicted probability for each observation based on the most recent model run and save these as a variable called phat (if phat already exists from the logit model, drop it first or choose a different name):

    predict phat
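When both models are run in the same session, giving the predictions different names makes it possible to compare them side by side. A minimal sketch on the auto dataset shipped with Stata, where the binary variable foreign stands in for chga_regime. The names phat_logit and phat_probit are made up for this example.

```stata
sysuse auto, clear

* Logit model and its predicted probabilities
logit foreign mpg weight
predict phat_logit

* Probit model on the same variables, saved under a different name
probit foreign mpg weight
predict phat_probit

* Both sets of predictions lie between 0 and 1
summarize phat_logit phat_probit
correlate phat_logit phat_probit
```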

    4. MAKING GRAPHS

    Stata 8 has a Graphics menu that lets you create graphs from a Windows menu, as an alternative to using command language. The Graphics menu is a particularly user-friendly way of creating graphs, since graphs contain so many options for labels, axes, etc. The Graphics menu is fairly intuitive to use: simply pull down the menu and choose the type of graph you want. The options are self-explanatory. For those interested in using command language to create graphs, some of the basics are covered below, and you can rely on the graphics manual for more complicated creations. Also, SPSS has better and far more flexible graphics. You may want to consider this program for this function alone. You can also cut and paste the results of tables into Excel for flexible formats and control over elements.

    HISTOGRAM

    This is the default when only one variable is specified:

    histogram p_polity

    You can also draw a normal density over the histogram:

    histogram p_polity, normal

    To have Stata graph only certain observations, in this case those for which africa is 1:

    histogram p_polity if africa==1, bin(30)

    To add a title:

    histogram p_polity, title("Polity IV Rating of Liberal Democracy in Africa")

    SCATTERPLOT

    This is the default if two variables are specified:

    scatter wbgi_gee wbgi_pse

    Conditions, axes, titles, labeling and reference lines can be specified as above. For example, with labels:

    scatter wbgi_gee wbgi_pse, t1("Effectiveness by stability") mlabel(cname)

    And without labels:

    scatter wbgi_gee wbgi_pse, t1("Effectiveness by stability")

    After performing a regression, you may want to graph predicted and actual values of the dependent variable against the independent variable. Here yhat1 is the variable holding the predicted values saved with predict:

    scatter yhat1 wbgi_gee wbgi_pse

    BAR GRAPHS

    This is produced with a graph command followed by one variable. A second variable is used to define groups. To produce a graph with bar heights representing the mean for each group:

    sort wbgi_gee

    graph bar (mean) wbgi_gee, over(ht_region)

    Conditions, y-axis options, most titles, and horizontal reference lines can be specified as described above with regard to histogram:

    sort wbgi_gee

    graph bar (mean) wbgi_gee, over(africa) t1("Political Stability in Africa") t2("Title 2") l1("Mean Stability") l2("Another Title") yline(33000)

    PRINTING YOUR GRAPH

    Stata allows you to print (File|Print Graph) and save (File|Save Graph) your graphs. The easiest way to incorporate your graph into a Word document is to copy the graph to the clipboard using Edit|Copy Graph and then paste it into your document. Remember that all graphs should have a clear headline, to illustrate your report, with a full note below specifying the source of the data and any notes explaining variables. All graphs should be self-contained, so that they can be understood without looking further in your report.
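Graphs can also be saved from the Command window, which is handy inside a do-file. This is a sketch under stated assumptions: the auto dataset stands in for the QoG data, the file names are hypothetical, and PNG export is assumed to be available in your Stata installation.

```stata
sysuse auto, clear

* Draw a histogram with a title and a normal density overlay
histogram mpg, normal title("Mileage (example data)")

* Save in Stata's own format so the graph can be reopened and edited later
graph save "mpg_hist.gph", replace

* Export an image file that can be inserted into a Word document
graph export "mpg_hist.png", replace
```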

    5. UTILITIES

    VIEWING THE DATA

    Once you have opened a data set, you may wish to look at the variables and observations in spreadsheet format. Stata provides two ways to do this, browse and edit. The browse command lets you see the data but not make changes, whereas the edit command allows you both to browse and to make changes. It is probably best to use browse unless you actually intend to make changes to your data manually; otherwise you may accidentally change something and ruin your data.

    To browse, enter browse into the Command window or select the Browse icon (third from the right, a spreadsheet with a magnifying glass on it). To edit, enter edit into the Command window or select the Edit icon (fourth from the right, a spreadsheet with no magnifying glass).

    CREATING AND SUBMITTING A DO-FILE

    Although Stata can be run interactively by just typing one command at a time, Stata commands can also be submitted in batches by using a do-file. A do-file is simply a text file which contains a series of Stata commands. You enter the Stata commands in the same order as you would enter them interactively, and Stata then runs these commands automatically instead of your having to type them in line by line.

    For your problem sets, it is strongly recommended that you use do-files. Some of the problem sets will require many Stata commands, and it is inevitable that you will need to make changes and run these series of commands a number of times. When you have all of your commands in a single file, it is much easier to go back to that file and make the necessary changes than to have to retype every command.

    Creating a Do-file

    To start creating a do-file, click on the Do-file editor button (fifth from the right, looks like an envelope with a pencil on it), choose the Do-file editor option under the Window menu, or type doedit in the Command window. Note that since a do-file is a written list of commands as entered in the Command window, you cannot use the Stata menus within a do-file. Instead you need to use the typed (Command window) commands.
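As an illustration of what such a file might contain, here is a minimal do-file sketch. The file name job1.do and the log name are hypothetical, and the auto dataset shipped with Stata stands in for the QoG data. For the assignment, the sysuse line would be replaced by a use command pointing at your saved dataset.

```stata
* job1.do: a hypothetical example do-file
* Lines starting with an asterisk are comments and are ignored by Stata

* Keep a record of the output
capture log close
log using "job1.log", text replace

* Load the data
sysuse auto, clear

* Analysis commands, in the order you would type them interactively
summarize price mpg weight
regress price mpg weight

log close
```

Once saved, the whole file can be run by typing do job1 in the Command window, or from the Do-file editor.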

    6. CONVERTING DATA FILES (EXCEL TO STATA)

    The easiest way to convert data files is to use the software program Stat/Transfer. This program is on the lab computers and allows you to convert your data to or from a variety of different file formats (Stata, SAS Transport, Excel, SPSS, Quattro Pro, FoxPro, etc.).

    To convert a file from Excel to Stata:

    a) Click on the application Stat/Transfer in the Data Analysis folder.

    b) Select Excel Worksheet for Input File Type.

    c) Use Browse to identify the Excel file you want to convert from. (If the first row of the worksheet contains the variable names, the program will use these as the variable names.)

    d) Select Stata Version 8 as the Output File Type. (Since Stata 8.0 is a recent release, it is possible that the version of Stat/Transfer you're using will not have Stata Version 8 as an option. If this is the case, save it as a version 7 file; you should still be able to open the file in version 8.)

    e) Type in the path and name of the file you wish to create.

    f) Begin the conversion by clicking on Begin Transfer.

    Stata also allows you to read in binary and ASCII files directly. However, in most cases it is easier to first convert your data to a spreadsheet and then convert it to Stata using Stat/Transfer.
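For readers on a newer release, Excel files can also be read without a separate conversion program. The sketch below assumes Stata 12 or later, where the import excel and export excel commands are available; they do not exist in Stata 8. The file name auto_demo.xlsx is made up, and the example first writes a small Excel file so that it is self-contained.

```stata
* Create a small Excel file to work with (first row holds the variable names)
sysuse auto, clear
export excel using "auto_demo.xlsx", firstrow(variables) replace

* Read it back in; firstrow tells Stata to use row 1 as variable names
import excel using "auto_demo.xlsx", firstrow clear

* Check that names and storage types came through as expected
describe
```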


    SUMMARY OF SPSS AND STATA COMMANDS

    SPSS Command                    Stata Command(s)
    ------------------------------  ------------------------------
    ADD FILES                       append
    AGGREGATE                       collapse
    ANOVA                           anova
    AUTORECODE                      destring, encode
    CASESTOVARS                     reshape wide
    COMMENT                         *, /* */
    COMPUTE                         generate, replace, egen
    CORRELATIONS                    correlate, pwcorr
    CROSSTABS                       tabulate, tab2
    DATA LIST                       infile, infix, insheet
    DELETE VARIABLES                keep, drop
    DESCRIPTIVES                    summarize
    DISPLAY                         describe
    DOCUMENT                        notes
    DO IF                           xyzcommand if
    DO REPEAT                       foreach
    ECHO                            display
    ERASE                           erase
    EXAMINE                         tabulate x, summarize(y)
    EXECUTE                         no equivalent
    EXPORT                          no equivalent
    FACTOR                          factor
    FILE LABEL                      label data
    FILTER                          xyzcommand if (___)
    FLIP                            xpose
    FORMATS                         format
    FREQUENCIES                     tabulate
    GET FILE                        use
    GET SAS                         fdause
    GRAPH                           graph
    IF                              generate __ if __
    IGRAPH                          graph
    INCLUDE FILE                    do ___
    LIST                            list
    LOGISTIC REGRESSION             logistic
    LOOP                            forvalues
    MATCH FILES                     merge
    MEANS                           tabulate __, summarize(__)
    MISSING VALUES                  none
    MIXED                           xtmixed
    NOMREG                          mlogit
    PLUM                            ologit
    PROBIT                          probit
    RECODE                          recode
    RECORD TYPE                     no equivalent
    REGRESSION                      regress
    RELIABILITY                     alpha
    RENAME VARIABLES                rename
    SAMPLE                          sample
    SAVE                            save
    SELECT IF                       keep if, drop if
    SORT CASES                      sort
    SPLIT FILE                      by
    SUMMARIZE                       tabulate ___, summarize(___)
    TEMPORARY. SELECT IF (___).     xyzcommand if (___)
    T-TEST                          ttest
    VALUE LABELS                    label define, label values
    VARIABLE LABELS                 label variable
