7/27/2019 Introductory Guide to using Stata
Norris: Introduction to Stata, 2009
STM103 Spring 2009
INTRODUCTION TO STATA
1. DOWNLOADING DATA
   Assignment 1 preparation
   Select variables
2. STARTING A STATA SESSION
   Opening, Saving, and Closing the data file
   Keeping a log record of a Stata Session
   Using Operators
3. WORKING WITH DATA
   To see what your selected variables contain
   To look at the % distribution in categorical variables
   To look at a table combining two categorical variables
   Creating and Changing Values of Variables
   Labeling Variables and Values
   Using Functions
   Deleting Variables and Observations
   list
   Analysis of continuous (scale) variables
   Correlations
   Estimating Linear Models (OLS and 2-Stage Least Squares)
   Estimating Non-Linear Models (Logit and Probit)
   logit
   probit
4. MAKING GRAPHS
   Histogram
   Scatterplot
   Bar graphs
   Printing your graph
5. UTILITIES
   Viewing the Data
   Creating and Submitting a Do-File
6. CONVERTING DATA FILES (EXCEL TO STATA)
This provides a brief introduction to using Stata for the QoG dataset analysis. Stata is available on all of
the computers in the Kennedy School's computer lab. If you have a home computer you may want to
purchase a copy of Stata from the CMO. Stata is available for Windows 98, Windows 2000, Windows ME,
Windows XP, Windows NT, Macintosh, and UNIX operating systems. The Stata User's Guide is also
available from the CMO.
The commands outlined below assume that you are using Stata for Windows. Throughout this text,
anything appearing in Bold font is a Stata command, whereas anything in red italics is a variable name
which you should change for your specific analysis. Menu commands are indicated as, e.g., File | Open, to
indicate that you first go to the File menu and then choose the Open option. The Blue Courier is the
type of output you should generate. As a shortcut, you can also just copy and paste any of the command
lines here directly into your Stata Command window and then run them.
1. DOWNLOADING DATA
1. Go to the QoG website: http://www.qog.pol.gu.se/ and go to Data: the QoG Data.
2. Download the correct version of the dataset. To start work, download the cross-sectional dataset for
Stata and save this somewhere clearly labeled on your memory stick, hard drive or shared server space.
3. Also save the PDF version of the FULL Codebook somewhere safe; it is very long but an invaluable
reference document.
ASSIGNMENT 1 PREPARATION
The aim is to write a professional report assessing and comparing the problems of democratic governance
reform in one world region. Pick your region:
   Latin America and the Caribbean
   Africa
   Asia
   Central and Eastern Europe
   Middle East
   Western Europe
Think about the key problems of democratic governance in the region. From your experience and your
reading, what are the priorities for agencies? Can you rank them? Focus on the most important 2-3 issues
in the first instance. Then look carefully at the shared dataset codebook. Start by selecting 3-4 indicators
which relate to the problems you have decided to focus upon. The shared class dataset provides the
following indicators, along with many others:
1. Freedom House index of political rights and civil liberties
2. Polity IV Project Democracy and Autocracy scales
3. Cheibub and Gandhi Democracy-Autocracy classification
4. Vanhanen Democracy Index
5. World Values Survey / Global Barometers attitudinal surveys
6. Kaufmann/Kraay World Bank Institute Good Governance indicators
7. Transparency International Corruption index
SELECT VARIABLES
You can use the whole dataset without doing anything further, but there are a LOT of variables in the
dataset. To simplify your life and make it less confusing, for this exercise you will find it easier to identify a
subset of key variables from the "What It Is" list which you want to use. You can always go back to select
more variables at a later stage (none are deleted), but first work out which variables to put into your
subset, i.e. what is important as indicators of democratic governance for your region. Note that in Stata it
matters whether variable names are in capitals or not.
Select the first ones listed below and pick about 5-10 to add. Write down details in the list below so that
you have this handy. Look in the full codebook for more details about the construction and meaning of
each.
Name         Brief description                                                          Type
cname        Country name                                                               Nominal
ht_region    Global region: 10 categories                                               Nominal
chga_regime  Cheibub and Gandhi: Type of democratic or autocratic regime                Nominal
fh_status    Freedom House: classification of states into free, semi-free and not free  Ordinal
p_polity     Combined polity score of democracy and autocracy                           Scale -10/+10
fh_cl        Freedom House civil liberties                                              Scale 7-pt
fh_pr        Freedom House political rights                                             Scale 7-pt
wbgi_vae     World Bank: Voice and accountability estimate                              Scale
wbgi_pse     World Bank: Political stability estimate                                   Scale
wbgi_gee     World Bank: Government effectiveness                                       Scale
rsf_pfi      Reporters without Borders: Press Freedom Index                             Scale 100-pts
ti_cpi       Transparency International: Corruption Perception Index                    Scale 100-pts
2. STARTING A STATA SESSION
Start Stata by locating the link on the desktop (Stata SE 8) and clicking it.
Stata is a powerful tool for conducting statistical analyses. Here is what a typical session in Stata looks like.
The windows can be moved about and resized to suit your preferences. If you do not see any of these on
your version, go to Window and add these until they look roughly like the above.
a. Here the Results window lists the outcome.
b. The Variables window, on the left, lists the names of all the variables included in the shared dataset.
c. You can enter commands in two ways. To start learning the program, you can use the drop-down
menus, similar to those common in Microsoft programs. This is useful for beginners. Once you become
more familiar with the program, you will want to type in commands directly, to save time, using the
Command window at the bottom center of the screen. A command tells Stata what to do, e.g. to open a
file, to run a regression, to calculate a mean of a variable, etc.
d. The Review window shows a list of all the commands you have already run. (Here, it shows that I have
opened a data file.) If you click on a previously run command in the Review window, it will appear in the
Command window and you can edit it or run it again.
OPENING, SAVING, AND CLOSING THE DATA FILE
You will then need to open the data file you have saved. You may need to boost the memory allocated
first (this applies to Stata 8 through 11; later versions manage memory automatically):
set memory 80000
File | Open
File | Save As
It is also always useful to have a backup copy of your data. This way, no matter what you do to change or
recode the variables, you always have a copy of the older version. It is also useful practice to save your
data file at the end of each session under a new sequential version (e.g. STM103_2, STM103_3), so that you
have the old and newest file in case you need to revert.
File | Exit
Use this whenever you finish, to exit Stata.
KEEPING A LOG RECORD OF A STATA SESSION
File | Log
To save a file (log) of your results, you will need to create a log file. Stata gives you two choices of file
formats for your log file, .log (text file) and .smcl (formatted log file). The .smcl files will look nicer when
printed. You should never ever cut and paste your Stata output directly into your report; always simplify,
clean and transfer in a professional and clean format.
To start a log file interactively, choose File | Log | Begin, select the directory you want to save the log file
in, and give it a name (such as job1). Alternatively, you can click on the fourth icon from the left on the icon
bar, which looks like a scroll.
File | Log | Close
USING OPERATORS
Stata uses the following arithmetic operators:
+ add
- subtract
* multiply
/ divide
^ raise to the power
For relations, Stata uses:
== equal
~= not equal
> greater than
< less than
>= greater than or equal to
<= less than or equal to
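A quick way to get a feel for these operators is the display command, which evaluates an expression and
prints the result. The lines below are only a minimal sketch with made-up numbers; note that a relational
expression evaluates to 1 when it is true and 0 when it is false.

```stata
* Arithmetic: multiplication is carried out before addition
display 2 + 3 * 4
* prints 14
display 2^3
* prints 8
display 7 / 2
* prints 3.5

* Relational expressions return 1 (true) or 0 (false)
display 3 > 2
* prints 1
display 3 ~= 3
* prints 0
```

The same relational expressions are what you put after if in commands such as summarize, list or drop.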
3. WORKING WITH DATA
TO SEE WHAT YOUR SELECTED VARIABLES CONTAIN
First let's summarize your selected variables. Type:
sum ht_region p_polity fh_cl fh_pr chga_regime
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
   ht_region |       192    4.526042    2.644633          1         10
    p_polity |       160           1    14.45487        -77         10
       fh_cl |       192    3.385417    1.821166          1          7
       fh_pr |       192    3.364583    2.156785          1          7
 chga_regime |       189    .3968254    .4905386          0          1
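This is very useful for looking at your selected variables to see what they are like, whether nominal,
ordinal or scale (continuous). E.g. chga_regime is a binary (2-category) variable. Note also that p_polity
has a minimum of -77, well outside the -10 to +10 range of the scale, which suggests a special code rather
than a real score; check the codebook before analysing it. Try this for a couple of your variables and add
notes to your list of selected variables.
summarize can be abbreviated to sum. You can also look at more detail, e.g.
sum ht_region, detail
You can do the same just for one selected region:
sum p_polity if ht_region==1, detail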
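If you want to try these commands without touching the class dataset, the sketch below types in a tiny
made-up dataset first. The variable names region and polity and all the values are assumptions for
illustration only; they are not taken from the QoG file.

```stata
* Hypothetical mini-dataset standing in for the QoG file
clear
input region polity
1 8
1 -3
2 10
2 6
2 .
end

* All observations; the missing value (.) is left out of the Obs count
summarize polity

* Only region 2, with percentiles, skewness and kurtosis
summarize polity if region == 2, detail

* summarize leaves its results behind in r(); this is the mean for region 2
display r(mean)
* prints 8
```

Comparing the Obs column with the number of rows in the data is a quick check for missing values.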
TO LOOK AT THE % DISTRIBUTION IN CATEGORICAL VARIABLES
For categorical variables, try the following, which generates some simple frequencies, i.e. look at the
number of countries (Freq.) and the Percent column.
tab1 ht_region
The Region of the Country | Freq. Percent Cum.
----------------------------------------+-----------------------------------
1. Eastern Europe and post Soviet Union | 27 14.06 14.06
2. Latin America | 20 10.42 24.48
3. North Africa & the Middle East | 20 10.42 34.90
4. Sub-Saharan Africa | 48 25.00 59.90
5. Western Europe and North America | 27 14.06 73.96
6. East Asia | 6 3.13 77.08
7. South-East Asia | 11 5.73 82.81
8. South Asia | 8 4.17 86.98
9. The Pacific | 12 6.25 93.23
10. The Caribbean | 13 6.77 100.00
----------------------------------------+-----------------------------------
Total | 192 100.00
TO LOOK AT A TABLE COMBINING TWO CATEGORICAL VARIABLES
tab2 ht_region chga_regime, col
-> tabulation of ht_region by chga_regime
    The Region of the |    Type of Regime
              Country | 0. Democr  1. Dictat |     Total
----------------------+----------------------+----------
1. Eastern Europe and |        17         10 |        27
                      |     14.91      13.33 |     14.29
----------------------+----------------------+----------
     2. Latin America |        17          3 |        20
                      |     14.91       4.00 |     10.58
----------------------+----------------------+----------
3. North Africa & the |         3         17 |        20
                      |      2.63      22.67 |     10.58
----------------------+----------------------+----------
4. Sub-Saharan Africa |        20         28 |        48
                      |     17.54      37.33 |     25.40
----------------------+----------------------+----------
5. Western Europe and |        26          0 |        26
                      |     22.81       0.00 |     13.76
----------------------+----------------------+----------
         6. East Asia |         4          2 |         6
                      |      3.51       2.67 |      3.17
----------------------+----------------------+----------
   7. South-East Asia |         3          7 |        10
                      |      2.63       9.33 |      5.29
----------------------+----------------------+----------
        8. South Asia |         3          5 |         8
                      |      2.63       6.67 |      4.23
----------------------+----------------------+----------
       9. The Pacific |         8          3 |        11
                      |      7.02       4.00 |      5.82
----------------------+----------------------+----------
    10. The Caribbean |        13          0 |        13
                      |     11.40       0.00 |      6.88
----------------------+----------------------+----------
                Total |       114         75 |       189
                      |    100.00     100.00 |    100.00
CREATING AND CHANGING VALUES OF VARIABLES
Recode and generate
This helps to categorize the values of an existing scale variable. For example, take the variable called
p_polity, which contains the 20-point Polity IV scale for each nation in your study. Suppose you want the
scale to be collapsed into two types of regime, democracies and autocracies. To do this, generate a new
variable called regime and recode it as follows:
gen regime = p_polity
recode regime -10/-1=0 0/10=1
sum regime
When recoding your data, be careful not to overwrite your original variable. You can check what you have
done with the summarize or tab1 commands. Or you may want to create a new variable, in this case
fh_scale, defined as fh_cl plus fh_pr:
generate fh_scale = (fh_cl + fh_pr)
You can use any formula to standardize the scale, e.g.
generate fh_scale100 = 100 - (fh_cl + fh_pr)*7.1
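The recode step can be tried out on a few typed-in values. In the sketch below the six values of p_polity
are made up, and treating -77 as a special code that should become missing is an assumption; check the
codebook for what such values mean before doing this with the real data.

```stata
* Hypothetical values of p_polity, including a special code (-77) and a missing value
clear
input p_polity
-10
-1
0
10
-77
.
end

* Work on a copy so the original variable is never overwritten
generate regime = p_polity

* -10 to -1 becomes 0, 0 to 10 becomes 1, anything else becomes missing
recode regime (-10/-1 = 0) (0/10 = 1) (else = .)

* The missing option shows how many observations ended up without a regime code
tabulate regime, missing
```

Without the (else = .) rule, the -77 would stay in the new variable and distort any mean you calculate
from it.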
To generate a new binary variable for your region, e.g. Sub-Saharan Africa (coded 4 in ht_region):
generate africa = .
replace africa = 1 if (ht_region==4)
replace africa = 0 if (ht_region~=4)
You can abbreviate the generate command to gen. Note that Stata will tell you if any missing values were
generated by attempting to perform a calculation with missing information. For example, if one of the
observations was missing information on fh_cl, Stata would set fh_scale equal to missing for this
observation. (See further notes about missing data later in this section.)
Once a variable with a particular name has been generated, you can't generate another with the same
name. Instead, you must replace the old one.
LABELING VARIABLES AND VALUES
Labeling variables and values helps you keep track of how you coded your variables and what they
represent. It takes just a couple of seconds to add labels, and it can save you lots of time later when you
can't remember what a code of 4 means in your GDP category variable, for example, or how the
variable demo1 differs from demo2.
To attach a label to a variable and its values:
label variable africa "World region"
label define africalabel 0 "Rest of the world" 1 "Sub-Saharan Africa"
label values africa africalabel
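Creating the binary variable and labeling it can be done in a few lines. The sketch below uses five made-up
values of ht_region, and the one-line generate is an alternative shorthand (an assumption about style, not
the method used above); the if !missing() part keeps observations with no region code as missing rather
than 0.

```stata
* Hypothetical region codes; the last observation has no region recorded
clear
input ht_region
4
1
4
10
.
end

* (ht_region == 4) is 1 when true and 0 when false
generate africa = (ht_region == 4) if !missing(ht_region)

* Describe the variable, define the value labels, then attach them
label variable africa "World region"
label define africalabel 0 "Rest of the world" 1 "Sub-Saharan Africa"
label values africa africalabel

* The table now shows the labels instead of 0 and 1
tabulate africa, missing
```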
USING FUNCTIONS
Functions are special calculations used with other commands, such as generate or replace. Stata has the
capability to calculate many functions. Here is an example of one of the most commonly used ones.
ln(x)
Calculates the natural log of x, where x may be a constant or a variable such as mad_gdppc.
In a command, you might use the log function like this:
gen logGDP2006 = ln(mad_gdppc)
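One thing to watch with ln() is that the log of zero is not defined, so Stata returns a missing value for such
observations. The sketch below shows this with three made-up GDP values; the variable name logGDP is
hypothetical.

```stata
* Hypothetical GDP per capita values, including a zero
clear
input mad_gdppc
1000
20000
0
end

* Stata reports "(1 missing value generated)" because ln(0) is undefined
generate logGDP = ln(mad_gdppc)
list

* Count how many observations lost their value
count if missing(logGDP)
```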
DELETING VARIABLES AND OBSERVATIONS
drop
The drop command can delete either variables or observations. Deleting a variable removes an entire
variable (column) from the data set, whereas deleting an observation removes an entire observation
(row) from the data set. Be careful when doing this: the variables and observations are permanently
deleted once you save the data file! It is far better to retain the whole dataset but to filter for the selected
region.
To eliminate a variable, in this case mad_gdppc:
drop mad_gdppc
To eliminate observations, in this case Eastern Europe as a region:
drop if ht_region==1
Alternatively you could just keep a subset of data:
keep if africa==1
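If you do want to work on a subset for a while, one option is to wrap the keep or drop in preserve and
restore, which puts the full dataset back afterwards. This is a sketch on three made-up observations, not
part of the original instructions.

```stata
* Hypothetical data: one Eastern European and two Sub-Saharan African countries
clear
input ht_region p_polity
1 5
4 -2
4 7
end

* Take a temporary copy of the data in memory
preserve
keep if ht_region == 4
count
* 2 observations remain

* Bring back the full dataset
restore
count
* 3 observations again
```

For a single command it is usually simpler still to add an if condition, as in the summarize and regress
examples in this guide.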
LIST
Prints all variables and observations to the screen. You'll probably never want to do this since your data
sets will be too large.
list
You can print a limited set of variables:
list fh_status
You could also print a limited set of observations according to another criterion, in this case ht_region
being equal to 3 (North Africa & the Middle East):
list cname fh_status fh_cl fh_pr if ht_region==3
codebook
Provides even more information (mean, standard deviation, range, percentiles, labels, number of missing
values, etc.) about a variable:
codebook fh_status fh_pr fh_cl
ANALYSIS OF CONTINUOUS (SCALE) VARIABLES
EXAMININGMEANSBYCATEGORY
In this case the category is ht_region and the mean is calculated for p_polity. You can do this with various
commands, e.g.
table ht_region, contents(mean p_polity)
tabstat p_polity fh_status fh_cl fh_pr, by(ht_region) columns(variables)
mean p_polity, over(ht_region)
CORRELATIONS
corr fh_cl fh_pr p_polity
With stars marking all coefficients significant at the .05 level or better:
pwcorr fh_cl fh_pr p_polity, star(5)
ESTIMATING LINEAR MODELS (OLS AND 2-STAGE LEAST SQUARES)
regress
Calculates an ordinary least squares (OLS) regression, in this case for a regression of the dependent
variable p_polity on the independent variables mad_gdppc (GDP) and al_ethnic. Note that the dependent
variable is the first variable listed.
regress p_polity mad_gdppc al_ethnic
If you wish to only include observations with africa equal to 1 in the regression:
regress p_polity mad_gdppc al_ethnic if africa==1
To run a regression with robust standard errors:
regress p_polity mad_gdppc al_ethnic, robust
To run two-stage least squares where GDP (mad_gdppc) is endogenous and z1 is an exogenous
instrumental variable, use ivreg (replaced by ivregress 2sls from Stata 10 onwards):
ivreg p_polity al_ethnic (mad_gdppc = z1)
Note: If you run a regression containing more than 40 variables, Stata will return an error code saying:
matsize too small
To overcome this problem, reset the maximum number of variables Stata will estimate using the set
matsize command; the number should be greater than or equal to the total number of variables in the
regression.
set matsize 150
predict
Calculates the predicted value for each observation using the coefficients from the last regression
estimated and saves these as a variable called yhat:
predict yhat
To calculate the residual for each observation using the most recently estimated regression model and
save these as a variable called ehat:
predict ehat, residual
test
Calculates an F-test of a joint hypothesis concerning the coefficients in the most recently
estimated linear regression model, in this case with the null hypothesis H0: al_ethnic = mad_gdppc = 0:
test al_ethnic mad_gdppc
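The regress, predict and test steps fit together as in the sketch below. It uses the auto example dataset
that ships with Stata (loaded with sysuse) instead of the QoG file, so the variables price, mpg and weight
are stand-ins chosen for illustration only.

```stata
* Example dataset shipped with Stata; stands in for the QoG data
sysuse auto, clear

* OLS: the dependent variable (price) comes first
regress price mpg weight

* Predicted values and residuals from the regression just estimated
predict yhat
predict ehat, residual

* Residual = actual minus predicted, so this difference should be zero
* apart from rounding
generate check = price - yhat - ehat
summarize check

* Joint F-test that the coefficients on mpg and weight are both zero
test mpg weight
display r(p)
```

predict and test always refer to the most recent model, so run them before estimating anything else.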
ESTIMATING NON-LINEAR MODELS (LOGIT AND PROBIT)
LOGIT
Estimates a model suitable for a dichotomous dependent variable. In this case, the variable
chga_regime equals 0 for democracy and 1 for dictatorship, as the value labels in the tab2 output show.
logit chga_regime al_ethnic mad_gdppc
If you wish to find a predicted probability for each observation based on the most recent model run and
save these as a variable called phat:
predict phat
PROBIT
Estimates a model suitable for a dichotomous dependent variable. In this case, the variable
chga_regime equals 0 for democracy and 1 for dictatorship. If you wish to estimate the probability of
chga_regime conditional upon al_ethnic and mad_gdppc:
probit chga_regime al_ethnic mad_gdppc
If you wish to find a predicted probability for each observation based on the most recent model run and
save these as a variable called phat (if phat already exists from the logit model, drop it first or choose a
different name):
predict phat
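To see how the two models compare, you can estimate both on the same data and save the predicted
probabilities under different names. The sketch below again uses the auto example dataset shipped with
Stata rather than the QoG file; foreign, mpg and weight are stand-in variables and phat_probit is a
hypothetical name.

```stata
* Example dataset shipped with Stata; foreign is a 0/1 variable
sysuse auto, clear

* Logit model and its predicted probabilities
logit foreign mpg weight
predict phat

* Probit model; a different name is needed because phat already exists
probit foreign mpg weight
predict phat_probit

* Predicted probabilities always lie between 0 and 1
summarize phat phat_probit

* The two sets of predictions are usually very close
correlate phat phat_probit
```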
4. MAKING GRAPHS
Stata 8 has a Graphics menu that lets you create graphs from a Windows menu, as an alternative to using
command language. The Graphics menu is a particularly user-friendly way of creating graphs, since graphs
contain so many options for labels, axes, etc. The Graphics menu is fairly intuitive to use: simply pull
down the menu and choose the type of graph you want. The options are self-explanatory. For those
interested in using command language to create graphs, some of the basics are covered below, and you
can rely on the graphics manual for more complicated creations. Also, SPSS has better and far more
flexible graphics. You may want to consider this program for this function alone. You can also cut and
paste the results of tables into Excel for flexible formats and control over elements.
HISTOGRAM
This is the default when only one variable is specified:
histogram p_polity
You can also draw a normal density over the histogram:
histogram p_polity, normal
To have Stata graph only certain observations, in this case those for which africa is 1:
histogram p_polity if africa==1, bin(30)
To add a title:
histogram p_polity, title("Polity IV Rating of Liberal Democracy in Africa")
SCATTERPLOT
This is the default if two variables are specified:
scatter wbgi_gee wbgi_pse
Conditions, axes, titles, labeling and reference lines can be specified as above. For example, with labels:
scatter wbgi_gee wbgi_pse, t1("Effectiveness by stability") mlabel(cname)
Or without labels:
scatter wbgi_gee wbgi_pse, t1("Effectiveness by stability")
After performing a regression, you may want to graph predicted and actual values of the dependent
variable against the independent variable. Here yhat holds the predicted values saved with predict after
regressing wbgi_gee on wbgi_pse:
scatter yhat wbgi_gee wbgi_pse
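A common alternative is to overlay the actual values as points and the predicted values as a line using
twoway. The sketch below uses the auto example dataset shipped with Stata, so price and weight are
stand-ins for your own dependent and independent variables.

```stata
* Example dataset shipped with Stata; stands in for the QoG data
sysuse auto, clear

* Regress the dependent variable on one independent variable
regress price weight
predict yhat

* Points for the actual values, a line for the predicted values;
* sort draws the line from left to right
twoway (scatter price weight) (line yhat weight, sort), title("Actual and predicted price by weight")
```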
BAR GRAPHS
This is produced with a graph command followed by one variable. A second variable is used to define
groups. To produce a graph with bar heights representing the mean for each group:
sort wbgi_gee
graph bar (mean) wbgi_gee, over(ht_region)
Conditions, y-axis options, most titles, and horizontal reference lines can be specified as described above
with regard to histogram:
sort wbgi_gee
graph bar (mean) wbgi_gee, over(africa) t1("Political Stability in Africa") t2("Title 2") l1("Mean Stability") l2("Another Title") yline(33000)
PRINTING YOUR GRAPH
Stata allows you to print (File | Print Graph) and save (File | Save Graph) your graphs. The easiest way to
incorporate your graph into a Word document is to copy the graph to the clipboard using Edit | Copy
Graph and then paste it into your document. Remember that all graphs should have a clear headline, to
illustrate your report, with a full note below specifying the source of the data and any notes explaining
variables. All graphs should be self-contained, without the reader having to look further in your report.
5. UTILITIES
VIEWING THE DATA
Once you have opened a data set, you may wish to look at the variables and observations in spreadsheet
format. Stata provides two ways to do this, browse and edit. The browse command lets you see the
data but not make changes, whereas the edit command allows you both to browse and to make changes.
It is probably best to use browse unless you actually intend to make changes to your data manually;
otherwise you may accidentally change something and ruin your data.
To browse, enter browse into the Command window or select the Browse icon (third from the right, a
spreadsheet with a magnifying glass on it). To edit, enter edit into the Command window or select the
Edit icon (fourth from the right, a spreadsheet with no magnifying glass).
CREATING AND SUBMITTING A DO-FILE
Although Stata can be run interactively by just typing one command at a time, Stata commands can also
be submitted in batches by using a do-file. A do-file is simply a text file which contains a series of Stata
commands. You enter the Stata commands in the same order as you would enter them interactively, and
Stata then runs these commands automatically instead of your having to type them in line by line.
For your problem sets, it is strongly recommended that you use do-files. Some of the problem sets will
require many Stata commands, and it is inevitable that you will need to make changes and run these
series of commands a number of times. When you have all of your commands in a single file, it is much
easier to go back to that file and make the necessary changes than to have to retype every command.
Creating a Do-file
To start creating a do-file, click on the Do-file editor button (fifth from the right, looks like an envelope
with a pencil on it), choose the Do-file editor option under the Window menu, or type doedit in the
Command window. Note that since a do-file is a written list of commands as entered in the Command
window, you cannot use the Stata menus within a do-file. Instead you need to use the typed (Command
window) commands.
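A do-file for a short session might look like the sketch below. The file name session1.do and the log name
session1.log are hypothetical, and the auto example dataset shipped with Stata stands in for the QoG file;
replace the sysuse line with a use command pointing at your own saved data.

```stata
* session1.do (hypothetical file name)

* Close any log left open by an earlier run; capture hides the error if none is open
capture log close

* Keep a text log of everything this do-file produces
log using session1.log, text replace

* Load the data (example dataset shipped with Stata)
sysuse auto, clear

* The analysis commands, in the order you would type them interactively
summarize price mpg
regress price mpg

* Finish the log
log close
```

You would run the whole file by typing do session1.do in the Command window, or by using the run
button in the Do-file editor.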
6. CONVERTING DATA FILES (EXCEL TO STATA)
The easiest way to convert data files is to use the software program Stat/Transfer. This program is on the
lab computers and allows you to convert your data to or from a variety of different file formats (Stata, SAS
Transport, Excel, SPSS, Quattro Pro, FoxPro, etc.).
To convert a file from Excel to Stata:
a) Click on the application Stat/Transfer in the Data Analysis folder.
b) Select Excel Worksheet for Input File Type.
c) Use Browse to identify the Excel file you want to convert from. (If the first row of the
worksheet contains the variable names, the program will use these as the variable names.)
d) Select Stata Version 8 as the Output File Type. (Since Stata 8.0 is a recent release, it is
possible that the version of Stat/Transfer you're using will not have Stata Version 8 as an option. If
this is the case, save it as a version 7 file; you should still be able to open the file in version 8.)
e) Type in the path and name of the file you wish to create.
f) Begin the conversion by clicking on Begin Transfer.
Stata also allows you to read in binary and ASCII files directly. However, in most cases it is easier
to first convert your data to a spreadsheet and then convert it to Stata using Stat/Transfer.
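If Stat/Transfer is not to hand, a spreadsheet saved as a comma-separated (.csv) text file can be read
directly with insheet. The sketch below first writes a small .csv file from the auto example dataset so that
there is something to read back; the file name demo.csv is hypothetical. Newer versions of Stata also
offer import delimited and import excel for the same purpose.

```stata
* Create a small comma-separated file to read back (example data shipped with Stata)
sysuse auto, clear
keep make price mpg
outsheet using demo.csv, comma replace

* Read the text file; variable names are taken from the first row
insheet using demo.csv, comma clear

* Check what arrived
describe
count
* 74 observations
```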
SUMMARY OF SPSS AND STATA COMMANDS
SPSS Command                  Stata Command(s)
ADD FILES                     append
AGGREGATE                     collapse
ANOVA                         anova
AUTORECODE                    destring, encode
CASESTOVARS                   reshape wide
COMMENT                       *, /* */
COMPUTE                       generate, replace, egen
CORRELATIONS                  correlate, pwcorr
CROSSTABS                     tabulate, tab2
DATA LIST                     infile, infix, insheet
DELETE VARIABLES              keep, drop
DESCRIPTIVES                  summarize
DISPLAY                       describe
DOCUMENT                      notes
DO IF                         xyzcommand if
DO REPEAT                     foreach
ECHO                          display
ERASE                         erase
EXAMINE                       tabulate x, summarize(y)
EXECUTE                       no equivalent
EXPORT                        no equivalent
FACTOR                        factor
FILE LABEL                    label data
FILTER                        xyzcommand if (___)
FLIP                          xpose
FORMATS                       format
FREQUENCIES                   tabulate
GET FILE                      use
GET SAS                       fdause
GRAPH                         graph
IF                            generate __ if __
IGRAPH                        graph
INCLUDE FILE                  do ___
LIST                          list
LOGISTIC REGRESSION           logistic
LOOP                          forvalues
MATCH FILES                   merge
MEANS                         tabulate __, summarize(__)
MISSING VALUES                none
MIXED                         xtmixed
NOMREG                        mlogit
PLUM                          ologit
PROBIT                        probit
RECODE                        recode
RECORD TYPE                   no equivalent
REGRESSION                    regress
RELIABILITY                   alpha
RENAME VARIABLES              rename
SAMPLE                        sample
SAVE                          save
SELECT IF                     keep if, drop if
SORT CASES                    sort
SPLIT FILE                    by
SUMMARIZE                     tabulate ___, summarize(___)
TEMPORARY. SELECT IF (___).   xyzcommand if (___)
T-TEST                        ttest
VALUE LABELS                  label define, label values
VARIABLE LABELS               label variable
Notes: