anne balz, klaus pforr, florian thirolf · 2019-06-10 · stata export for metadata documentation...
Stata export for metadata documentation
Munich, 26.05.2019
Anne Balz, Klaus Pforr, Florian Thirolf
Motivation
• German Microdata Lab (GML) offers metadata for various official microdata online
• Goal: extract metadata from these datasets automatically and import them into our database
• German Microcensus
• European Labour Force Survey
• EU-SILC (European Union Statistics on Income and Living Conditions)
Microdata Information System MISSY
• Online platform ("MISSY-web")
• Documentation of official microdata (European and national)
• Documentation on different levels:
  • study
  • question
  • variable
core functionality
*.dta -> output.*
*.dta -> dta2meta.ado -> meta.dta -> meta2*.ado -> output.*
ado dta2md
*.dta -> dta2meta.ado -> meta.dta -> meta2*.ado -> output.*
the meta-file
All necessary (meta-)information in a table format:
• Variable level
  • Variable name and variable label
  • Summary statistics (min, max, mean, std)
• Value level
  • Value and value label
  • Frequencies and percentages
    • Overall
    • For groups (e.g., countries)
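As a rough illustration of the two levels, the following Stata sketch prints variable-level and value-level information for a single variable. It is not the dta2md code: the shipped example dataset `auto` and the variable `rep78` are stand-ins chosen here, and the output is displayed rather than stored in a meta-file.

```stata
* Sketch only, not the dta2md implementation.
* Assumption: Stata's shipped example data stand in for a microdata file.
sysuse auto, clear
local v rep78

* Variable level: name, label, summary statistics
local vlabel : variable label `v'
quietly summarize `v'
display "`v' (`vlabel'): min=" r(min) " max=" r(max) ///
    " mean=" %6.3f r(mean) " std=" %6.3f r(sd)

* Value level: value, value label (if one is attached), frequency, percentage
quietly count if !missing(`v')
local total = r(N)
local lblname : value label `v'
quietly levelsof `v', local(levels)
foreach l of local levels {
    local vallabel ""
    if "`lblname'" != "" {
        local vallabel : label `lblname' `l'
    }
    quietly count if `v' == `l'
    local pct = 100 * r(N) / `total'
    display "value `l' `vallabel': n=" r(N) " pct=" %5.1f `pct'
}
```

Percentages here are computed over non-missing observations only; whether the meta-file uses the same denominator is not stated on the slide.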
ado dta2md
[Annotated view of the meta-file. Labels: Variable Level; Value Level; User Input (Variable): Group-Variable & Computed; Technical: First Value within Variable]
ado dta2md
dta2md input(filename) output(filename) ///
    freqvarlist(varlist) ///
    [group(varname) ///
    missingdef(string) smissingdef(string) ///
    replace]

dta2md input($path/micro_file.dta) output($path/meta_file.dta) ///
    freqvarlist(var1 var2 var3) ///
    group(country) ///
    missingdef("X<0") ///
    smissingdef(`"X="invalid answer"| X="did not understand""') ///
    replace
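The slides do not say how the missingdef() string is evaluated. One possible reading, sketched below, is that "X" is a placeholder that gets replaced by each variable name in turn. This is an assumption made for illustration, not the documented behaviour of dta2md; the `auto` data and the variables `mpg` and `rep78` are stand-ins.

```stata
* Sketch only: one possible reading of the missingdef() string,
* not the dta2md implementation.
* Assumption: "X" is a placeholder for the current variable name.
sysuse auto, clear
local missingdef "X<0"

foreach v of varlist mpg rep78 {
    * Replace the placeholder X with the variable name, e.g. "mpg<0"
    local cond : subinstr local missingdef "X" "`v'", all
    quietly count if `cond'
    display "`v': " r(N) " observations match `cond'"
}
```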
ado dta2md
Loop over all vars
  If group specified: loop over all groups (within levels of vars)
  If computed: loop over all levels (within all vars)
    If group specified: loop over all groups
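The loop structure above can be sketched in Stata roughly as follows. This is one plausible reading of the slide, not the ado's code: it assumes that "computed" means the variable was listed in freqvarlist(), and the `auto` data and the local macro names are stand-ins.

```stata
* Sketch only: one plausible reading of the loop structure, not the ado code.
sysuse auto, clear
local allvars price mpg rep78
* Assumption: frequencies are computed only for variables in freqvarlist()
local freqvarlist rep78
local group foreign

foreach v of varlist `allvars' {
    * Variable level: overall, then per group if group() was given
    quietly summarize `v'
    display "`v': overall mean = " %9.2f r(mean)
    if "`group'" != "" {
        quietly levelsof `group', local(groups)
        foreach g of local groups {
            quietly summarize `v' if `group' == `g'
            display "    `group' == `g': mean = " %9.2f r(mean)
        }
    }

    * Value level: loop over all levels, then over groups within each level
    local isfreq : list v in freqvarlist
    if `isfreq' {
        quietly levelsof `v', local(levels)
        foreach l of local levels {
            quietly count if `v' == `l'
            display "    `v' == `l': n = " r(N)
            if "`group'" != "" {
                foreach g of local groups {
                    quietly count if `v' == `l' & `group' == `g'
                    display "        `group' == `g': n = " r(N)
                }
            }
        }
    }
}
```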
ado meta2DDI
*.dta -> dta2Meta.ado -> meta.dta -> meta2DDI.ado -> DDI2.5.xml
• Uses the 'file' command
• 'forvalues' to run through all categories
• Variables of the meta-file are used to form hierarchical output
• Example:
  • 'first' (0/1) tags the first category of a variable
  • Used to generate output on variable level
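A minimal sketch of this approach, assuming a toy meta-file with one row per category and a 0/1 column `first`: `file write` emits the XML, `forvalues` walks the rows, and `first` decides when a new variable element is opened. The column names and data are invented for illustration. The element names (`var`, `labl`, `catgry`, `catValu`, `catStat`) follow DDI Codebook, but the result is only a fragment, not a schema-valid DDI 2.5 document, and no XML escaping is done.

```stata
* Sketch only, not the meta2DDI code. Column names of the toy meta-file
* are assumptions; the output is a fragment, not a schema-valid DDI file.
clear
input str8 varname str20 varlabel byte value str12 vallabel int freq byte first
"sex" "Sex of respondent" 1 "male"   480 1
"sex" "Sex of respondent" 2 "female" 520 0
"emp" "Employment status" 0 "no"     300 1
"emp" "Employment status" 1 "yes"    700 0
end

tempfile out
tempname fh
file open `fh' using "`out'", write replace text
file write `fh' "<dataDscr>" _n

forvalues i = 1/`=_N' {
    * first == 1 marks a new variable: close the previous var, open a new one
    if first[`i'] == 1 {
        if `i' > 1 {
            file write `fh' "  </var>" _n
        }
        file write `fh' "  <var name=" (char(34)) (varname[`i']) (char(34)) ">" _n
        file write `fh' "    <labl>" (varlabel[`i']) "</labl>" _n
    }
    * Every row is one category of the current variable
    file write `fh' "    <catgry>" _n
    file write `fh' "      <catValu>" (string(value[`i'])) "</catValu>" _n
    file write `fh' "      <labl>" (vallabel[`i']) "</labl>" _n
    file write `fh' "      <catStat type=" (char(34)) "freq" (char(34)) ">" ///
        (string(freq[`i'])) "</catStat>" _n
    file write `fh' "    </catgry>" _n
}

* Close the last variable and the surrounding element
file write `fh' "  </var>" _n "</dataDscr>" _n
file close `fh'
type "`out'"
```

Writing to a `tempfile` keeps the sketch from leaving files behind; a real export would write to a user-supplied path.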
Use case MISSY
*.dta -> dta2Meta.ado -> meta.dta -> meta2sql.ado -> output.sql
Other elements in the diagram: getUUIDs, generateUUIDs, mapRelations, Database
meta2sql.ado
• The 'file' command is used
• Different frame
• 'forvalues' for each database table
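The same pattern can be sketched for SQL output: one `forvalues` loop per target table, each writing INSERT statements with `file write`. The table names (`variable`, `category`), their columns, and the toy data are hypothetical; the MISSY database schema and the UUID handling are not shown on the slides and are left out here, as is any escaping of quotes in the values.

```stata
* Sketch only, not the meta2sql code. Table and column names are hypothetical.
clear
input str8 varname str20 varlabel byte value str12 vallabel byte first
"sex" "Sex of respondent" 1 "male"   1
"sex" "Sex of respondent" 2 "female" 0
"emp" "Employment status" 0 "no"     1
"emp" "Employment status" 1 "yes"    0
end

tempfile out
tempname fh
file open `fh' using "`out'", write replace text

* Loop for the (hypothetical) variable table: one row per variable,
* identified by first == 1. char(39) is the single quote character.
forvalues i = 1/`=_N' {
    if first[`i'] == 1 {
        file write `fh' "INSERT INTO variable (name, label) VALUES (" ///
            (char(39) + varname[`i'] + char(39)) ", " ///
            (char(39) + varlabel[`i'] + char(39)) ");" _n
    }
}

* Loop for the (hypothetical) category table: one row per category
forvalues i = 1/`=_N' {
    file write `fh' "INSERT INTO category (variable_name, value, label) VALUES (" ///
        (char(39) + varname[`i'] + char(39)) ", " ///
        (string(value[`i'])) ", " ///
        (char(39) + vallabel[`i'] + char(39)) ");" _n
}

file close `fh'
type "`out'"
```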