Objective: building a longitudinal micro-database on Italian formal educational paths and qualifications, using administrative microdata.
Useful to: update the Educational Attainment of the Population and support the production of socio-economic statistics.
From 2011 onwards, the BIT is fed with information on:
• educational attainment in the reference year (ISCED 2011, first digit);
• the course of study attended (type of course, school/university);
• the qualifications acquired from 2011 onwards (attainment date, school/university, type of qualification coded in detail).
E.g. extraction of individual_code 52881082 from Table 1 and Table 2
Phase 1:
• Accurate analysis of the available administrative sources through close cooperation with the data holders (mainly the Ministry for Education, University and Research):
  - to evaluate data availability and usability;
  - to acquire fundamental information on data and metadata;
  - to define actions to improve data quality and data availability from data holders.
• Acquisition of statistically relevant and usable administrative datasets.
• Loading, storage and integration of the administrative microdata in the Integrated System of Microdata (SIM):
  - to standardize and streamline processes;
  - to comply with legislation on data privacy.
• Contextual assignment to each statistical unit (individuals, economic units, places) of an identification code (SIM-ID code), unique across administrative datasets and across time.
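The assignment of a unit code that stays the same across datasets and over time can be illustrated with a minimal sketch. It is not the actual SIM procedure: the `IdRegistry` class, the use of a tax-code-like string as source key, the sequential integer codes and the sample keys are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class IdRegistry:
    """Hypothetical registry mapping a source identifier to a stable internal code."""
    _ids: dict = field(default_factory=dict)
    _next: count = field(default_factory=lambda: count(1))

    def assign(self, source_key: str) -> int:
        # Normalise the key so that formatting differences between sources
        # do not produce two codes for the same unit.
        key = source_key.strip().upper()
        if key not in self._ids:
            self._ids[key] = next(self._next)
        return self._ids[key]


registry = IdRegistry()

# Invented keys: the first pupil reappears one year later in a university dataset.
pupils_2011 = ["rssmra95a01h501x", "BNCLCU94B02F205Y"]
university_2012 = ["RSSMRA95A01H501X "]

ids_2011 = [registry.assign(k) for k in pupils_2011]
ids_2012 = [registry.assign(k) for k in university_2012]
```

In this sketch the same person receives the same code in both datasets, which is the property that makes longitudinal integration possible.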
Phase 2:
A specific database on Education and Qualifications named BIT (‘Base integrata su Istruzione e Titoli di studio’) integrates all relevant educational
datasets. These administrative datasets record individual enrolments and qualifications and provide microdata from Primary education to PhD courses.
• STRUCTURE: two tables.
  Table 1: partitioned by year, contains the individual status in education in each year.
  Table 2: contains all information recorded in the administrative datasets about the qualifications acquired.
• POPULATION: the microdata refer to people who are in a formal educational path in Italy, from 2011 onwards.
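The two-table structure can be pictured with a small in-memory sketch. The storage layout (nested dictionaries), the lower-case field names and the `extract_individual` helper are assumptions for illustration only; the sample values are taken from the example individual 52881082.

```python
from collections import defaultdict

# Table 1: one partition per year, each holding the status of every individual.
table1 = {
    2011: {52881082: {"educational_attainment": "07",
                      "status_event": "Upper secondary school - Frequency"}},
    2012: {52881082: {"educational_attainment": "07",
                      "status_event": "Bachelor's - Admission"}},
    2014: {52881082: {"educational_attainment": "13",
                      "status_event": "Bachelor's - Annual enrollment"}},
}

# Table 2: all qualifications acquired, grouped by individual code.
table2 = defaultdict(list)
table2[52881082].append({"edu_attainment": "07", "qualification_date": "01/08/2011"})
table2[52881082].append({"edu_attainment": "13", "qualification_date": "25/09/2014"})


def extract_individual(code):
    """Return the yearly statuses and the qualifications of one individual."""
    statuses = [
        (year, rows[code]) for year, rows in sorted(table1.items()) if code in rows
    ]
    return statuses, list(table2.get(code, []))


statuses, qualifications = extract_individual(52881082)
```

Reading Table 1 year by year gives the educational path, while Table 2 gives the qualifications obtained along that path.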
The acquisition of administrative data for the construction of the BIT is a complex process that evolves over time, making it possible to improve data quality, especially data completeness. The presence of the SIM codes for integration and the overall good quality already make it possible to support statistical processes and to develop longitudinal studies. The first results are promising in view of the construction of the Education Register.
INPUT DATA QUALITY
• Availability is improving over time
• Punctuality to be improved: BIT timeliness is 16 months
• No significant number of missing values for relevant variables in each dataset
• Good integrability: no relevant missing values for identification variables in each dataset
BIT DATA QUALITY - Coverage
[Bar chart: Enrolled in 2013-14, benchmark macro data vs. BIT, by level of education; horizontal axis from 0 to 3.000.000. Levels, top to bottom: PhD; Master's degree; A.F.A.M. second level; Bachelor's degree; A.F.A.M. first level; Short degree (o.p.); Academic Diploma (o.p.); ITS; IFTS; Upper secondary edu.; IFP; Lower secondary edu.; Primary school.]
Algorithms are implemented to derive from the administrative data the statistical variable Educational Attainment, consistent with the first digit of ISCED 2011.
The level of education, as an ordinal variable, is used both to recover missing data and to check consistency over time.
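A minimal sketch of how an ordinal level of education can serve both purposes is given below. It is an illustration, not the algorithm implemented in the BIT: it assumes that levels can be compared as integers, that a missing year can be filled with the highest level already observed, and that a level lower than an earlier one signals an inconsistency. The function name `check_and_fill` is hypothetical.

```python
def check_and_fill(attainment_by_year):
    """Fill missing levels and flag decreases.

    attainment_by_year maps a year to an ordinal level (int) or None if missing.
    Returns (filled, inconsistent_years).
    """
    filled = {}
    inconsistent = []
    highest = None
    for year in sorted(attainment_by_year):
        level = attainment_by_year[year]
        if level is None:
            # Recover a missing value from the highest level observed so far.
            level = highest
        elif highest is not None and level < highest:
            # Attainment should never decrease over time: flag for review.
            inconsistent.append(year)
        filled[year] = level
        if level is not None and (highest is None or level > highest):
            highest = level
    return filled, inconsistent


# Levels 7 and 13 mirror the codes '07' and '13' of the example individual;
# the missing year and the final drop are invented to exercise both branches.
history = {2011: 7, 2012: None, 2013: 7, 2014: 13, 2015: 7}
filled, inconsistent = check_and_fill(history)
```

Here the missing 2012 value is recovered as 7, and 2015 is flagged because the level falls below the one recorded in 2014.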
The Census Classification has been implemented in the BIT. University qualifications have been coded at the most detailed level by an automated record linkage procedure and a residual clerical review.
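The split between automated coding and a clerical residual can be sketched as follows. Exact matching on normalised descriptions is used here only as a simple stand-in for the actual record linkage procedure, which is not described in detail; the function names and the misspelled sample title are assumptions, while the code 71002001 for "Chemistry" comes from the example record.

```python
import re


def normalise(text):
    """Lower-case and collapse punctuation and whitespace so equal titles compare equal."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def code_qualifications(descriptions, classification):
    """Split descriptions into automatically coded ones and a clerical-review residual."""
    lookup = {normalise(label): code for label, code in classification.items()}
    coded, residual = {}, []
    for description in descriptions:
        code = lookup.get(normalise(description))
        if code is None:
            # No automatic match: leave the case to clerical review.
            residual.append(description)
        else:
            coded[description] = code
    return coded, residual


classification = {"Chemistry": "71002001"}
coded, residual = code_qualifications(["CHEMISTRY ", "Chemestry"], classification)
share_coded = len(coded) / (len(coded) + len(residual))
```

The share of automatically coded cases is the kind of indicator that can be monitored to judge how much clerical work remains.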
Sources | Year of availability
Results for Primary and Secondary School Qualifications | 2013
National Register of University Students (ANS) | 2013
National Register of Pupils | 2014
Bolzano Register of Pupils | 2015
PhD degrees and PhD enrolled students (I° year) | 2016
Enrolled and Diplomas of Higher Technical education (ITS) | 2016
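Enrolled in 2013-14 by level of education
Level of education | Benchmark macro data | BIT
Primary school | 2.799.553 | 2.803.761
Lower secondary edu. | 1.743.587 | 1.729.879
IFP | 328.174 | 0
Upper secondary edu. | 2.647.057 | 2.670.734
IFTS | 2.496 | 0
ITS | 4.163 | 3.985
Academic Diploma (o.p.) | 5.190 | 0
Short degree (o.p.) | 78 | 155
A.F.A.M. first level | 38.784 | 0
Bachelor's degree | 1.022.273 | 1.109.480
A.F.A.M. second level | 12.921 | 0
Master's degree | 651.942 | 674.679
PhD | 33.507 | 11.275
Tot. | 9.289.625 | 9.003.948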
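The comparison with the benchmark can be turned into a coverage rate. The sketch below is illustrative: the helper names `parse_count` and `coverage_rate` are hypothetical, and the rate is simply defined here as the BIT count divided by the benchmark count, applied to the totals reported above.

```python
def parse_count(text):
    """Parse a count written with dots as thousands separators, e.g. '9.289.625'."""
    return int(text.replace(".", ""))


def coverage_rate(benchmark, bit):
    """Share of the benchmark count found in the BIT (it can exceed 1)."""
    return bit / benchmark


total_benchmark = parse_count("9.289.625")
total_bit = parse_count("9.003.948")
total_rate = coverage_rate(total_benchmark, total_bit)

print(f"Overall coverage: {total_rate:.1%}")  # prints "Overall coverage: 96.9%"
```

A level that has a benchmark count but no records in the BIT, such as the pair 328.174 and 0, yields a rate of zero under the same definition.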
Table 1
YEAR | INDIVIDUAL_CODE | EDUCATIONAL_ATTAINMENT | QUALIFICATION_VALIDITY_DATE | STATUS_CODE | INSTITUTE_CODE | COURSE_YEAR | CONSISTENCY | COURSE_CODE | CLASS_CODE | STATUS_SOURCES | STATUS_EVENT_DESCRIPTION | …
2011 | 52881082 | 07 | 01/08/2011 | 3 | PDPS01301A | 5 | 0 | | | Pupils_Register2010/2011 | Upper secondary school - Frequency | …
2012 | 52881082 | 07 | 01/01/2012 | MT | 19 | 1 | 0 | 134462 | 2027 | Univ_Enrollements2011/2012 | Bachelor’s - Admission | …
2013 | 52881082 | 07 | 01/01/2013 | MT | 19 | 2 | 0 | 134462 | 2027 | Univ_Enrollements2012/2013 | Bachelor’s - Annual enrollment | …
2014 | 52881082 | 13 | 25/09/2014 | MT | 19 | 3 | 0 | 134462 | 2027 | Univ_Enrollements2013/2014 | Bachelor’s - Annual enrollment | …
2015 | 52881082 | 13 | 01/01/2015 | MS | 19 | 1 | 0 | 155523 | 3072 | Univ_Enrollements2013/2015 | Master's - Admission
Table 2 (fields listed by row, one column per qualification record)
Field | Record 1 | Record 2
INDIVIDUAL_CODE | 52881082 | 52881082
EDU_ATTAINMENT | 07 | 13
QUALIFICATION_DATE | 01/08/2011 | 25/09/2014
QUALIFICATION_SOURCE | SCHOOL_RESULTS 2010/2011 | UNIVERSITY_GRADUATED2014
INSTITUTE_CODE | PDPS01301A | 19
INSTITUTE_DESCRIPTION | SCIENTIFIC HIGH SCHOOL | UNIVERSITY OF PADOVA
QAL_CODE | PS00 | MT
QUAL_DESCRIPTION | SCIENTIFIC HIGH SCHOOL DIPLOMA | Bachelor’s degree – CHEMISTRY
QUAL_VOTE | 97/100 | 98/110
UNI_COURSE_CODE | | 134462
UNI_CLASS_CODE | | 2027
LEVEL4_DESCRIPTION | Diploma of upper secondary school specializing in scientific studies | Chemistry
LEVEL4_CODE | 40501 | 71002001
LEVEL3_DESCRIPTION | Diploma of upper secondary school specializing in scientific studies | Chemistry; industrial chemistry
… | … | …
LEVEL2_DESCRIPTION | Diploma of upper secondary school specializing in scientific studies | Pharmaceutical-chemical field
… | … | …
LEVEL1_DESCRIPTION | Diploma of upper secondary education (4-5-years) | Bachelor's degree
LEVEL1_CODE | 40000 | 71000000
EDUCATIONAL_ATTAINMENT ‘07’ = Diploma of upper secondary education
EDUCATIONAL_ATTAINMENT ‘13’ = Bachelor’s degree
The linking procedures have coded university qualifications at the most detailed level of the Census Classification in 94% of cases.
Collecting and integrating administrative microdata on Education and Qualifications
Francesca Cuppone, Grazia Di Bella, Maria Carla Runci
NTTS 2017
Phase 1:
METHODS
RESULTS
CONCLUSIONS
Acknowledgments: The authors would like to thank the Statistical Office of the Ministry for Education, University and Research for the fruitful cooperation.
References:
P.J.H. Daas, et al., Reports on methods preferred for the quality indicators of administrative data sources, Deliverable 4.2 of Workpackage 4 of the BLUE-ETS project (2011).
M. C. Runci, G. Di Bella and F. Cuppone, Integrated Education Microdata to Support Statistics Production, in Data Science and Social Research - Epistemology, Methods, Technology and Applications, Springer series: Studies in Classification, Data Analysis, and Knowledge Organization (2017) (in press).
A. Wallgren and B. Wallgren, Register-Based Statistics. Administrative Data for Statistical Purposes, Wiley (2007).
The BIT can update the Population Educational Attainment at a reference time by providing microdata on individuals acquiring new qualifications. Each year, nearly 2 million people increase their level of education.
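Updating the attainment of a population from newly acquired qualifications can be sketched as below. This is an illustration under stated assumptions, not the production procedure: attainment levels are treated as comparable integers, the function name `update_attainment` is hypothetical, and individual 1 is invented.

```python
def update_attainment(population, new_qualifications):
    """Return a copy of population (individual code -> level) raised by new levels."""
    updated = dict(population)
    for person, level in new_qualifications:
        current = updated.get(person)
        # Keep the highest level: a new qualification never lowers attainment.
        if current is None or level > current:
            updated[person] = level
    return updated


population = {52881082: 7, 1: 13}
new_qualifications = [(52881082, 13), (1, 7)]

updated = update_attainment(population, new_qualifications)
increased = sum(1 for person in updated if updated[person] != population.get(person))
```

Counting the individuals whose level changed gives, in this sketch, the number of people who increased their level of education in the reference year.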
The BIT is currently built for the years 2011 to 2014 (2015 in progress). It has already been used for the dissemination of data on 2014 enterprises’ employment by highest level of education.
An automated procedure to apply the Quality Report Card to the input datasets is under construction.
Vocational training qualifications managed by the Regions (IFP) represent the largest undercoverage, followed by first- and second-level academic degrees in advanced artistic and musical training (AFAM).