Richard Zijdeman [richard.zijdeman at iisg.nl]
Kathrin Dentler
Rinke Hoekstra
Albert Meroño-Peñuela
Advancing the comparability of occupational data through Linked Open Data
HISCO workshop
Historical Population Database of Transylvania
Cluj, Romania
June 18, 2016
... it is market position, and especially position in the occupational division of labour, which is fundamental to the generation of structured inequalities. The life chances of individuals and families are largely determined by their position in the market and occupation is taken to be its central indicator ... .
(Rose and Harrison, 2010)
Occupations are important as dependent variables (occupational attainment studies) and independent variables (occupation stratification studies) in educational (and occupational) status attainment, health, voting, consumption, marriage etc.
(Ganzeboom, 2008)
Occupations are one of the few indicators of social position that are available in:
• large quantities
• different time periods
• various societies
• at the individual level (smallest level of detail)
Lack of comparability
• Many different occupational classifications
• Differences in mobility studies could result from different classification methods (Kaelble 1985)
Charles Booth (1886-1903)
HISCO
• Historical International Standard Classification of Occupations
• Put together by a large number of institutes
• Based on ILO’s ISCO ’68
• Occupations retrieved from registers
• 1675 occupational codes
Current solution: 2-step procedure
Code into the concept first:
• Classify into the concept (HISCO)
• Link the measure of stratification to the concept (e.g. SOCPO, HISCAM)
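A minimal sketch of the 2-step procedure, assuming two plain lookup tables. The codes ("A1", "B2") and scores below are placeholders invented for illustration, not real HISCO or HISCAM values, and the function names are hypothetical.

```python
from typing import Dict, Optional

# Step 1 table: raw occupational title -> classification code (placeholder codes).
TITLE_TO_CODE: Dict[str, str] = {
    "miner": "A1",
    "coal miner": "A1",
    "farmer": "B2",
}

# Step 2 table: classification code -> stratification score (placeholder scores).
CODE_TO_SCORE: Dict[str, float] = {
    "A1": 40.0,
    "B2": 60.0,
}


def classify(title: str) -> Optional[str]:
    """Step 1: map a raw title to a code, ignoring case and outer whitespace."""
    return TITLE_TO_CODE.get(title.strip().lower())


def stratification_score(title: str) -> Optional[float]:
    """Step 2: look up the measure attached to the code, if the title was coded."""
    code = classify(title)
    if code is None:
        return None
    return CODE_TO_SCORE.get(code)
```

Because the measure is attached to the code rather than to the title, different spellings that share a code receive the same score, and an uncoded title yields no score at all.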
New problems
1. What concept?
• Historical International Standard Classification (HISCO)
• OCCHISCO
• PST
2. Not all measures link to all concepts
• E.g. no link between OCCHISCO and HISCAM
3. Adaptability of concepts (new versions)
Is this a substantive problem?
Illustrative example:
• Subset of SAME occupational titles from NAPP and HISCO
• Link these occupations to HISCAM
• For HISCO directly provided by HISCAM people
• For OCCHISCO indirectly through a mapping
[Diagram: occupations, OCCHISCO, HISCO and HISCAM, connected via a cross-walk]
E.g.: necessary for a comparison between Norway and the Netherlands
So yes, this is problematic
• ‘Lost’ 41% explained variance
• Cf. regression models: usually not above 30%
• HISCAM often both as dependent and independent variable
New problems
1. What concept?
• Historical International Standard Classification (HISCO)
• OCCHISCO
• PST
2. Not all measures link to all concepts
• E.g. no link between OCCHISCO and HISCAM
3. Adaptability of concepts (new versions)
Towards a solution
• Linked Data (Berners-Lee, 2006)
• Define Resources (books, respondents, etc.) with a URI
• Present URIs as URLs
• Describe Resources using so-called ‘triples’
An example of a triple
Margaret → works as → Miner
Resource → Property → Value
Miner → is of type → occupation
Resource → Property → Value
Margaret → works as → Miner → is of type → occupation
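The two triples on these slides can be sketched as plain (resource, property, value) tuples. This is only an illustration in standard-library Python, not a real RDF store; the `http://example.org/` namespace and the names `worksAs` and `isOfType` are assumptions made up for the example.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (resource, property, value)

# Hypothetical namespace and names, used only for illustration.
BASE = "http://example.org/"
MARGARET = BASE + "Margaret"
WORKS_AS = BASE + "worksAs"
MINER = BASE + "Miner"
IS_OF_TYPE = BASE + "isOfType"
OCCUPATION = BASE + "Occupation"

graph: List[Triple] = [
    (MARGARET, WORKS_AS, MINER),       # Margaret works as Miner
    (MINER, IS_OF_TYPE, OCCUPATION),   # Miner is of type occupation
]


def values(resource: str, prop: str, triples: List[Triple]) -> List[str]:
    """Return every value linked to `resource` through `prop`."""
    return [v for r, p, v in triples if r == resource and p == prop]


def job_types(person: str, triples: List[Triple]) -> List[str]:
    """Follow worksAs, then isOfType, to find the type of a person's job."""
    found: List[str] = []
    for job in values(person, WORKS_AS, triples):
        found.extend(values(job, IS_OF_TYPE, triples))
    return found
```

Because the value of the first triple is the resource of the second, the two statements chain: `job_types(MARGARET, graph)` walks from Margaret to Miner to Occupation.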
[Diagram: the resource ‘miner’ with the properties has occhisco, has hisco and has hiscam, and the values 50.56, 71105 and 71120]
Provenance
[Diagram: an occupational title with a source (WasDerivedFrom) and the codes PST: 123, OCCHISCO: 123, HISCO: 12345 and HISCO: 54321, attributed to coders (codedBy Leigh, Evan, Chris, Richard); HISCAM: 88 is attributed to a mapping file (codedBy MappingFile)]
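A rough sketch of why recording provenance helps, using the placeholder codes and coder names shown on the slide. Which coder produced which code is an assumption made here for illustration, and the `Coding` record and `conflicting_codes` helper are hypothetical names.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Set, Tuple


class Coding(NamedTuple):
    title: str     # the occupational title that was coded
    scheme: str    # the classification used (e.g. HISCO)
    code: str      # the code that was assigned
    coded_by: str  # who (or what) assigned the code


# The coder-to-code pairing below is assumed, not taken from the slide.
codings: List[Coding] = [
    Coding("occupational title", "PST", "123", "Leigh"),
    Coding("occupational title", "OCCHISCO", "123", "Evan"),
    Coding("occupational title", "HISCO", "12345", "Chris"),
    Coding("occupational title", "HISCO", "54321", "Richard"),
]


def conflicting_codes(records: List[Coding]) -> Dict[Tuple[str, str], Set[str]]:
    """Return (title, scheme) pairs that received more than one distinct code."""
    by_key: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for record in records:
        by_key[(record.title, record.scheme)].add(record.code)
    return {key: codes for key, codes in by_key.items() if len(codes) > 1}
```

With the coder stored next to each code, both HISCO codes for the same title are kept side by side, so disagreement between coders becomes something that can be queried.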
HISCO vocabulary
• hisco:entry for ‘occupational titles’
• transitivity between category, unit, minor and major group
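A minimal sketch of what transitivity between the levels means in practice: if each level points to the next broader one, every broader group can be derived by following the links. The identifiers below are placeholders and the `BROADER` table and `ancestors` function are hypothetical; the actual vocabulary may model this differently.

```python
from typing import Dict, List

# Each entry points to its next broader level (placeholder identifiers).
BROADER: Dict[str, str] = {
    "category:example": "unit:example",
    "unit:example": "minor:example",
    "minor:example": "major:example",
}


def ancestors(node: str, broader: Dict[str, str]) -> List[str]:
    """Follow the broader links upwards and return every level reached, nearest first."""
    chain: List[str] = []
    seen = {node}
    while node in broader:
        node = broader[node]
        if node in seen:
            break  # guard against accidental cycles in the table
        seen.add(node)
        chain.append(node)
    return chain
```

Only the direct link per level has to be stored; membership of a category in its major group follows from the chain.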
Case study: DBpedia
- Structured data behind Wikipedia
- Information on all kinds of topics, also occupations
- Add HISCO codes to DBpedia occupations
- Let’s try and do this live: http://yasgui.org/short/VJfZvnx6x
Caveats
• We did not check the technique on a really big scale (e.g. NAPP data)
• Sharing code remains a collective action problem (but less of a coordination problem)
Conclusions
Linked Data
• Enhances comparative occupational research
• Makes heterogeneity in coding practices visible
Outlook
• Linkage to texts (occupations in newspapers)
• Linkage to public resources: Wikipedia
• Combine Machine Learning and Linked Data for automated occupational coding