2010 09 opm_tutorial_01-jun-usecase-datagovuk
DESCRIPTION
Provenance use cases from the data.gov.uk project. Part of the OPM tutorial for FIS'2010@Berlin.
TRANSCRIPT
![Page 1: 2010 09 opm_tutorial_01-jun-usecase-datagovuk](https://reader035.vdocuments.site/reader035/viewer/2022062707/5582f00bd8b42a38168b49b5/html5/thumbnails/1.jpg)
Open Provenance Model Tutorial Session 4: Use cases from data.gov.uk
Jun Zhao, University of Oxford
![Page 2: 2010 09 opm_tutorial_01-jun-usecase-datagovuk](https://reader035.vdocuments.site/reader035/viewer/2022062707/5582f00bd8b42a38168b49b5/html5/thumbnails/2.jpg)
Outline
• Background about data.gov.uk
• The use cases
– XML serialization
– Data transformation on the fly
– Complex and nested processes
![Page 3: 2010 09 opm_tutorial_01-jun-usecase-datagovuk](https://reader035.vdocuments.site/reader035/viewer/2022062707/5582f00bd8b42a38168b49b5/html5/thumbnails/3.jpg)
data.gov.uk
• Linking UK government data
• Aims:
– Provide a set of best practices for government agencies
– Provide the minimum set of tooling and specification to facilitate the publication of data
– Encourage “responsible” data publishing
![Page 4: 2010 09 opm_tutorial_01-jun-usecase-datagovuk](https://reader035.vdocuments.site/reader035/viewer/2022062707/5582f00bd8b42a38168b49b5/html5/thumbnails/4.jpg)
XML -> RDF
[Diagram labels: XSLT Processor (input, output); XSLT ParameterBinding; XSLT Stylesheet; XSLT Template; RDF File]
Annotations: Who, when, which version, how
Contributed by Jeni Tennison
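The "who, when, which version, how" questions for an XSLT run can be sketched with the core OPM edges (used, wasGeneratedBy, wasDerivedFrom, wasControlledBy). This is only an illustrative sketch, not the tooling used by data.gov.uk: the `ProvenanceGraph` class, the `record_xslt_run` helper, the `startedAt` annotation, and all file names are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ProvenanceGraph:
    """A tiny in-memory list of (subject, relation, object) edges (hypothetical)."""
    edges: list = field(default_factory=list)

    def add(self, subject, relation, obj):
        self.edges.append((subject, relation, obj))

    def objects(self, subject, relation):
        """Return every object linked from `subject` by `relation`, in insertion order."""
        return [o for s, r, o in self.edges if s == subject and r == relation]


def record_xslt_run(graph, run_id, source, stylesheet, params, output, agent, started):
    """Record one XSLT transformation as OPM-style edges."""
    # How: the process used the source document, the stylesheet and its parameters.
    graph.add(run_id, "used", source)
    graph.add(run_id, "used", stylesheet)
    for name in sorted(params):
        graph.add(run_id, "used", f"{run_id}/param/{name}={params[name]}")
    # The RDF file was generated by the run and derived from the source.
    graph.add(output, "wasGeneratedBy", run_id)
    graph.add(output, "wasDerivedFrom", source)
    # Who: the agent controlling the run.
    graph.add(run_id, "wasControlledBy", agent)
    # When: a plain annotation; "startedAt" is an assumed name, not an OPM edge.
    graph.add(run_id, "startedAt", started)


graph = ProvenanceGraph()
record_xslt_run(
    graph,
    run_id="run-1",
    source="input.xml",            # hypothetical file names
    stylesheet="to-rdf.xsl@v2",    # "which version" carried in the identifier
    params={"base": "http://example.org/"},
    output="output.rdf",
    agent="publisher",
    started="2010-09-01T10:00:00Z",
)
print(graph.objects("output.rdf", "wasGeneratedBy"))
```

Keeping the version in the stylesheet identifier is one simple way to answer "which version"; a fuller model would describe the stylesheet as an artifact with its own properties.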
![Page 5: 2010 09 opm_tutorial_01-jun-usecase-datagovuk](https://reader035.vdocuments.site/reader035/viewer/2022062707/5582f00bd8b42a38168b49b5/html5/thumbnails/5.jpg)
[Diagram labels: XSLT Processor (input, output); RDF File; XSLT ParameterBinding; XSLT Stylesheet; XSLT Template]
Additional labels: Downloaded from; Unzipped from, etc.; Made accessible
Annotations: Who, when, which version, how
Contributed by Jeni Tennison
![Page 6: 2010 09 opm_tutorial_01-jun-usecase-datagovuk](https://reader035.vdocuments.site/reader035/viewer/2022062707/5582f00bd8b42a38168b49b5/html5/thumbnails/6.jpg)
On-the-fly Transformation
[Diagram labels: Data transformation wrapper; http://mytransportatio.db/j10]
Annotations: Who, when, which version, how
Contributed by Stuart Williams
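One way to picture a data transformation wrapper that answers "who, when, which version, how" at request time is a function that returns the transformed data together with a provenance record. This is a minimal sketch under stated assumptions: `transform_with_provenance`, `WRAPPER_VERSION`, `uppercase_keys`, and the example URI are hypothetical and do not describe the actual wrapper from the use case.

```python
from datetime import datetime, timezone

WRAPPER_VERSION = "0.1"  # hypothetical version label for the wrapper


def transform_with_provenance(record, transform, agent, source_uri, now=None):
    """Apply `transform` on the fly and attach a provenance record to the result."""
    now = now or datetime.now(timezone.utc)
    result = transform(record)
    provenance = {
        "who": agent,                      # who requested or ran the transformation
        "when": now.isoformat(),           # when the result was produced
        "which_version": WRAPPER_VERSION,  # which version of the wrapper was used
        "how": transform.__name__,         # how: the name of the transformation applied
        "source": source_uri,              # where the untransformed data came from
    }
    return {"data": result, "provenance": provenance}


def uppercase_keys(record):
    """A stand-in transformation: upper-case every key of a flat record."""
    return {key.upper(): value for key, value in record.items()}


response = transform_with_provenance(
    {"route": "j10"},
    uppercase_keys,
    agent="wrapper-service",
    source_uri="http://example.org/source/j10",  # hypothetical URI
)
print(response["provenance"]["how"])
```

Because nothing is stored ahead of time, the provenance record has to be built on every request; passing `now` explicitly makes that step reproducible in tests.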
![Page 7: 2010 09 opm_tutorial_01-jun-usecase-datagovuk](https://reader035.vdocuments.site/reader035/viewer/2022062707/5582f00bd8b42a38168b49b5/html5/thumbnails/7.jpg)
Complex Data Creation Pipeline
[Diagram labels: GATE Pipeline; GateXMLRegressionTransformation; GateXMLRdfaTransformation; RdfaRdfXmlTransformation]
Courtesy of Paul Appleby from TSO (Data Enrichment Service)
![Page 8: 2010 09 opm_tutorial_01-jun-usecase-datagovuk](https://reader035.vdocuments.site/reader035/viewer/2022062707/5582f00bd8b42a38168b49b5/html5/thumbnails/8.jpg)
Complex Data Creation Pipeline
[Diagram labels: GATE Pipeline; GateXMLRegressionTransformation; GateXMLRdfaTransformation; RdfaRdfXmlTransformation]
Component labels:
• Document Reset PR
• ANNIE English Tokeniser
• ANNIE English Splitter
• ANNIE POS Tagger
• Data.gov.uk Morphological Analyzer
• Data.gov.uk Flexible Roof Gazetteer
• Data.gov.uk Generic Gazetteer
• GATE Noun Phrase Chunker
• Data.gov.uk Generic Transducer
• TSO Coreference
Courtesy of Paul Appleby from TSO (Data Enrichment Service)
![Page 9: 2010 09 opm_tutorial_01-jun-usecase-datagovuk](https://reader035.vdocuments.site/reader035/viewer/2022062707/5582f00bd8b42a38168b49b5/html5/thumbnails/9.jpg)
[Diagram labels]
Level 1: Provenance of execution at higher level
Level 0: Provenance of execution at detailed level
Other node labels: Services used by executions; Artifacts; A data collection
Edge labels: wasGeneratedBy; hasParentProcess; iterationOfProcess; followed; wasDerivedFrom; wasTriggeredBy; accessedService
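The two levels can be related by following hasParentProcess links from a detailed (Level 0) process up to its higher-level (Level 1) process, so that a wasGeneratedBy edge recorded at the detailed level can also be read at the higher level. The sketch below assumes that the individual components are children of the GATE Pipeline process; the artifact names and both helper functions are hypothetical.

```python
def top_level_process(process, parent_of):
    """Follow hasParentProcess links until a process with no parent is reached."""
    seen = set()
    while process in parent_of and process not in seen:
        seen.add(process)  # guard against accidental cycles in the data
        process = parent_of[process]
    return process


def lift_generation(generated_by, parent_of):
    """Map each artifact to the highest-level process responsible for it."""
    return {
        artifact: top_level_process(process, parent_of)
        for artifact, process in generated_by.items()
    }


# hasParentProcess: detailed process -> higher-level process (assumed nesting)
parent_of = {
    "ANNIE English Tokeniser": "GATE Pipeline",
    "ANNIE POS Tagger": "GATE Pipeline",
}

# wasGeneratedBy: artifact -> detailed process (artifact names are hypothetical)
generated_by = {
    "tokens": "ANNIE English Tokeniser",
    "tagged-tokens": "ANNIE POS Tagger",
    "rdf-xml": "RdfaRdfXmlTransformation",
}

for artifact, process in lift_generation(generated_by, parent_of).items():
    print(artifact, "wasGeneratedBy", process)
```

Processes without a recorded parent are returned unchanged, so the same function works for steps that already sit at the higher level.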
![Page 10: 2010 09 opm_tutorial_01-jun-usecase-datagovuk](https://reader035.vdocuments.site/reader035/viewer/2022062707/5582f00bd8b42a38168b49b5/html5/thumbnails/10.jpg)
Non-digital Data Objects
• Organizations
– Organizational structure changes over time
– Origin organization, resulting organization
• Boundary
• Legislation
An organization ontology: http://www.epimorphics.com/public/vocabulary/org.html
![Page 11: 2010 09 opm_tutorial_01-jun-usecase-datagovuk](https://reader035.vdocuments.site/reader035/viewer/2022062707/5582f00bd8b42a38168b49b5/html5/thumbnails/11.jpg)
The Challenges
• Data in different representations, physical forms, and levels of granularity
• No tooling support
• Provenance across different types of systems
– Identification
– Different terminologies
![Page 12: 2010 09 opm_tutorial_01-jun-usecase-datagovuk](https://reader035.vdocuments.site/reader035/viewer/2022062707/5582f00bd8b42a38168b49b5/html5/thumbnails/12.jpg)
The Gaps
• A vocabulary able to describe the provenance of all types of data, from different systems
• A vocabulary that still provides enough terms to describe provenance accurately
![Page 13: 2010 09 opm_tutorial_01-jun-usecase-datagovuk](https://reader035.vdocuments.site/reader035/viewer/2022062707/5582f00bd8b42a38168b49b5/html5/thumbnails/13.jpg)
This work is licensed under a Creative Commons Attribution-Share Alike 3.0 License
(http://creativecommons.org/licenses/by-sa/3.0/)