datagraft: data-as-a-service for open data

24
DataGraft Data-as-a-Service for Open Data Opportunities for Publishing Property Data Dumitru Roman [email protected] https://datagraft.net

Upload: dapaasproject

Post on 15-Feb-2017

172 views

Category:

Data & Analytics


3 download

TRANSCRIPT

Page 1: DataGraft: Data-as-a-Service for Open Data

DataGraftData-as-a-Service for Open Data

Opportunities for Publishing Property Data

Dumitru [email protected]

https://datagraft.net

Page 2: DataGraft: Data-as-a-Service for Open Data

Outline

• What is DataGraft

• DataGraft in SmartOpenData– TRAGSA and ARPA Data Publishing

• DataGraft for Property Data

2

Page 3: DataGraft: Data-as-a-Service for Open Data

Developed to allow

data workers to manage their data in a

simple, effective, and efficient way

Powerful

data transformation and

reliable data access capabilities

3

Page 4: DataGraft: Data-as-a-Service for Open Data

Data Transformation and RDF Publication Process

• Interactive design of transformations?

• Repeatable transformations?

• Reuse/share transformations (user-based access)?

• Cloud-based deployment of transformations?

• Self-serviced process?

• Data and Transformation as-a-Service? 4

Page 5: DataGraft: Data-as-a-Service for Open Data

Tabular Data

GraphData

DataGraft: Data-as-a-ServiceFor the Data Transformation and RDF Publication Process

5

Page 6: DataGraft: Data-as-a-Service for Open Data

DataGraft key feature: Flexible management and sharing of data

and transformations

Fork, reuse and extend transformations built by other professionals from DataGraft’s

transformations catalog

Interactively build, modify and share data

transformations

Share transformations privately or publicly

Reuse transformations to repeatably clean and

transform spreadsheet data

Programmatically access transformations and the transformation catalogue

6

Page 7: DataGraft: Data-as-a-Service for Open Data

DataGraft key feature: Reliable data hosting and querying services

Host data on DataGraft’s reliable,

cloud-based triplestore

Share data privately or publicly

Query data through your own SPARQL

endpoint

Programmatically access the data

catalogue

7

Page 8: DataGraft: Data-as-a-Service for Open Data

8

Page 9: DataGraft: Data-as-a-Service for Open Data

9

Page 10: DataGraft: Data-as-a-Service for Open Data

10

Page 11: DataGraft: Data-as-a-Service for Open Data

11

Page 12: DataGraft: Data-as-a-Service for Open Data

12

Page 13: DataGraft: Data-as-a-Service for Open Data

13

Page 14: DataGraft: Data-as-a-Service for Open Data

14

Page 15: DataGraft: Data-as-a-Service for Open Data

APIs

15

Page 16: DataGraft: Data-as-a-Service for Open Data

DataGraft Enablers

Grafter Grafterizer

RDF DBaaSData Portal

DataGraft

16

Page 17: DataGraft: Data-as-a-Service for Open Data

DataGraft in SmOD: Use Cases

TRAGSA Pilot

• Number of transformations: 42

– Created via reuse: 25

• Number of triples:

– ~ 7.7M

ARPA Pilot

• Number of transformations: 5

– Created via reuse: 2

• Number of triples:

– ~ 14K

17

Page 18: DataGraft: Data-as-a-Service for Open Data

DataGraft in SmOD: Preliminary observations

• Positive aspects– Forking/reusing transformations helped us spend less time on creating new

transformations

– Possibility to edit parameters of each transformation step and change step order at any moment of creating the transformation made it easier to:

o Create transformations in general

o Correct mistakes made during transformation steps

o Try the effects of transformation steps with different parameters

– Custom code as utility functions provided flexibility in reuse of functions across transformations

• Cleaning data lacked some "nice to have" functionality, e.g. joining or sorting datasets– This was overcome with some preprocessing of the input files (e.g. 27 of 43 files

needed some initial preprocessing in the TRAGSA pilot)

18

Page 19: DataGraft: Data-as-a-Service for Open Data

DataGraft for Property Data

Why property data?

One of the most valuable datasets managed by governments worldwide

Extensively used in various domains by private and public organizations

19

Page 20: DataGraft: Data-as-a-Service for Open Data

Some challenges in working with property data

• Difficult to access

• Cross-sectors

• Data is highly heterogeneous and possibly large

• Data quality

• Time-consuming integration

• Lack of innovation

• …

http://prodatamarket.eu 20

Page 21: DataGraft: Data-as-a-Service for Open Data

DataGraft – 1 package 2 audiences

DataGraft

Data Publisher Application Developer

Helping

publishing open

data

Giving better,

easier tools

21

Page 22: DataGraft: Data-as-a-Service for Open Data

DataGraft – targeted impacts

Reduction in costsfor organisations (e.g. SMEs, public organizations, etc.) which lack sufficient expertise and resources to publish open data

Reduction on the dependencyof open data publishers on generic Cloud platforms to build, deploy and maintain their open/linked data from scratch

Increase in the speed of publishing new datasets and updating existing datasets

Reduction in the cost and complexity of developing applications that use open data

Increase in the reuse of open data by providing reliable access to numerous open data sets to the applications hosted on DataGraft.net 22

Page 23: DataGraft: Data-as-a-Service for Open Data

Summary

• DataGraft – emerging solution (as-a-Service) for making Open (Linked) Data more accessible

– Platform, portal, methodology, APIs

– Developed/Operated by DaPaaS, with contributions from SmOD, proDataMarket, OpenCube

– Successfully applied in SmOD for two pilot cases

• Key features:

– Support for Sharable/Repeatable/Reusable Data Transformations

– Reliable RDF Database-as-a-Service

23

Page 24: DataGraft: Data-as-a-Service for Open Data

https://datagraft.net

Thank you!Contact: [email protected] 24