Data-as-a-Service: DataGraft
![Page 2: Data-as-a-Service: DataGraft](https://reader031.vdocuments.site/reader031/viewer/2022030210/58a46cc61a28aba34c8b45e5/html5/thumbnails/2.jpg)
“Data is the new oil”… but many of us just need gasoline
Data-as-a-Service… is the new filling station
![Page 3: Data-as-a-Service: DataGraft](https://reader031.vdocuments.site/reader031/viewer/2022030210/58a46cc61a28aba34c8b45e5/html5/thumbnails/3.jpg)
Data-as-a-Service
• Outsourcing of various data operations to the cloud
• Eliminates
– upfront costs on data infrastructure
– ongoing investment of time and resources in managing the data infrastructure
• Complete package for
– transformation of raw data into meaningful data assets
– reliable delivery of data assets
![Page 4: Data-as-a-Service: DataGraft](https://reader031.vdocuments.site/reader031/viewer/2022030210/58a46cc61a28aba34c8b45e5/html5/thumbnails/4.jpg)
Example #1: Using open data – petroleum activities on the Norwegian continental shelf
• ~70 tabular datasets
• Difficult to query across tables or integrate with other data, e.g. the Business Registry
• Simplified integration with external datasets
• Distribution of integrated dataset
• Live service
• Reliable access
• …
• Which companies have been owners in license X?
• What is the oil production for each field in year X?
• What is the total production of the top 10 companies by number of employees in year X?
• ....
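The third question above needs data from both sources at once. A minimal Python sketch of that kind of cross-table integration, assuming made-up rows and hypothetical column names (`company`, `employees`, `year`, `production`); none of the values come from the NPD FactPages or the Business Registry:

```python
from collections import defaultdict

# Hypothetical extract from a business registry: one row per company.
registry = [
    {"company": "A", "employees": 500},
    {"company": "B", "employees": 120},
    {"company": "C", "employees": 900},
]

# Hypothetical production records: one row per company, field and year.
production = [
    {"company": "A", "field": "F1", "year": 2014, "production": 10.0},
    {"company": "A", "field": "F2", "year": 2014, "production": 5.0},
    {"company": "B", "field": "F1", "year": 2014, "production": 7.0},
    {"company": "C", "field": "F3", "year": 2014, "production": 2.5},
    {"company": "C", "field": "F3", "year": 2015, "production": 4.0},
]


def total_production_of_top_companies(registry, production, year, top_n):
    """Sum production in `year` for the `top_n` companies ranked by employees."""
    ranked = sorted(registry, key=lambda r: r["employees"], reverse=True)
    top = {r["company"] for r in ranked[:top_n]}
    totals = defaultdict(float)
    for row in production:
        if row["year"] == year and row["company"] in top:
            totals[row["company"]] += row["production"]
    return dict(totals)


result = total_production_of_top_companies(registry, production, year=2014, top_n=2)
```

With the data spread over some 70 separate tables, every such question needs its own hand-written join, which is the pain an integration and querying service is meant to remove.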
[Figure: Tabular data on the Web (factpages.npd.no, data.brreg.no/oppslag/enhetsregisteret), Integration and querying service, Data Insights]
![Page 5: Data-as-a-Service: DataGraft](https://reader031.vdocuments.site/reader031/viewer/2022030210/58a46cc61a28aba34c8b45e5/html5/thumbnails/5.jpg)
Example #2: Reporting state-owned real estate properties in Norway
• Published as a 314-page hard copy and as a PDF file
• 6 person-months
• Data collection with spreadsheets
• Quality assurance through e-mail and phone correspondence
Pains
• Time consuming
• Poor data quality
• Static report without live updating
• Live service
• Efficient sharing of data
• Simplified integration with external datasets
• Live updating
• Reliable access
• …
• Risk and vulnerability analysis, e.g. buildings affected by flooding
• Analysis of leasing prices
[Figure: Report, Reporting Service, 3rd party services]
![Page 6: Data-as-a-Service: DataGraft](https://reader031.vdocuments.site/reader031/viewer/2022030210/58a46cc61a28aba34c8b45e5/html5/thumbnails/6.jpg)
Sample data
Cleaning, Transformation, Publishing, Integration, Querying, Visualization, Service Access
![Page 7: Data-as-a-Service: DataGraft](https://reader031.vdocuments.site/reader031/viewer/2022030210/58a46cc61a28aba34c8b45e5/html5/thumbnails/7.jpg)
Example #3: Personalized and Localized Urban Quality Index (PLUQI)
The index includes data from various domains:
• Daily life satisfaction: weather, transportation, community, …
• Healthcare level: number of doctors, hospitals, suicide statistics, …
• Safety and security: number of police stations, fire stations, crimes per capita, …
• Financial satisfaction: prices, incomes, housing, savings, debt, insurance, pension, …
• Level of opportunity: jobs, unemployment, education, re-education, …
• Environmental needs and efficiency: green space, air quality, …
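The slides do not say how PLUQI combines these domains. Purely as an illustration of what a personalized index could look like, the sketch below takes a weighted average of per-domain scores; the scores, the weights, and the formula itself are assumptions, not the PLUQI method:

```python
# Hypothetical per-domain scores for one location, already normalized to 0..1.
scores = {
    "daily_life": 0.8,
    "healthcare": 0.6,
    "safety": 0.9,
    "financial": 0.5,
    "opportunity": 0.7,
    "environment": 0.4,
}

# Hypothetical personal weights: this user cares most about healthcare
# and not at all about the financial and opportunity domains.
weights = {
    "daily_life": 1.0,
    "healthcare": 2.0,
    "safety": 1.0,
    "financial": 0.0,
    "opportunity": 0.0,
    "environment": 1.0,
}


def weighted_index(scores, weights):
    """Weighted average of domain scores (an assumed formula, for illustration)."""
    total_weight = sum(weights[domain] for domain in scores)
    if total_weight == 0:
        raise ValueError("at least one domain needs a non-zero weight")
    return sum(scores[d] * weights[d] for d in scores) / total_weight


index = weighted_index(scores, weights)
```

Whatever the real formula is, every domain has to be available as clean, comparable numbers per location, which is why the index depends on integrating many datasets.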
![Page 8: Data-as-a-Service: DataGraft](https://reader031.vdocuments.site/reader031/viewer/2022030210/58a46cc61a28aba34c8b45e5/html5/thumbnails/8.jpg)
Sample data
![Page 9: Data-as-a-Service: DataGraft](https://reader031.vdocuments.site/reader031/viewer/2022030210/58a46cc61a28aba34c8b45e5/html5/thumbnails/9.jpg)
DataGraft was developed to allow data workers to manage their data in a simple, effective, and efficient way
Powerful data transformation and reliable data access capabilities
![Page 10: Data-as-a-Service: DataGraft](https://reader031.vdocuments.site/reader031/viewer/2022030210/58a46cc61a28aba34c8b45e5/html5/thumbnails/10.jpg)
Tabular Data
• Open Data is mostly tabular data
• Excel, CSV, TSV, etc.
• Records organized in silos of collections
• Very few links within and/or across collections
• Difficult to understand the nature of the data
• Difficult to integrate / query
Graph Data
• Based on Linked Data
• Method for publishing data on the Web
• Self-describing data and relations
• Interlinking
• Accessed using semantic queries
• Open standards by W3C
– Data format: RDF
– Knowledge representation: RDFS/OWL
– Query language: SPARQL
http://www.w3.org/standards/semanticweb/data
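To make the contrast between tabular and graph data concrete, here is a minimal sketch that turns one tabular record into RDF triples in N-Triples syntax. The base IRI, the vocabulary IRI, and the sample row are hypothetical, and every value is emitted as a plain string literal; a real mapping would reuse existing vocabularies and typed literals:

```python
def escape_literal(value):
    """Escape the characters that N-Triples requires escaping in a string literal."""
    return (
        str(value)
        .replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def to_ntriples(row, base, key, vocab):
    """Serialize one record as N-Triples lines.

    The `key` column identifies the subject; every other column becomes a
    predicate. Column names and key values are assumed to be IRI-safe.
    """
    subject = f"<{base}{row[key]}>"
    lines = []
    for column, value in row.items():
        if column == key:
            continue
        lines.append(f'{subject} <{vocab}{column}> "{escape_literal(value)}" .')
    return lines


# Hypothetical record and namespaces.
row = {"id": "field-1", "name": "Example field", "year": "2014"}
triples = to_ntriples(
    row,
    base="http://example.org/resource/",
    key="id",
    vocab="http://example.org/vocab#",
)
```

Each resulting line is a self-describing statement: the subject IRI can be linked to from other datasets, which is what tabular rows lack.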
europeandataportal.eu
![Page 11: Data-as-a-Service: DataGraft](https://reader031.vdocuments.site/reader031/viewer/2022030210/58a46cc61a28aba34c8b45e5/html5/thumbnails/11.jpg)
Data Transformation and RDF Publication Process
• Interactive design of transformations?
• Repeatable transformations?
• Reuse/share transformations (user-based access)?
• Cloud-based deployment of transformations?
• Self-serviced process?
• Data and Transformation as-a-Service?
Semantic graph database
![Page 12: Data-as-a-Service: DataGraft](https://reader031.vdocuments.site/reader031/viewer/2022030210/58a46cc61a28aba34c8b45e5/html5/thumbnails/12.jpg)
DataGraft: Data-as-a-Service for the Data Transformation and RDF Publication Process
[Figure: Tabular Data, Graph Data]
![Page 13: Data-as-a-Service: DataGraft](https://reader031.vdocuments.site/reader031/viewer/2022030210/58a46cc61a28aba34c8b45e5/html5/thumbnails/13.jpg)
Example: Using statistical data
https://www.ssb.no/statistikkbanken
![Page 14: Data-as-a-Service: DataGraft](https://reader031.vdocuments.site/reader031/viewer/2022030210/58a46cc61a28aba34c8b45e5/html5/thumbnails/14.jpg)
![Page 15: Data-as-a-Service: DataGraft](https://reader031.vdocuments.site/reader031/viewer/2022030210/58a46cc61a28aba34c8b45e5/html5/thumbnails/15.jpg)
![Page 16: Data-as-a-Service: DataGraft](https://reader031.vdocuments.site/reader031/viewer/2022030210/58a46cc61a28aba34c8b45e5/html5/thumbnails/16.jpg)
![Page 17: Data-as-a-Service: DataGraft](https://reader031.vdocuments.site/reader031/viewer/2022030210/58a46cc61a28aba34c8b45e5/html5/thumbnails/17.jpg)
![Page 18: Data-as-a-Service: DataGraft](https://reader031.vdocuments.site/reader031/viewer/2022030210/58a46cc61a28aba34c8b45e5/html5/thumbnails/18.jpg)
![Page 19: Data-as-a-Service: DataGraft](https://reader031.vdocuments.site/reader031/viewer/2022030210/58a46cc61a28aba34c8b45e5/html5/thumbnails/19.jpg)
![Page 20: Data-as-a-Service: DataGraft](https://reader031.vdocuments.site/reader031/viewer/2022030210/58a46cc61a28aba34c8b45e5/html5/thumbnails/20.jpg)
![Page 21: Data-as-a-Service: DataGraft](https://reader031.vdocuments.site/reader031/viewer/2022030210/58a46cc61a28aba34c8b45e5/html5/thumbnails/21.jpg)
![Page 22: Data-as-a-Service: DataGraft](https://reader031.vdocuments.site/reader031/viewer/2022030210/58a46cc61a28aba34c8b45e5/html5/thumbnails/22.jpg)
![Page 23: Data-as-a-Service: DataGraft](https://reader031.vdocuments.site/reader031/viewer/2022030210/58a46cc61a28aba34c8b45e5/html5/thumbnails/23.jpg)
![Page 24: Data-as-a-Service: DataGraft](https://reader031.vdocuments.site/reader031/viewer/2022030210/58a46cc61a28aba34c8b45e5/html5/thumbnails/24.jpg)
![Page 25: Data-as-a-Service: DataGraft](https://reader031.vdocuments.site/reader031/viewer/2022030210/58a46cc61a28aba34c8b45e5/html5/thumbnails/25.jpg)
![Page 26: Data-as-a-Service: DataGraft](https://reader031.vdocuments.site/reader031/viewer/2022030210/58a46cc61a28aba34c8b45e5/html5/thumbnails/26.jpg)
![Page 27: Data-as-a-Service: DataGraft](https://reader031.vdocuments.site/reader031/viewer/2022030210/58a46cc61a28aba34c8b45e5/html5/thumbnails/27.jpg)
![Page 28: Data-as-a-Service: DataGraft](https://reader031.vdocuments.site/reader031/viewer/2022030210/58a46cc61a28aba34c8b45e5/html5/thumbnails/28.jpg)
![Page 29: Data-as-a-Service: DataGraft](https://reader031.vdocuments.site/reader031/viewer/2022030210/58a46cc61a28aba34c8b45e5/html5/thumbnails/29.jpg)
![Page 30: Data-as-a-Service: DataGraft](https://reader031.vdocuments.site/reader031/viewer/2022030210/58a46cc61a28aba34c8b45e5/html5/thumbnails/30.jpg)
![Page 31: Data-as-a-Service: DataGraft](https://reader031.vdocuments.site/reader031/viewer/2022030210/58a46cc61a28aba34c8b45e5/html5/thumbnails/31.jpg)
![Page 32: Data-as-a-Service: DataGraft](https://reader031.vdocuments.site/reader031/viewer/2022030210/58a46cc61a28aba34c8b45e5/html5/thumbnails/32.jpg)
Data records (rows)
• Add row
• Take row(s)
• Drop row(s)
• Shift row
• Filter rows (grep)
• Remove duplicate rows
Entire dataset
• Sort
• Reshape dataset
• Group (categorize) and aggregate
Columns
• Add column(s)
• Take column(s)
• Drop column(s)
• Move column
• Merge columns
• Split column
• Rename column(s)
• Apply function to all values in a column
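A few of the operations listed above, sketched as small Python functions over a list of dictionaries. These are illustrative stand-ins only: the function names, the sample rows, and the numbers are hypothetical, and this is not the API of Grafter or Grafterizer.

```python
def filter_rows(rows, predicate):
    """Keep only the rows for which predicate(row) is true."""
    return [row for row in rows if predicate(row)]


def take_columns(rows, names):
    """Keep only the named columns, in the given order."""
    return [{name: row[name] for name in names} for row in rows]


def remove_duplicate_rows(rows):
    """Drop repeated rows, keeping the first occurrence of each."""
    seen, unique = set(), []
    for row in rows:
        fingerprint = tuple(sorted(row.items()))
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(row)
    return unique


def rename_columns(rows, mapping):
    """Rename columns according to an old-name -> new-name mapping."""
    return [{mapping.get(k, k): v for k, v in row.items()} for row in rows]


def apply_to_column(rows, name, fn):
    """Apply fn to every value in one column."""
    return [{**row, name: fn(row[name])} for row in rows]


# Hypothetical raw spreadsheet rows: stray whitespace, a duplicate, a total line.
raw = [
    {"Name": " Town A ", "Pop": "1000", "Note": "x"},
    {"Name": "Town B", "Pop": "250", "Note": "y"},
    {"Name": "Town B", "Pop": "250", "Note": "y"},
    {"Name": "Total", "Pop": "1250", "Note": ""},
]

cleaned = filter_rows(raw, lambda r: r["Name"] != "Total")
cleaned = take_columns(cleaned, ["Name", "Pop"])
cleaned = remove_duplicate_rows(cleaned)
cleaned = rename_columns(cleaned, {"Name": "town", "Pop": "population"})
cleaned = apply_to_column(cleaned, "town", str.strip)
cleaned = apply_to_column(cleaned, "population", int)
```

Because each step is a pure function of its input, the same sequence can be replayed on a new version of the spreadsheet, which is what makes a transformation repeatable.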
![Page 33: Data-as-a-Service: DataGraft](https://reader031.vdocuments.site/reader031/viewer/2022030210/58a46cc61a28aba34c8b45e5/html5/thumbnails/33.jpg)
![Page 34: Data-as-a-Service: DataGraft](https://reader031.vdocuments.site/reader031/viewer/2022030210/58a46cc61a28aba34c8b45e5/html5/thumbnails/34.jpg)
![Page 35: Data-as-a-Service: DataGraft](https://reader031.vdocuments.site/reader031/viewer/2022030210/58a46cc61a28aba34c8b45e5/html5/thumbnails/35.jpg)
![Page 36: Data-as-a-Service: DataGraft](https://reader031.vdocuments.site/reader031/viewer/2022030210/58a46cc61a28aba34c8b45e5/html5/thumbnails/36.jpg)
![Page 37: Data-as-a-Service: DataGraft](https://reader031.vdocuments.site/reader031/viewer/2022030210/58a46cc61a28aba34c8b45e5/html5/thumbnails/37.jpg)
![Page 38: Data-as-a-Service: DataGraft](https://reader031.vdocuments.site/reader031/viewer/2022030210/58a46cc61a28aba34c8b45e5/html5/thumbnails/38.jpg)
Data pages and federated querying
What is the population of locations and total number of persons employed in Human health and social work activities?
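A question like this spans two datasets, which SPARQL 1.1 can express with one `SERVICE` clause per endpoint. The sketch below only builds the query text; the endpoint URLs and the `ex:` vocabulary are hypothetical and do not reflect the actual DataGraft data pages:

```python
def federated_query(population_endpoint, employment_endpoint, activity):
    """Build a SPARQL 1.1 federated query joining two endpoints on ?location.

    `activity` is inserted as a plain string literal and is assumed to
    contain no double quotes or backslashes.
    """
    return f"""
PREFIX ex: <http://example.org/vocab#>
SELECT ?location ?population (SUM(?employed) AS ?totalEmployed)
WHERE {{
  SERVICE <{population_endpoint}> {{
    ?location ex:population ?population .
  }}
  SERVICE <{employment_endpoint}> {{
    ?observation ex:location ?location ;
                 ex:activity "{activity}" ;
                 ex:personsEmployed ?employed .
  }}
}}
GROUP BY ?location ?population
""".strip()


query = federated_query(
    "http://example.org/population/sparql",
    "http://example.org/employment/sparql",
    "Human health and social work activities",
)
```

The join works only if both datasets use the same IRIs for locations, which is one reason interlinking matters when data is published as RDF.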
![Page 39: Data-as-a-Service: DataGraft](https://reader031.vdocuments.site/reader031/viewer/2022030210/58a46cc61a28aba34c8b45e5/html5/thumbnails/39.jpg)
Configuring data visualizations
![Page 40: Data-as-a-Service: DataGraft](https://reader031.vdocuments.site/reader031/viewer/2022030210/58a46cc61a28aba34c8b45e5/html5/thumbnails/40.jpg)
![Page 41: Data-as-a-Service: DataGraft](https://reader031.vdocuments.site/reader031/viewer/2022030210/58a46cc61a28aba34c8b45e5/html5/thumbnails/41.jpg)
![Page 42: Data-as-a-Service: DataGraft](https://reader031.vdocuments.site/reader031/viewer/2022030210/58a46cc61a28aba34c8b45e5/html5/thumbnails/42.jpg)
![Page 43: Data-as-a-Service: DataGraft](https://reader031.vdocuments.site/reader031/viewer/2022030210/58a46cc61a28aba34c8b45e5/html5/thumbnails/43.jpg)
APIs
![Page 44: Data-as-a-Service: DataGraft](https://reader031.vdocuments.site/reader031/viewer/2022030210/58a46cc61a28aba34c8b45e5/html5/thumbnails/44.jpg)
DataGraft key feature: Flexible management and sharing of data and transformations
• Fork, reuse and extend transformations built by other professionals from DataGraft’s transformations catalogue
• Interactively build, modify and share data transformations
• Share transformations privately or publicly
• Reuse transformations to repeatably clean and transform spreadsheet data
• Programmatically access transformations and the transformation catalogue
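One way to picture forking and sharing is to treat a transformation as plain data: a named list of steps with an owner and a visibility flag. The class, its fields, and the step names below are hypothetical and are not DataGraft's data model or API:

```python
import copy
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Transformation:
    name: str
    owner: str
    steps: list = field(default_factory=list)
    public: bool = False
    forked_from: Optional[str] = None

    def fork(self, new_owner, new_name):
        """Copy this transformation so another user can extend it."""
        if not self.public and new_owner != self.owner:
            raise PermissionError("private transformations can only be forked by their owner")
        return Transformation(
            name=new_name,
            owner=new_owner,
            steps=copy.deepcopy(self.steps),  # the fork must not alter the original
            public=False,                     # forks start out private
            forked_from=self.name,
        )


# Hypothetical catalogue entry published by one user...
base = Transformation(
    name="clean-statistics",
    owner="alice",
    steps=[("drop_rows", 1), ("rename_columns", {"Pop": "population"})],
    public=True,
)

# ...forked and extended by another.
mine = base.fork(new_owner="bob", new_name="clean-statistics-v2")
mine.steps.append(("take_columns", ["town", "population"]))
```

Keeping the steps as data rather than as ad hoc scripts is what allows a catalogue to list, copy, and re-run them.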
![Page 45: Data-as-a-Service: DataGraft](https://reader031.vdocuments.site/reader031/viewer/2022030210/58a46cc61a28aba34c8b45e5/html5/thumbnails/45.jpg)
Reuse of transformations in environmental data publishing
TRAGSA Pilot
• Number of transformations: 42
– Created via reuse: 25
• Number of triples: ~7.7M
ARPA Pilot
• Number of transformations: 5
– Created via reuse: 2
• Number of triples: ~14K
Forking/reusing transformations helped us spend less time on creating new transformations
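The share of transformations created via reuse follows directly from the pilot figures above; a short calculation, using only those reported counts:

```python
# (total transformations, created via reuse), as reported for each pilot.
pilots = {"TRAGSA": (42, 25), "ARPA": (5, 2)}

reuse_share = {
    name: reused / total for name, (total, reused) in pilots.items()
}

for name, share in reuse_share.items():
    print(f"{name}: {share:.1%} of transformations created via reuse")
```

That is roughly 60% for TRAGSA and 40% for ARPA.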
![Page 46: Data-as-a-Service: DataGraft](https://reader031.vdocuments.site/reader031/viewer/2022030210/58a46cc61a28aba34c8b45e5/html5/thumbnails/46.jpg)
DataGraft key feature: Reliable data hosting and querying services
• Host data on DataGraft’s reliable, cloud-based semantic graph database
• Share data privately or publicly
• Query data through your own SPARQL endpoint
• Programmatically access the data catalogue
Operations & maintenance performed on behalf of users
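From an application's point of view, a hosted SPARQL endpoint is an HTTP service that follows the SPARQL 1.1 Protocol. The sketch below builds a query request with the standard library and flattens a response in the SPARQL JSON results format; the endpoint URL and the sample response are hypothetical, and nothing is sent over the network:

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_sparql_request(endpoint, query):
    """Build (but do not send) a SPARQL 1.1 Protocol GET request."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def bindings_to_rows(results_json):
    """Flatten a SPARQL JSON results document into a list of plain dicts."""
    data = json.loads(results_json)
    return [
        {var: cell["value"] for var, cell in binding.items()}
        for binding in data["results"]["bindings"]
    ]


# Hypothetical endpoint; a real one would be the URL assigned to your dataset.
request = build_sparql_request(
    "https://example.org/sparql",
    "SELECT * WHERE { ?s ?p ?o } LIMIT 1",
)

# Hand-written sample response, in place of a live call.
sample_response = json.dumps({
    "head": {"vars": ["s", "p", "o"]},
    "results": {"bindings": [{
        "s": {"type": "uri", "value": "http://example.org/resource/field-1"},
        "p": {"type": "uri", "value": "http://example.org/vocab#name"},
        "o": {"type": "literal", "value": "Example field"},
    }]},
})
rows = bindings_to_rows(sample_response)
```

To run the query for real, pass `request` to `urllib.request.urlopen` and feed the decoded response body to `bindings_to_rows`.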
![Page 47: Data-as-a-Service: DataGraft](https://reader031.vdocuments.site/reader031/viewer/2022030210/58a46cc61a28aba34c8b45e5/html5/thumbnails/47.jpg)
DataGraft Enablers
• Grafter
• Grafterizer
• Semantic Graph DBaaS
• Data Portal
![Page 48: Data-as-a-Service: DataGraft](https://reader031.vdocuments.site/reader031/viewer/2022030210/58a46cc61a28aba34c8b45e5/html5/thumbnails/48.jpg)
DataGraft – 1 package, 2 audiences
• Data Publisher: help with integrating and publishing data
• Application Developer: better, easier tools
![Page 49: Data-as-a-Service: DataGraft](https://reader031.vdocuments.site/reader031/viewer/2022030210/58a46cc61a28aba34c8b45e5/html5/thumbnails/49.jpg)
DataGraft – targeted impacts
• Reduction in costs for organisations which lack sufficient expertise and resources to make their data available
• Reduction in the dependency of data owners on generic Cloud platforms to build, deploy and maintain their linked data from scratch
• Increase in the speed of publishing new datasets and updating existing datasets
• Reduction in the cost and complexity of developing applications that use data
• Increase in the reuse of data by providing reliable access to numerous datasets hosted on DataGraft.net
![Page 50: Data-as-a-Service: DataGraft](https://reader031.vdocuments.site/reader031/viewer/2022030210/58a46cc61a28aba34c8b45e5/html5/thumbnails/50.jpg)
Example: The benefit of DataGraft in PLUQI
Before: Datasets gathering, Data transformation, Data provisioning/access, Implementing App
After (with DataGraft): Datasets gathering, Data transformation, Data provisioning/access, Implementing App
1. 23% reduction in development cost
– Reducing cost for implementing transformations
– Integrating the process is simpler
2. Able to focus on service quality
– Gathering enough good datasets
– Designing/implementing
![Page 51: Data-as-a-Service: DataGraft](https://reader031.vdocuments.site/reader031/viewer/2022030210/58a46cc61a28aba34c8b45e5/html5/thumbnails/51.jpg)
DataGraft in numbers (as of end of Jan 2016)
• 238 registered users
• 607 registered data transformations (208 public)
• 1828 uploaded files
• 192 public data pages
![Page 52: Data-as-a-Service: DataGraft](https://reader031.vdocuments.site/reader031/viewer/2022030210/58a46cc61a28aba34c8b45e5/html5/thumbnails/52.jpg)
DataGraft in the wild
• Investigating crime data in small geographies
• Used DataGraft to transform data and publish RDF
http://benproctor.co.uk/investigating-crime-data-at-small-geographies/
![Page 53: Data-as-a-Service: DataGraft](https://reader031.vdocuments.site/reader031/viewer/2022030210/58a46cc61a28aba34c8b45e5/html5/thumbnails/53.jpg)
Data Science and DataGraft
Greater Data Science:
1. Data Exploration and Preparation
2. Data Representation and Transformation
3. Computing with Data
4. Data Visualization and Presentation
5. Data Modeling
6. Science about Data Science
“50 years of Data Science” by David Donoho
http://courses.csail.mit.edu/18.337/2015/docs/50YearsDataScience.pdf
DataGraft
![Page 54: Data-as-a-Service: DataGraft](https://reader031.vdocuments.site/reader031/viewer/2022030210/58a46cc61a28aba34c8b45e5/html5/thumbnails/54.jpg)
Summary
• DataGraft – emerging Data-as-a-Service solution for making (linked) data more accessible
– Platform, portal, methodology, APIs
– Online service, functional and documented
– Validated through several use cases
• Key features:
– Support for Sharable/Repeatable/Reusable Data Transformations
– Reliable RDF Database-as-a-Service