![Page 1: Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'integrazione di dati](https://reader031.vdocuments.site/reader031/viewer/2022011722/58ee1c7f1a28abec1b8b45ef/html5/thumbnails/1.jpg)
Big Data and Semantic Technologies - Using Linked Data as a Driver for Data Integration
Giuseppe Futia
Nexa Center for Internet and Society, Politecnico di Torino (DAUIN)
27 July 2016
![Page 2: Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'integrazione di dati](https://reader031.vdocuments.site/reader031/viewer/2022011722/58ee1c7f1a28abec1b8b45ef/html5/thumbnails/2.jpg)
Outline
• Information management challenges and Big Data
• Linked Data framework (explained with examples)
• Linked Data approach for Big Data community
• The impact of Big Structured Data
![Page 3: Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'integrazione di dati](https://reader031.vdocuments.site/reader031/viewer/2022011722/58ee1c7f1a28abec1b8b45ef/html5/thumbnails/3.jpg)
Enterprise/Research Information Management Challenges
• Disparate data sources and data silos
• Data sources with similar/inconsistent information
• Most of the knowledge is hidden in texts (unstructured data)
• Difficult to integrate and analyse structured and unstructured data
![Page 4: Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'integrazione di dati](https://reader031.vdocuments.site/reader031/viewer/2022011722/58ee1c7f1a28abec1b8b45ef/html5/thumbnails/4.jpg)
The 3 V’s of Big Data
• Velocity
• Volume
• Variety
![Page 5: Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'integrazione di dati](https://reader031.vdocuments.site/reader031/viewer/2022011722/58ee1c7f1a28abec1b8b45ef/html5/thumbnails/5.jpg)
The 3 V’s of Big Data
• Velocity
• Volume
• Variety (Veracity and Value)
![Page 6: Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'integrazione di dati](https://reader031.vdocuments.site/reader031/viewer/2022011722/58ee1c7f1a28abec1b8b45ef/html5/thumbnails/6.jpg)
From Big Linked Data to Linked Big Data
![Page 7: Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'integrazione di dati](https://reader031.vdocuments.site/reader031/viewer/2022011722/58ee1c7f1a28abec1b8b45ef/html5/thumbnails/7.jpg)
Big Linked Data
![Page 8: Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'integrazione di dati](https://reader031.vdocuments.site/reader031/viewer/2022011722/58ee1c7f1a28abec1b8b45ef/html5/thumbnails/8.jpg)
Linked Data Cloud Diagram (2014)
Big Linked Data
![Page 9: Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'integrazione di dati](https://reader031.vdocuments.site/reader031/viewer/2022011722/58ee1c7f1a28abec1b8b45ef/html5/thumbnails/9.jpg)
Linked Data Vision (W3C)
• Extend principles of the Web from documents to data
• Data should be accessed using the general Web architecture (e.g., URIs, HTTP, …)
• Data should be linked to each other, just as documents are
• Creation of a common framework that allows:
– Data to be shared and reused across applications
– Data to be processed automatically
– New relationships between pieces of data to be inferred
![Page 10: Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'integrazione di dati](https://reader031.vdocuments.site/reader031/viewer/2022011722/58ee1c7f1a28abec1b8b45ef/html5/thumbnails/10.jpg)
Resource Description Framework
• Everything is a triple – Subject (resource), Predicate (relation), Object (resource or literal)
• The Resource Description Framework (RDF) graph is a collection of triples
[Diagram: subject → predicate → object]
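The triple model above can be sketched in a few lines of plain Python. This is only an illustration: the `http://example.org/` namespace, the `affiliation` and `name` predicates, and the helper names are assumptions made for the example, not part of the slides or of any RDF library.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str    # a resource, identified by a URI
    predicate: str  # a relation, identified by a URI
    obj: str        # a resource URI, or a literal written in double quotes


EX = "http://example.org/"  # hypothetical namespace, for illustration only

# An RDF graph is simply a collection (here: a set) of triples.
graph = {
    Triple(EX + "GiuseppeFutia", EX + "affiliation", EX + "NexaCenter"),
    Triple(EX + "GiuseppeFutia", EX + "name", '"Giuseppe Futia"'),
    Triple(EX + "NexaCenter", EX + "partOf", EX + "PolitecnicoDiTorino"),
}


def objects(graph, subject, predicate):
    """Return every object linked to `subject` through `predicate`."""
    return {t.obj for t in graph if t.subject == subject and t.predicate == predicate}


def to_ntriples(triple):
    """Serialise one triple in N-Triples style: URIs in <>, literals kept quoted."""
    obj = triple.obj if triple.obj.startswith('"') else f"<{triple.obj}>"
    return f"<{triple.subject}> <{triple.predicate}> {obj} ."


for t in sorted(graph):
    print(to_ntriples(t))
```

Because both resources and relations are URIs, the object of one triple can be the subject of another, which is what turns a flat list of statements into a graph.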
![Page 11: Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'integrazione di dati](https://reader031.vdocuments.site/reader031/viewer/2022011722/58ee1c7f1a28abec1b8b45ef/html5/thumbnails/11.jpg)
SPARQL
• SQL-like query language for RDF data
• Simple protocol for querying remote databases over HTTP
• Query types
– select: query data by complex graph pattern
– ask: whether a query returns results (result is true/false)
– describe: returns all triples about a particular resource
– construct: create new triples based on query results
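As a rough sketch of the four query types and of the HTTP side of SPARQL, the snippet below prepares (without sending) a GET request in the style of the SPARQL protocol, where the query text travels in a `query` parameter. The endpoint URL, the `example.org` vocabulary, and the function name are assumptions for illustration; a real endpoint may also expect different `Accept` values.

```python
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "http://example.org/sparql"  # hypothetical endpoint

# One illustrative query per form; the vocabulary is made up.
QUERIES = {
    "select": "SELECT ?s ?o WHERE { ?s <http://example.org/affiliation> ?o } LIMIT 10",
    "ask": "ASK { ?s <http://example.org/affiliation> ?o }",
    "describe": "DESCRIBE <http://example.org/GiuseppeFutia>",
    "construct": (
        "CONSTRUCT { ?o <http://example.org/hasMember> ?s } "
        "WHERE { ?s <http://example.org/affiliation> ?o }"
    ),
}


def build_request(endpoint, form, query):
    """Prepare a GET request carrying the query in the `query` parameter."""
    # select/ask return result tables or booleans; describe/construct return RDF.
    if form in ("select", "ask"):
        accept = "application/sparql-results+json"
    else:
        accept = "text/turtle"
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": accept})


for form, query in QUERIES.items():
    req = build_request(ENDPOINT, form, query)
    print(form, req.get_method(), req.get_header("Accept"))
```

Passing a prepared request to `urllib.request.urlopen` would actually send it; that step is left out here because the endpoint is fictional.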
![Page 12: Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'integrazione di dati](https://reader031.vdocuments.site/reader031/viewer/2022011722/58ee1c7f1a28abec1b8b45ef/html5/thumbnails/12.jpg)
Nexa projects
![Page 13: Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'integrazione di dati](https://reader031.vdocuments.site/reader031/viewer/2022011722/58ee1c7f1a28abec1b8b45ef/html5/thumbnails/13.jpg)
Public contracts
![Page 14: Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'integrazione di dati](https://reader031.vdocuments.site/reader031/viewer/2022011722/58ee1c7f1a28abec1b8b45ef/html5/thumbnails/14.jpg)
The PEC (certified e-mail) addresses of Italian municipalities with more than 100,000 inhabitants that publish contracts with anomalies
![Page 15: Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'integrazione di dati](https://reader031.vdocuments.site/reader031/viewer/2022011722/58ee1c7f1a28abec1b8b45ef/html5/thumbnails/15.jpg)
?
![Page 16: Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'integrazione di dati](https://reader031.vdocuments.site/reader031/viewer/2022011722/58ee1c7f1a28abec1b8b45ef/html5/thumbnails/16.jpg)
![Page 17: Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'integrazione di dati](https://reader031.vdocuments.site/reader031/viewer/2022011722/58ee1c7f1a28abec1b8b45ef/html5/thumbnails/17.jpg)
![Page 18: Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'integrazione di dati](https://reader031.vdocuments.site/reader031/viewer/2022011722/58ee1c7f1a28abec1b8b45ef/html5/thumbnails/18.jpg)
![Page 19: Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'integrazione di dati](https://reader031.vdocuments.site/reader031/viewer/2022011722/58ee1c7f1a28abec1b8b45ef/html5/thumbnails/19.jpg)
![Page 20: Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'integrazione di dati](https://reader031.vdocuments.site/reader031/viewer/2022011722/58ee1c7f1a28abec1b8b45ef/html5/thumbnails/20.jpg)
![Page 21: Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'integrazione di dati](https://reader031.vdocuments.site/reader031/viewer/2022011722/58ee1c7f1a28abec1b8b45ef/html5/thumbnails/21.jpg)
![Page 22: Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'integrazione di dati](https://reader031.vdocuments.site/reader031/viewer/2022011722/58ee1c7f1a28abec1b8b45ef/html5/thumbnails/22.jpg)
![Page 23: Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'integrazione di dati](https://reader031.vdocuments.site/reader031/viewer/2022011722/58ee1c7f1a28abec1b8b45ef/html5/thumbnails/23.jpg)
![Page 24: Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'integrazione di dati](https://reader031.vdocuments.site/reader031/viewer/2022011722/58ee1c7f1a28abec1b8b45ef/html5/thumbnails/24.jpg)
![Page 25: Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'integrazione di dati](https://reader031.vdocuments.site/reader031/viewer/2022011722/58ee1c7f1a28abec1b8b45ef/html5/thumbnails/25.jpg)
![Page 26: Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'integrazione di dati](https://reader031.vdocuments.site/reader031/viewer/2022011722/58ee1c7f1a28abec1b8b45ef/html5/thumbnails/26.jpg)
TellMeFirst
A Knowledge Discovery Application
![Page 27: Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'integrazione di dati](https://reader031.vdocuments.site/reader031/viewer/2022011722/58ee1c7f1a28abec1b8b45ef/html5/thumbnails/27.jpg)
TellMeFirst Architecture http://tellmefirst.polito.it
![Page 28: Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'integrazione di dati](https://reader031.vdocuments.site/reader031/viewer/2022011722/58ee1c7f1a28abec1b8b45ef/html5/thumbnails/28.jpg)
“The final work of legendary director Stanley Kubrick, who died within a week of completing the edit, is based upon a novel by Arthur Schnitzler. Tom Cruise and Nicole Kidman play William and Alice Harford, a physician and a gallery manager who are wealthy, successful, and travel in a sophisticated social circle.”
![Page 29: Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'integrazione di dati](https://reader031.vdocuments.site/reader031/viewer/2022011722/58ee1c7f1a28abec1b8b45ef/html5/thumbnails/29.jpg)
![Page 30: Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'integrazione di dati](https://reader031.vdocuments.site/reader031/viewer/2022011722/58ee1c7f1a28abec1b8b45ef/html5/thumbnails/30.jpg)
Linked Big Data
![Page 31: Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'integrazione di dati](https://reader031.vdocuments.site/reader031/viewer/2022011722/58ee1c7f1a28abec1b8b45ef/html5/thumbnails/31.jpg)
Linked Data approach adopted by the Big Data community
• RDF data model for Variety
– Flexible, easy to evolve data model
– Efficiently integrate structured and unstructured data
• Enrich Big Data with metadata and semantics
– More powerful analytics on top of it
– Discover implicit links and relationships
• Interlink Big Data sets
– Information interchange across a value chain
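A minimal sketch of why a triple-based model helps with Variety: two sources that use the same URI for the same entity can be merged with a plain set union, and links that neither source contains on its own become traversable. The namespace, predicates, and data below are invented for the example.

```python
EX = "http://example.org/"  # hypothetical namespace

# Source A: triples derived from a structured table of contracts.
structured = {
    (EX + "contract/1", EX + "awardedTo", EX + "company/acme"),
    (EX + "contract/1", EX + "amount", 50000),
}

# Source B: triples extracted from unstructured text.
from_text = {
    (EX + "company/acme", EX + "mentionedIn", EX + "article/42"),
    (EX + "company/acme", EX + "locatedIn", EX + "city/Torino"),
}

# Integration is a set union: no shared schema has to be designed first,
# and new predicates can appear later without migrating existing data.
merged = structured | from_text


def follow(graph, start, *predicates):
    """Follow a chain of predicates from `start`; return the nodes reached."""
    frontier = {start}
    for predicate in predicates:
        frontier = {o for (s, p, o) in graph if s in frontier and p == predicate}
    return frontier


# "Where is the company that won contract 1 located?" spans both sources.
print(follow(merged, EX + "contract/1", EX + "awardedTo", EX + "locatedIn"))
```

The same question asked against `structured` alone returns an empty set, since the location only exists in the second source; the shared company URI is what connects them.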
![Page 32: Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'integrazione di dati](https://reader031.vdocuments.site/reader031/viewer/2022011722/58ee1c7f1a28abec1b8b45ef/html5/thumbnails/32.jpg)
Semantic technologies for Big Data
![Page 33: Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'integrazione di dati](https://reader031.vdocuments.site/reader031/viewer/2022011722/58ee1c7f1a28abec1b8b45ef/html5/thumbnails/33.jpg)
Blazegraph and DASL
• Blazegraph is a high performance graph database platform that supports RDF/SPARQL APIs
• In 2016 Blazegraph introduced a programming environment called DASL
• DASL supports the development of graph algorithms within the Apache Spark ecosystem specifically optimised for GPUs
• Targets complex graph analytic environments, especially where relationships are unknown in advance
![Page 34: Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'integrazione di dati](https://reader031.vdocuments.site/reader031/viewer/2022011722/58ee1c7f1a28abec1b8b45ef/html5/thumbnails/34.jpg)
EP-SPARQL
• Event processing provides on-the-fly analysis of event streams, but cannot combine streams with background knowledge and cannot perform reasoning tasks
• Semantic tools can effectively handle background knowledge and perform reasoning tasks, but cannot deal with rapidly changing data provided by event streams
• Event Processing SPARQL (EP-SPARQL) is a new language for complex event processing and stream reasoning
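The idea of joining an event stream with static background knowledge can be illustrated in plain Python. This is not EP-SPARQL syntax and does not use any EP-SPARQL engine; the sensors, rooms, event kinds, and function names are all hypothetical, and events are assumed to arrive in timestamp order.

```python
from collections import deque

# Static background knowledge, as (subject, predicate, object) triples.
BACKGROUND = {
    ("sensor1", "locatedIn", "roomA"),
    ("sensor2", "locatedIn", "roomA"),
    ("sensor3", "locatedIn", "roomB"),
}


def location(sensor):
    """Look up the room of a sensor in the background knowledge."""
    for s, p, o in BACKGROUND:
        if s == sensor and p == "locatedIn":
            return o
    return None


def detect(events, window):
    """Yield (room, t_smoke, t_heat) when a 'smoke' event is followed by a
    'heat' event in the same room within `window` seconds."""
    recent_smoke = deque()  # (timestamp, room) of smoke events still in the window
    for t, sensor, kind in events:
        room = location(sensor)
        if room is None:
            continue  # unknown sensor: nothing to join against
        # Expire smoke events that are older than the window.
        while recent_smoke and t - recent_smoke[0][0] > window:
            recent_smoke.popleft()
        if kind == "smoke":
            recent_smoke.append((t, room))
        elif kind == "heat":
            for t0, r0 in recent_smoke:
                if r0 == room:
                    yield (room, t0, t)


events = [
    (0, "sensor1", "smoke"),
    (3, "sensor2", "heat"),   # same room as sensor1, only via background knowledge
    (4, "sensor3", "heat"),   # different room
    (20, "sensor2", "heat"),  # same room, but outside the 10-second window
]
print(list(detect(events, window=10)))
```

The match between `sensor1` and `sensor2` exists only because the background triples place both in the same room, which is the kind of combination the slide says pure event processing cannot express.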
![Page 35: Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'integrazione di dati](https://reader031.vdocuments.site/reader031/viewer/2022011722/58ee1c7f1a28abec1b8b45ef/html5/thumbnails/35.jpg)
The impact of Big Structured Data
![Page 36: Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'integrazione di dati](https://reader031.vdocuments.site/reader031/viewer/2022011722/58ee1c7f1a28abec1b8b45ef/html5/thumbnails/36.jpg)
Google Knowledge Graph
Freebase-to-Wikidata transition
![Page 37: Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'integrazione di dati](https://reader031.vdocuments.site/reader031/viewer/2022011722/58ee1c7f1a28abec1b8b45ef/html5/thumbnails/37.jpg)
Facebook’s Social Graph (in 2013)
The Graph API is the primary way to get (our) data in and out of Facebook's social graph
![Page 38: Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'integrazione di dati](https://reader031.vdocuments.site/reader031/viewer/2022011722/58ee1c7f1a28abec1b8b45ef/html5/thumbnails/38.jpg)
The Facebook Web is becoming progressively smarter than the Web of Data…
![Page 39: Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'integrazione di dati](https://reader031.vdocuments.site/reader031/viewer/2022011722/58ee1c7f1a28abec1b8b45ef/html5/thumbnails/39.jpg)
Open source, libre contents, and linked data as a framework to build an open linked big data graph
![Page 40: Big Data e tecnologie semantiche - Utilizzare i Linked data come driver d'integrazione di dati](https://reader031.vdocuments.site/reader031/viewer/2022011722/58ee1c7f1a28abec1b8b45ef/html5/thumbnails/40.jpg)
Thank you!
GitHub repository: https://github.com/giuseppefutia/