Flink case study: Okkam
The company (briefly)
• Okkam is
– an SME based in Trento, Italy
– started as a spin-off of the University of Trento and FBK (2010)
• Okkam's core business is
– large-scale data integration using semantic technologies and an Entity Name System
• Okkam's operative sectors
– Services for public administration
– Services for restaurants (and more)
– Research projects
• FP7, H2020, and local agencies
Who we are
• Stefano Bortoli, PhD
– works as technical director and researcher at Okkam S.R.L. (Trento, Italy). His research and development interests are in the area of Information Integration, with a special focus on entity-centric applications exploiting semantic technologies.
• Flavio Pompermaier, MSc
– works as senior software engineer at Okkam S.R.L. (Trento, Italy). Flavio is a passionate developer working with state-of-the-art technologies, combining semantic and big data technologies.
What we do
Why we need Flink
Entiton data model
[Diagram labels: database record, RDF statement, triplestore, NoSQL & index, expensive data warehouse. Quad fields: provenance IRI, predicate, object, object type, subject local IRI, subject ENS IRI, RDF type]
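The quad labels on the Entiton data model slide (provenance IRI, predicate, object, object type, subject local IRI, subject ENS IRI, RDF type) can be read as the fields of a single statement record. Below is a minimal Java sketch of that reading. The class name `EntitonQuad`, the method `toNQuad`, the choice of the ENS IRI as the serialized subject, the use of the provenance IRI as the graph label, and the treatment of the object type as a literal datatype IRI are all assumptions made for illustration; the slides do not show Okkam's actual classes.

```java
import java.util.Objects;

/** Hypothetical holder for one quad; field names follow the slide labels. */
final class EntitonQuad {
    final String subjectLocalIri; // slide label: "Subject local IRI"
    final String subjectEnsIri;   // slide label: "Subject ENS IRI"
    final String rdfType;         // slide label: "RDF Type"
    final String predicate;       // slide label: "predicate"
    final String object;          // slide label: "object" (lexical value)
    final String objectType;      // slide label: "object Type" (assumed to be a datatype IRI)
    final String provenanceIri;   // slide label: "provenance IRI"

    EntitonQuad(String subjectLocalIri, String subjectEnsIri, String rdfType,
                String predicate, String object, String objectType,
                String provenanceIri) {
        this.subjectLocalIri = Objects.requireNonNull(subjectLocalIri);
        this.subjectEnsIri = Objects.requireNonNull(subjectEnsIri);
        this.rdfType = Objects.requireNonNull(rdfType);
        this.predicate = Objects.requireNonNull(predicate);
        this.object = Objects.requireNonNull(object);
        this.objectType = Objects.requireNonNull(objectType);
        this.provenanceIri = Objects.requireNonNull(provenanceIri);
    }

    /**
     * Serializes the statement as one N-Quads line.
     * Assumption: the ENS IRI is the subject and the provenance IRI is the graph label.
     */
    String toNQuad() {
        // Escape backslashes first, then double quotes, inside the literal.
        String literal = object.replace("\\", "\\\\").replace("\"", "\\\"");
        return "<" + subjectEnsIri + "> <" + predicate + "> \"" + literal
                + "\"^^<" + objectType + "> <" + provenanceIri + "> .";
    }
}
```

Keeping both subject IRIs on the record would let a job group statements either by source-local identifier or by the global ENS identifier, though whether Okkam's pipeline does so is not stated in the slides.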
Why we are here
• We want to build and manage (very) large entity-centric knowledge bases
• We have backed Flink as our data processing framework since its Stratosphere days (during the DOPA FP7 project)
• Our use cases for Apache Flink:
– Domain reasoning (Flink + Parquet + Thrift)
– RDF data lifecycle (Flink + Parquet + Jena/Sesame)
– RDF data intelligence (Flink + ELKiBi)
– Duplicate record detection (Flink + HBase + Solr)
– Entiton record linkage (Flink + MongoDB + Kryo)
– Telemetry analysis (Flink + MongoDB + Weka)
Come to our session!
• We are the last to present, don't leave us ALONE!
• We are hiring! (maybe ;-)