Flink case study: Okkam

A Semantic Big Data Companion

Stefano Bortoli

Flavio Pompermaier

The company (briefly)

• Okkam is
  – an SME based in Trento, Italy
  – started as a spin-off of the University of Trento and FBK (2010)

• Okkam core business is
  – large-scale data integration using semantic technologies and an Entity Name System

• Okkam operative sectors
  – Services for public administration
  – Services for restaurants (and more)
  – Research projects
    • FP7, H2020, and local agencies

Who we are

• Stefano Bortoli, PhD – works as technical director and researcher at Okkam S.R.L. (Trento, Italy). His research and development interests are in the area of information integration, with a special focus on entity-centric applications exploiting semantic technologies.

• Flavio Pompermaier, MSc – works as senior software engineer at Okkam S.R.L. (Trento, Italy). Flavio is a passionate developer working with state-of-the-art technologies, combining semantic and big data technologies.

What we do

Why we need Flink

Entiton data model

[Diagram labels: Database record, RDF statement, Triplestore, NoSQL & Index, Expensive data warehouse]

[Quad fields: provenance IRI, predicate, object, object type, subject local IRI, subject ENS IRI, RDF type]
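The quad fields named on the Entiton data model slide can be pictured as a simple record. The sketch below is only an illustration under stated assumptions: the class name `EntitonQuad`, the helper `group_by_subject`, the reading of an entiton as "all quads sharing one subject", and the optional ENS IRI are all hypothetical and not taken from Okkam's implementation.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class EntitonQuad:
    """One statement about an entity; field names mirror the slide labels."""
    subject_local_iri: str          # subject identifier in the source dataset
    subject_ens_iri: Optional[str]  # Entity Name System identifier (assumed optional)
    predicate: str
    object_value: str               # the "object" label on the slide
    object_type: str                # e.g. a datatype IRI (assumed encoding)
    rdf_type: str                   # RDF type of the subject
    provenance_iri: str             # source of the statement


def group_by_subject(quads: Iterable[EntitonQuad]) -> Dict[str, List[EntitonQuad]]:
    """Group quads by local subject IRI (assumed reading of an 'entiton')."""
    grouped: Dict[str, List[EntitonQuad]] = {}
    for quad in quads:
        grouped.setdefault(quad.subject_local_iri, []).append(quad)
    return grouped
```

Grouping by subject keeps every statement about one entity together, which is the kind of entity-centric access a plain triplestore or a record-oriented database does not give directly.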

Why we are here

• We want to build and manage (very) large entity-centric knowledge bases

• We have used Flink as our data processing framework since it was still Stratosphere (during the DOPA FP7 project)

• Our use cases for Apache Flink:
  – Domain reasoning (Flink + Parquet + Thrift)
  – RDF data lifecycle (Flink + Parquet + Jena/Sesame)
  – RDF data intelligence (Flink + ELKiBi)
  – Duplicate record detection (Flink + HBase + Solr)
  – Entiton record linkage (Flink + MongoDB + Kryo)
  – Telemetry analysis (Flink + MongoDB + Weka)

Come to our session!

• We are presenting last, so don’t leave us alone!

• We are hiring! (maybe ;-)

