bde sc6 workshop - introduction 2016

THE BIG DATA EUROPE PROJECT: STATUS & NEXT STEPS SC6 Workshop, Cologne 05 December 2016

Upload: bigdataeurope

Post on 15-Jan-2017


THE BIG DATA EUROPE PROJECT:

STATUS & NEXT STEPS

SC6 Workshop, Cologne, 05 December 2016

Supporting the Societal Domains with Big Data Technology

BigDataEurope Project

www.big-data-europe.eu

Stakeholder Engagement Cycle

Present action, showcase deployments

Raise awareness about BDE results, what they mean for stakeholders

Collect requirements to drive further development

[Figure: engagement cycle timeline with marks at M6, M12, M18, M24, M30]

Data Value Chain Evolution

[Figure: the data value chain over time.
Value chain stages: Extraction, Curation, Quality, Linking, Integration, Publication, Visualization, Analysis.
Domains: Health, Transport, Security, Food, Societies, Climate, Energy.
Data sources: Data Repositories, Linked Open Data.
TIME axis: from proprietary, ‘locked-in’ solutions to OS solutions and Big Data stacks.]

A flexible, generic platform for (Big) Data Value Chain Deployment

Big Data Integrator

Big Data Integrator

Prototype developed by BDE
o Incorporates existing BD technology
o Facilitates integration and deployment

Main points of the architecture
o Dockerization
o Support layer, including integrated UI
o Semantification layer

Generic Architecture


Plug-and-play BD Platform

Cloud-deployment ready

Domain independent, Customisable

Stacks Open Source solutions

BDI Prototype Releases
1. [July 2016]
2. December 2016
3. ….

Demonstrating the Societal Value through 7 Pilot ‘Real-world’ use-cases

BigDataEurope Pilots

7 Pilots

◎ BDI Platform Instantiations
o Allow end-users to easily deploy functionality in their own system environment
o Modularized Docker approach: easier to replace components
o Reduces effort to keep 3rd-party software updated & integrated

◎ 7 Societal Challenge Pilots
o Aligned with the 7 European Commission H2020 Societal Challenges
o Real-world use-cases (Data, Objectives, Solutions)
o Some pilots have different data & objectives but a similar solution

SC1: Pharmacology research

Life Sciences & Health

• Query a large number of datasets, some large

• Existing elaborate ingestion and homogenization by OpenPHACTS

• Extensive toolset developed by OPF and others

Objective: Large-scale heterogeneous pharma-research data linking & integration

SC1: Architecture & Components

• Replicate Open PHACTS functionality on the BDE infrastructure using OS solutions
  • Based on Virtuoso, proprietary distributed database
• Apply to other domains (e.g. Agriculture)
• Porting to BDI gives flexibility and enables new functionalities
  • Logging & system health monitoring

SC2: Viticulture resources

Food and Agriculture

• AgInfra is a major infrastructure for agriculture researchers, serving cross-linked bibliography, data, and processing services

Objective: Automate publication ingestion and thematic classification

SC2: Architecture & Components

• BDI deployed as an external infrastructure for processing text (viticulture publications)

• Storing and processing text at a larger scale than AgInfra can currently manage

SC3: Predictive maintenance


Energy

• Wind turbine monitoring applies computational models to sensor data streams

• Models are re-parameterized weekly using that week’s data from multiple turbines

Objective: Real-time turbine monitoring stream processing and analytics

SC3: Architecture & Components

• Existing in-house non-scalable solution for model parameterization
  • Reliable Fortran software for data analysis
  • Efficient, but not scalable to data volume
• Developing a BDI orchestrator
  • Re-uses existing software unmodified
  • Makes it easy to apply in parallel to many datasets and manage the outputs
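The orchestrator idea for SC3 (run existing software unmodified, in parallel over many datasets, and collect the outputs) can be sketched roughly as follows. This is a minimal illustration, not the BDI orchestrator itself: the function names, the dataset names and the stand-in command are all assumptions, and a small Python one-liner takes the place of the real Fortran analysis program.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_analysis(command, dataset):
    """Run the unmodified analysis program on one dataset and capture its output."""
    result = subprocess.run(
        command + [dataset],
        capture_output=True,
        text=True,
        check=True,  # raise if the external program fails
    )
    return dataset, result.stdout.strip()


def orchestrate(command, datasets, max_workers=4):
    """Apply the same program to many datasets in parallel; outputs keyed by dataset."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(run_analysis, command, d) for d in datasets]
        return dict(f.result() for f in futures)


# Stand-in for the existing analysis binary (assumption): a Python one-liner
# that prints its single argument in upper case.
STAND_IN = [sys.executable, "-c", "import sys; print(sys.argv[1].upper())"]

if __name__ == "__main__":
    outputs = orchestrate(STAND_IN, ["turbine_a", "turbine_b", "turbine_c"])
    for name, out in sorted(outputs.items()):
        print(name, "->", out)
```

Threads are enough here because the work happens in the child processes; the wrapped program is called exactly as it would be from the command line.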

SC4: Traffic conditions estimation


Transport

• Combines:
  • Traffic modelling from historical data
  • Current measurements from a taxi fleet of 1200 vehicles

Objective: Estimation of real-time traffic conditions in Thessaloniki

SC4: Architecture & Components

• New Flink implementations of map matching and traffic prediction algorithms
• BDI provides access to varied data sources
  • PostGIS database with city map
  • ElasticSearch database of historical data
  • Kafka stream of real-time data
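To give an idea of what map matching does in SC4, here is a deliberately simplified sketch in plain Python: each GPS point is snapped to the nearest road segment. It is not the pilot's Flink implementation and not its algorithm; the road network, coordinates (planar, in metres) and function names are hypothetical.

```python
import math


def project_onto_segment(p, a, b):
    """Return the point on segment a-b closest to p (2-D tuples, planar coordinates)."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        return a  # degenerate segment: both ends coincide
    # Position of the projection along the segment, clamped to its ends.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return (ax + t * dx, ay + t * dy)


def match_point(p, segments):
    """Return (segment_id, snapped_point, distance) for the nearest segment."""
    best = None
    for seg_id, (a, b) in segments.items():
        q = project_onto_segment(p, a, b)
        d = math.hypot(p[0] - q[0], p[1] - q[1])
        if best is None or d < best[2]:
            best = (seg_id, q, d)
    return best


# Hypothetical two-road network in metres (not real Thessaloniki data).
ROADS = {
    "main_street": ((0.0, 0.0), (100.0, 0.0)),
    "side_street": ((50.0, 0.0), (50.0, 80.0)),
}

if __name__ == "__main__":
    print(match_point((30.0, 4.0), ROADS))
```

A production matcher would also use heading, speed and the sequence of previous points; nearest-segment snapping is only the simplest starting point.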

SC5: Climate modelling


Climate

• Preparing modelling experiments
  • Slicing, transforming, combining datasets
  • Submission and retrieval from modelling infrastructure
• Discovering and re-using previously computed derivatives
  • Lineage annotation: derivatives computed from datasets and model parameters
  • Finding appropriate past runs avoids repeating weeks-long modelling runs

Objective: Supporting data-intensive climate research

SC5: Architecture & Components

• BDI offers:
  • Hive for managing data in a way that can be retrieved and manipulated, rather than file blocks
  • Cassandra stores structured and textual metadata for searching headers and lineage
• Existing infrastructure; stable, reliable software for parallel computation of models
• BDI is deployed as an external infrastructure for preparing and managing datasets
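The lineage idea in SC5 (record which datasets and model parameters a derivative was computed from, then look up past runs before launching a new one) might be sketched as below. Everything here is an assumption for illustration: the class and function names, the in-memory dictionary standing in for the metadata store, and the choice to treat the order of input datasets as irrelevant.

```python
import hashlib
import json


def lineage_key(dataset_ids, model_params):
    """Deterministic key: the same datasets and parameters always give the same key."""
    # Assumption: the order in which input datasets are listed does not matter.
    payload = {"datasets": sorted(dataset_ids), "params": model_params}
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class DerivativeCatalogue:
    """In-memory stand-in for a metadata store holding lineage annotations."""

    def __init__(self):
        self._runs = {}

    def register(self, dataset_ids, model_params, result_location):
        """Record where the derivative computed from these inputs is stored."""
        self._runs[lineage_key(dataset_ids, model_params)] = result_location

    def find(self, dataset_ids, model_params):
        """Return the stored result location, or None if no matching past run exists."""
        return self._runs.get(lineage_key(dataset_ids, model_params))


if __name__ == "__main__":
    catalogue = DerivativeCatalogue()
    catalogue.register(["obs_2010", "obs_2011"], {"resolution_km": 12}, "runs/0001")
    # Same inputs in a different order: the past run is found and can be re-used.
    print(catalogue.find(["obs_2011", "obs_2010"], {"resolution_km": 12}))
    # Different parameters: nothing found, so a new modelling run would be needed.
    print(catalogue.find(["obs_2010", "obs_2011"], {"resolution_km": 4}))
```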

SC6: Municipality budgets

Social Sciences

• Ingestion of budget and budget execution data

• Multiple municipalities in varied formats and data models

Objective: Homogenized budgetary data made available for analysis and comparison

SC6: Architecture & Components

• BDI deployed as ingestion and storage infrastructure for external tools
  • Homogenizes variety of data (JSON, CSV, XML, etc.)
  • Exposes data as SPARQL endpoint serving homogenized data
• Existing analytics and visualization tools
  • Use SPARQL queries to retrieve only the relevant slices of the overall data
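The homogenization step in SC6 can be pictured with a small sketch that reads budget lines from JSON, CSV and XML into one common record shape. The field names, sample values and XML layout are invented for illustration; the pilot's actual data models are not given here, and the sketch stops at plain dictionaries rather than producing data for a SPARQL endpoint.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET


def record(municipality, year, amount):
    """Common target shape (hypothetical field names)."""
    return {"municipality": municipality, "year": int(year), "amount": float(amount)}


def from_json(text):
    return [record(r["municipality"], r["year"], r["amount"]) for r in json.loads(text)]


def from_csv(text):
    reader = csv.DictReader(io.StringIO(text))
    return [record(r["municipality"], r["year"], r["amount"]) for r in reader]


def from_xml(text):
    root = ET.fromstring(text)
    return [record(e.get("municipality"), e.get("year"), e.get("amount"))
            for e in root.iter("entry")]


PARSERS = {"json": from_json, "csv": from_csv, "xml": from_xml}


def homogenize(sources):
    """sources: iterable of (format, text) pairs; returns one flat list of records."""
    rows = []
    for fmt, text in sources:
        rows.extend(PARSERS[fmt](text))
    return rows


# Made-up sample inputs, one per format.
SAMPLE_SOURCES = [
    ("json", '[{"municipality": "A", "year": 2015, "amount": 1200.5}]'),
    ("csv", "municipality,year,amount\nB,2015,980\n"),
    ("xml", '<budget><entry municipality="C" year="2016" amount="431.25"/></budget>'),
]

if __name__ == "__main__":
    for row in homogenize(SAMPLE_SOURCES):
        print(row)
```

Once every source is in the same shape, downstream tools only need to understand one model, which is the point of serving the homogenized data from a single endpoint.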

SC7: Change detection & verification

Secure Societies

• Events are extracted from text published by news agencies and on social networking sites

• Events are geo-located and relevant changes are detected by comparing current and previous satellite images

Objective: Detect and Verify Events based on Satellite Imagery, News and Social Media

SC7: Architecture & Components

Event Detection

Change Detection

• Re-implementation of change detection algorithms for Spark
• Parallel orchestrator for text analytics
  • Re-uses existing software
  • Scales to many input streams
• BDI provides:
  • Cassandra for text content and metadata
  • Strabon GIS store for detected change location
  • Homogeneous access to both for analysis and visualization
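As a rough picture of what comparing current and previous satellite images means in SC7, the sketch below flags pixels whose intensity changed by more than a threshold. This is plain pixel differencing in ordinary Python, chosen only for illustration; it is not the pilot's change detection algorithm and does not use Spark. The image values and the threshold are made up.

```python
def detect_changes(previous, current, threshold):
    """Return (row, col) positions whose absolute intensity difference exceeds threshold."""
    same_shape = len(previous) == len(current) and all(
        len(p) == len(c) for p, c in zip(previous, current)
    )
    if not same_shape:
        raise ValueError("images must have the same shape")
    changed = []
    for r, (prev_row, cur_row) in enumerate(zip(previous, current)):
        for c, (p, q) in enumerate(zip(prev_row, cur_row)):
            if abs(q - p) > threshold:
                changed.append((r, c))
    return changed


# Two tiny made-up greyscale images (lists of rows) of the same area.
BEFORE = [
    [10, 10, 10],
    [10, 10, 10],
]
AFTER = [
    [10, 12, 10],
    [10, 90, 10],
]

if __name__ == "__main__":
    # Small differences (10 -> 12) are treated as noise; the large one is reported.
    print(detect_changes(BEFORE, AFTER, threshold=20))
```

Because each pixel is compared independently, this kind of computation splits naturally across image tiles, which is what makes it a candidate for a parallel engine.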

Free Workshops, Hangouts & Webinars

BigDataEurope Activities

2nd round of Societal Workshops

Domain       Date               Location  Notes
Transport    22 September 2016  Brussels  Collocated with Big Data for Transport, Tisa workshop
Food & Agri  30 September 2016  Brussels  Collocated with DG AGRI WP2018-20 stakeholder consultation
Energy       4 October 2016     Brussels  Collocated with EC H2020 Info Day on “Smart Grids and Storage”
Climate      11 October 2016    Brussels  Collocated with Melodies Project Event: Exploiting Open Data
Security     18 October 2016    Brussels  Standalone Workshop
Societies    5 December 2016    Cologne   Collocated with EDDI16, 8th Annual European DDI User Conference
Health       9 December 2016    Brussels  Standalone Workshop

Other Activities

◎ Fresh set (7) of Societal Workshops in 2017
◎ Various SC-focussed and general hangouts
o General (technical): 2 this year. More to follow!
o SC6: 2 so far, next in the coming weeks
o Recordings & Presentations available online!
o Keep track on BDE Website (Events)

WEB: www.big-data-europe.eu EMAIL: [email protected]

BIG DATA INTEGRATOR www.github.com/big-data-europe

PROJECT COORDINATION (Fraunhofer IAIS)
Prof. Sören Auer, auer © cs.uni-bonn · de
Dr. Simon Scerri, scerri © cs.uni-bonn · de
EIS Department/Group, Fraunhofer IAIS & CS Department Uni-Bonn, Bonn, Germany

Questions & Contacts

#BigDataEurope

leads the Fraunhofer Big Data Alliance

Please follow our project via

Website to receive event invites & newsletter (SC6 only, or otherwise).