BDE SC6 Workshop - Introduction 2016
TRANSCRIPT
Supporting the Societal Domains with Big Data Technology
BigDataEurope Project
1 May 2023 www.big-data-europe.eu
Stakeholder Engagement Cycle
Present action, showcase deployments
Raise awareness about BDE results, what they mean for stakeholders
Collect requirements to drive further development
Timeline: M6, M12, M18, M24, M30
Data Value Chain Evolution
[Figure: data value chain evolution over time]
Value chain stages: Extraction, Curation, Quality, Linking, Integration, Publication, Visualization, Analysis
Domains: Health, Food, Energy, Transport, Climate, Societies, Security
Data sources: Data Repositories, Linked Open Data
Axis: Time
Labels: Proprietary, ‘locked-in’ solutions; OS Solutions, Big Data Stacks
A flexible, generic platform for (Big) Data Value Chain Deployment
Big Data Integrator
Big Data Integrator Prototype developed by BDE
o Incorporates existing BD technology
o Facilitates integration and deployment
Main points of the architecture
o Dockerization
o Support layer, including integrated UI
o Semantification layer
Generic Architecture
Plug-and-play BD Platform
Cloud-deployment ready
Domain independent, Customisable
Stacks Open Source solutions BDI Prototype
Releases
1. [July 2016]
2. December 2016
3. ….
Demonstrating the Societal Value through 7 Pilot ‘Real-world’ use-cases
BigDataEurope Pilots
7 Pilots
◎ BDI Platform Instantiations
o Allow end-users to easily deploy functionality in their own system environment
o Modularized Docker approach - easier to replace components
o Reduces effort to keep 3rd party software updated & integrated
◎ 7 Societal Challenge Pilots
o Aligned with the 7 European Commission H2020 Societal Challenges
o Real-world use-cases (Data, Objectives, Solutions)
o Some pilots have different data & objectives but a similar solution
SC1: Pharmacology research
Life Sciences & Health
• Query a large number of datasets, some large
• Existing elaborate ingestion and homogenization by OpenPHACTS
• Extensive toolset developed by OPF and others
Objective: Large-scale heterogeneous pharma-research data linking & integration
SC1: Architecture & Components
• Replicate Open PHACTS functionality on the BDE infrastructure using OS solutions
• Based on Virtuoso, proprietary distributed database
• Apply to other domains (e.g. Agriculture)
• Porting to BDI gives flexibility and enables new functionalities
• Logging & system health monitoring
SC2: Viticulture resources
Food and Agriculture
Objective: Automate publication ingestion and thematic classification
• AgInfra is a major infrastructure for agriculture researchers, serving cross-linked bibliography, data, and processing services
SC2: Architecture & Components
• BDI deployed as an external infrastructure for processing text (viticulture publications)
• Storing and processing text at a larger scale than AgInfra can currently manage
SC3: Predictive maintenance
Energy
• Wind turbine monitoring applies computational models to sensor data streams
• Models are re-parameterized weekly using the week’s data from multiple turbines
Objective: Real-time turbine monitoring stream processing and analytics
SC3: Architecture & Components
• Existing in-house non-scalable solution for model parameterization
• Reliable Fortran software for data analysis
• Efficient, but not scalable to data volume
• Developing a BDI orchestrator
• Re-uses existing software unmodified
• Makes it easy to apply in parallel to many datasets and manage the outputs
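The orchestrator idea for SC3 (apply existing software, unmodified, to many datasets in parallel and collect the outputs) can be sketched as follows. This is a minimal illustration, not the pilot's implementation: the function names, the dataset file names, and the stand-in command are all hypothetical, and a small Python one-liner takes the place of the Fortran executable.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_analysis(command, dataset):
    """Run the existing, unmodified executable on one dataset and capture its output."""
    result = subprocess.run(
        command + [dataset],
        capture_output=True,
        text=True,
        check=True,  # raise if the external program exits with an error
    )
    return dataset, result.stdout.strip()


def orchestrate(command, datasets, max_workers=4):
    """Apply the same command to many datasets in parallel; return outputs keyed by dataset."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(lambda d: run_analysis(command, d), datasets))


if __name__ == "__main__":
    # Hypothetical stand-in for the legacy binary: echoes the dataset name it was given.
    stand_in = [sys.executable, "-c", "import sys; print('processed ' + sys.argv[1])"]
    outputs = orchestrate(stand_in, ["turbine_01.dat", "turbine_02.dat"])
    for name, text in outputs.items():
        print(name, "->", text)
```

Threads are sufficient here because the heavy work happens in the external processes; the orchestrator only waits for them and gathers their output.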
SC4: Traffic conditions estimation
Transport
• Combines:
• Traffic modelling from historical data
• Current measurements from a taxi fleet of 1200 vehicles
Objective: Estimation of real-time traffic conditions in Thessaloniki
SC4: Architecture & Components
• New Flink implementations of map matching and traffic prediction algorithms
• BDI provides access to varied data sources
• PostGIS database with city map
• ElasticSearch database of historical data
• Kafka stream of real-time data
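Map matching, in its simplest form, snaps each GPS fix to the nearest road segment. The sketch below shows only that basic idea in plain Python; it is not the pilot's Flink implementation, and the segment names and planar coordinates are assumptions made for illustration.

```python
import math


def project_onto_segment(point, start, end):
    """Return the closest point to `point` on the segment start-end (planar coordinates)."""
    px, py = point
    ax, ay = start
    bx, by = end
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return start  # degenerate segment: both ends coincide
    # Fraction along the segment, clamped so the result stays on the segment.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return (ax + t * dx, ay + t * dy)


def match_to_road(point, segments):
    """Pick the road segment nearest to a GPS fix; return (segment_id, snapped_point)."""
    if not segments:
        raise ValueError("at least one road segment is required")
    best = None
    for segment_id, (start, end) in segments.items():
        snapped = project_onto_segment(point, start, end)
        distance = math.dist(point, snapped)
        if best is None or distance < best[0]:
            best = (distance, segment_id, snapped)
    return best[1], best[2]


if __name__ == "__main__":
    # Hypothetical two-segment road network.
    roads = {"main_st": ((0, 0), (10, 0)), "side_st": ((0, 5), (0, 15))}
    print(match_to_road((4, 1), roads))
```

A production matcher would also use heading, speed, and the sequence of earlier fixes to choose between nearby segments; nearest-segment snapping is only the starting point.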
SC5: Climate modelling
Climate
• Preparing modelling experiments
• Slicing, transforming, combining datasets
• Submission and retrieval from modelling infrastructure
• Discovering and re-using previously computed derivatives
• Lineage annotation: computed derivatives from datasets and model parameters
• Finding appropriate past runs avoids repeating weeks-long modelling runs
Objective: Supporting data-intensive climate research
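One way to picture lineage-based re-use: derive a deterministic key from the input datasets and model parameters, and look that key up before submitting a new run. The sketch below is an assumption-laden illustration; `RunCatalogue`, the dataset identifiers, and the parameter names are hypothetical, and an in-memory dictionary stands in for the real metadata store.

```python
import hashlib
import json


def lineage_key(dataset_ids, parameters):
    """Build a deterministic key from the input datasets and model parameters."""
    canonical = json.dumps(
        {"datasets": sorted(dataset_ids), "parameters": parameters},
        sort_keys=True,  # key order must not affect the hash
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class RunCatalogue:
    """In-memory stand-in for a metadata store that records past modelling runs."""

    def __init__(self):
        self._runs = {}

    def record(self, dataset_ids, parameters, output_location):
        """Remember where the output of a finished run is stored."""
        self._runs[lineage_key(dataset_ids, parameters)] = output_location

    def find(self, dataset_ids, parameters):
        """Return the stored output location of an identical past run, or None."""
        return self._runs.get(lineage_key(dataset_ids, parameters))


if __name__ == "__main__":
    catalogue = RunCatalogue()
    catalogue.record(["era_2010", "sst_2010"], {"resolution_km": 9, "days": 30}, "runs/0001")
    # Same inputs in a different order resolve to the same past run.
    print(catalogue.find(["sst_2010", "era_2010"], {"days": 30, "resolution_km": 9}))
```

Exact-match lookup only finds identical runs; finding "appropriate" past runs in general would also need search over the recorded metadata.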
SC5: Architecture & Components
• BDI offers:
• Hive for managing data in a way that can be retrieved and manipulated, rather than file blocks
• Cassandra stores structured and textual metadata for searching headers and lineage
• Existing infrastructure; stable, reliable software for parallel computation of models
• BDI is deployed as an external infrastructure for preparing and managing datasets
SC6: Municipality budgets
Social Sciences
• Ingestion of budget and budget execution data
• Multiple municipalities in varied formats and data models
Objective: Homogenized budgetary data made available for analysis and comparison
SC6: Architecture & Components
• BDI deployed as ingestion and storage infrastructure for external tools
• Homogenizes variety of data (JSON, CSV, XML, etc.)
• Exposes data as SPARQL endpoint serving homogenized data
• Existing analytics and visualization tools
• Use SPARQL queries to retrieve only the relevant slices of the overall data
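The homogenization step for SC6 can be illustrated with a small sketch that reads budget records from JSON, CSV, and XML and coerces them into one shared shape. The field names (`municipality`, `year`, `amount`) and the XML layout are assumptions for the example, not the pilot's actual data model, and the sketch stops before any conversion to RDF.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET


def normalise(record):
    """Coerce one raw record into the shared (hypothetical) schema."""
    return {
        "municipality": record["municipality"].strip(),
        "year": int(record["year"]),
        "amount": float(record["amount"]),
    }


def from_json(text):
    """Parse a JSON array of budget records."""
    return [normalise(r) for r in json.loads(text)]


def from_csv(text):
    """Parse CSV with a header row naming the fields."""
    return [normalise(r) for r in csv.DictReader(io.StringIO(text))]


def from_xml(text):
    """Parse XML where each <entry> holds one child element per field."""
    root = ET.fromstring(text)
    return [
        normalise({child.tag: child.text for child in entry})
        for entry in root.findall("entry")
    ]


if __name__ == "__main__":
    json_text = '[{"municipality": "Athens", "year": 2016, "amount": 1500.5}]'
    csv_text = "municipality,year,amount\nAthens,2016,1500.5\n"
    xml_text = (
        "<budget><entry><municipality>Athens</municipality>"
        "<year>2016</year><amount>1500.5</amount></entry></budget>"
    )
    print(from_json(json_text) == from_csv(csv_text) == from_xml(xml_text))
```

Funnelling every format through a single `normalise` function keeps the type coercion in one place, so downstream tools see identical records regardless of the source format.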
SC7: Change detection & verification
Secure Societies
• Events are extracted from text published by news agencies and on social networking sites
• Events are geo-located and relevant changes are detected by comparing current and previous satellite images
Objective: Detect and Verify Events based on Satellite Imagery, News and Social Media
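As a rough illustration of comparing current and previous images around a geo-located event, the sketch below uses plain per-pixel differencing on small intensity grids. This is a deliberately simple stand-in: the pilot's actual change detection algorithms are not described here, and the threshold, radius, and pixel coordinates are hypothetical.

```python
def changed_pixels(previous, current, threshold):
    """Return (row, col) positions whose absolute intensity difference exceeds the threshold."""
    same_shape = len(previous) == len(current) and all(
        len(p) == len(c) for p, c in zip(previous, current)
    )
    if not same_shape:
        raise ValueError("images must have the same shape")
    return [
        (r, c)
        for r, (prev_row, cur_row) in enumerate(zip(previous, current))
        for c, (p, q) in enumerate(zip(prev_row, cur_row))
        if abs(q - p) > threshold
    ]


def change_near_event(changes, event_pixel, radius):
    """Check whether any changed pixel lies within `radius` pixels of a geo-located event."""
    er, ec = event_pixel
    return any((r - er) ** 2 + (c - ec) ** 2 <= radius ** 2 for r, c in changes)


if __name__ == "__main__":
    before = [[10, 10, 10], [10, 10, 10]]
    after = [[10, 10, 10], [10, 90, 10]]
    changes = changed_pixels(before, after, threshold=30)
    print(changes, change_near_event(changes, event_pixel=(0, 1), radius=1))
```

Real satellite imagery would first need co-registration and radiometric correction before any differencing is meaningful; those steps are omitted here.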
Event Detection
Change Detection
SC7: Architecture & Components
• Re-implementation of change detection algorithms for Spark
• Parallel orchestrator for text analytics
• Re-uses existing software
• Scales to many input streams
• BDI provides:
• Cassandra for text content and metadata
• Strabon GIS store for detected change location
• Homogeneous access to both for analysis and visualization
2nd round of Societal Workshops
Transport: 22 September 2016, Brussels (collocated with Big Data for Transport, Tisa workshop)
Food & Agri: 30 September 2016, Brussels (collocated with DG AGRI WP2018-20 stakeholder consultation)
Energy: 4 October 2016, Brussels (collocated with EC H2020 Info Day on “Smart Grids and Storage”)
Climate: 11 October 2016, Brussels (collocated with Melodies Project Event: Exploiting Open Data)
Security: 18 October 2016, Brussels (standalone workshop)
Societies: 5 December 2016, Cologne (collocated with EDDI16, 8th Annual European DDI User Conference)
Health: 9 December 2016, Brussels (standalone workshop)
Other Activities
Fresh set (7) of Societal Workshops in 2017
Various SC-focussed and general hangouts, follow!
o General (technical): 2 this year. More to follow!
o SC6: 2 so far, next in the next weeks
o Recordings & Presentations available online!
o Keep track on BDE Website (Events)
WEB: www.big-data-europe.eu EMAIL: [email protected]
BIG DATA INTEGRATOR www.github.com/big-data-europe
PROJECT COORDINATION (Fraunhofer IAIS)
Prof. Sören Auer, auer © cs.uni-bonn · de
Dr. Simon Scerri, scerri © cs.uni-bonn · de
EIS Department/Group, Fraunhofer IAIS & CS Department Uni-Bonn, Bonn, Germany
Questions & Contacts
#BigDataEurope
leads the Fraunhofer Big Data Alliance
Please follow our project via the website to receive event invites & the newsletter (SC6 only, or otherwise).