Intelligent ETL-Processes for Geo Data · 2019-04-23

Dr. Wassilios Kazakos, Head of Marketing & Business Development

Intelligent ETL-Processes

for Geo Data

Business focus:

• Data analytics & information delivery solutions with focus on geo-spatial data

• Implementation of Data Warehouses with spatial and non-spatial data

Product Cadenza:

• Our platform for geo-analytics & data discovery

• Several thousand users in Germany and Austria

Headquarters: Karlsruhe, Germany. 100 employees.

Company Profile

Project Context: WIRE (1)

WIRE: Intelligent Methods for Integration and Quality Assurance of Geo Data

• German SME research project, funded by BMBF, grant # 01IS16039

• Duration: 01/2017 – 12/2018

Application scenarios:

• Smart agriculture

• Environment monitoring / water management

Project Partners

• Disy Informationssysteme GmbH

• FZI, Research Centre for Information Technology

Project Context: WIRE (2)

Motivation

• Massive growth of big geo data from applications, which has to be fused in real time

• Data management and quality control needs more automation and tool support

Some methods and tools developed:

• Geospatial extension for Talend Data Integration workbench

• Machine-learning services for geo-data schema mapping

• Algorithms for detection of geo-data quality problems

• Automated corrections of geo-data errors

Today's topic

[Figure: data sources: Mobile Sensors, Social Media, Satellite Data. Image sources: Yara, DLR]

What is ETL?

„E“ Extract

„T“ Transform

„L“ Load

Filter, Analyse, Visualize

Integrated quality data

What is Geo-ETL?

„E“ Extract

„T“ Transform

„L“ Load

Filter, Analyse, Visualize

Integrated quality data

+ Geometric Operations

+ Validation and correction (geometric)

+ Combination of geodata and non-geodata

+ (Formats, Databases, Interfaces)
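
The Geo-ETL additions listed above (geometric validation and correction, combination of geodata and non-geodata) can be illustrated with a minimal sketch. It is not the Talend or Disy implementation; every function name, the sample records, and the `owners` lookup are hypothetical and exist only for illustration.

```python
# Hypothetical Geo-ETL sketch; names and sample data are illustrative only.

def extract():
    """Extract: return raw records (hard-coded here instead of read from a source)."""
    return [
        {"id": 1, "ring": [(0, 0), (4, 0), (4, 3), (0, 3), (0, 0)]},  # closed ring
        {"id": 2, "ring": [(0, 0), (2, 0), (2, 2), (0, 2)]},          # ring not closed
        {"id": 3, "ring": [(0, 0), (1, 1)]},                          # too few points
    ]

def close_ring(ring):
    """Correction: append the first point if the ring is not closed."""
    return ring if ring[0] == ring[-1] else ring + [ring[0]]

def is_valid_ring(ring):
    """Validation: a polygon ring needs at least 4 points and must be closed."""
    return len(ring) >= 4 and ring[0] == ring[-1]

def transform(records, owners):
    """Transform: correct, validate, then join with non-geo attributes."""
    rows = []
    for rec in records:
        ring = close_ring(rec["ring"])
        if not is_valid_ring(ring):
            continue  # reject geometries that cannot be repaired
        rows.append({
            "id": rec["id"],
            "ring": ring,
            "corrected": ring != rec["ring"],
            "owner": owners.get(rec["id"]),  # non-geodata joined by id
        })
    return rows

def load(rows, target):
    """Load: append the cleaned rows to the target store (a list here)."""
    target.extend(rows)
    return target

owners = {1: "Farm A", 2: "Farm B"}  # hypothetical non-geo table
warehouse = load(transform(extract(), owners), [])
```

In this sketch record 2 is repaired by closing its ring, while record 3 is dropped because it still has too few points after correction.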

Basis: Talend Data Integration Platform

Leading platform for Big Data and Cloud Data Integration

• User Interface for Job definition / code creation

• Hundreds of connectors, components and routines

• Repository management for Job-reuse and teamwork

• Monitoring and logging optimized for large installations

• Free and open-source, yet still powerful, entry version

Professional software for Big Data and Cloud Data Integration

BUT: no geodata and no geo-operations

GeoSpatial Integration for Talend

„Everything you need to extend ETL to Geo-ETL“

Extension of the Talend Palette

Data connectors for

• Oracle Locator/Spatial, PostGIS

• SpatiaLite, Shapefile, WKT, WKB

• GeoJSON
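
To make the relationship between two of the listed formats concrete, here is a minimal sketch that converts WKT text into a GeoJSON geometry. It is not the connector code of the plug-in: the function name `wkt_to_geojson` is hypothetical, and the sketch handles only 2D `POINT` and single-ring `POLYGON` geometries.

```python
import json
import re

def wkt_to_geojson(wkt):
    """Convert a 2D WKT POINT or single-ring POLYGON to a GeoJSON geometry dict."""
    m = re.fullmatch(r"\s*(POINT|POLYGON)\s*\((.*)\)\s*", wkt,
                     flags=re.IGNORECASE | re.DOTALL)
    if m is None:
        raise ValueError(f"unsupported WKT: {wkt!r}")
    kind, body = m.group(1).upper(), m.group(2)

    def coords(text):
        # "x y, x y, ..." -> [[x, y], [x, y], ...]
        return [[float(v) for v in pair.split()] for pair in text.split(",")]

    if kind == "POINT":
        return {"type": "Point", "coordinates": coords(body)[0]}

    ring = body.strip()
    if not (ring.startswith("(") and ring.endswith(")")) or "(" in ring[1:]:
        raise ValueError("only single-ring polygons are handled in this sketch")
    return {"type": "Polygon", "coordinates": [coords(ring[1:-1])]}

point = wkt_to_geojson("POINT (8.4 49.0)")
polygon = wkt_to_geojson("POLYGON ((0 0, 1 0, 1 1, 0 0))")
print(json.dumps(polygon))
```

A production connector would also cover multi-geometries, holes, Z/M coordinates and coordinate reference systems, all of which this sketch leaves out.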

Components for

• Length and area calculation

• Geometry transformation

• Geometry validation

• Buffer, bounding box, centroid, …
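
Several of the operations named in this list can be sketched in a few lines. The example below assumes planar (projected) coordinates and simple, closed polygon rings; the function names are hypothetical and do not reflect the components' actual interfaces.

```python
import math

def length(line):
    """Length of a polyline: sum of the segment lengths."""
    return sum(math.dist(p, q) for p, q in zip(line, line[1:]))

def signed_area(ring):
    """Shoelace formula; positive for counter-clockwise closed rings."""
    return sum(x1 * y2 - x2 * y1
               for (x1, y1), (x2, y2) in zip(ring, ring[1:])) / 2.0

def area(ring):
    """Unsigned area of a closed ring."""
    return abs(signed_area(ring))

def bounding_box(points):
    """Axis-aligned bounding box as (min_x, min_y, max_x, max_y)."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))

def centroid(ring):
    """Area-weighted centroid of a closed ring with non-zero area."""
    a = signed_area(ring)
    cx = cy = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        cross = x1 * y2 - x2 * y1
        cx += (x1 + x2) * cross
        cy += (y1 + y2) * cross
    return (cx / (6.0 * a), cy / (6.0 * a))

field = [(0, 0), (4, 0), (4, 3), (0, 3), (0, 0)]  # 4 x 3 rectangle
print(length(field), area(field), bounding_box(field), centroid(field))
```

For geographic (longitude/latitude) coordinates these planar formulas give distorted results, so the data would first need to be transformed into a suitable projected coordinate system.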

Routines

• More than 40 geometric operations

• Like the Visvalingam-Whyatt algorithm (simplification)
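
As an illustration of the simplification idea, the following is a minimal sketch of Visvalingam-Whyatt line simplification: the interior point that forms the smallest triangle with its two neighbours is removed repeatedly until every remaining triangle reaches a threshold. This is a simple quadratic-time variant written for readability, not the routine shipped with the plug-in; `simplify_vw` and `min_area` are hypothetical names.

```python
def triangle_area(a, b, c):
    """Area of the triangle spanned by three 2D points."""
    return abs((b[0] - a[0]) * (c[1] - a[1])
               - (c[0] - a[0]) * (b[1] - a[1])) / 2.0

def simplify_vw(points, min_area):
    """Drop the least significant interior point until all effective
    areas are at least min_area. The end points are always kept."""
    pts = list(points)
    while len(pts) > 2:
        # Effective area of every interior point with its current neighbours.
        areas = [triangle_area(pts[i - 1], pts[i], pts[i + 1])
                 for i in range(1, len(pts) - 1)]
        smallest = min(areas)
        if smallest >= min_area:
            break
        del pts[areas.index(smallest) + 1]  # +1: areas start at index 1
    return pts

line = [(0, 0), (1, 0.1), (2, 0), (3, 2), (4, 0)]
simplified = simplify_vw(line, min_area=0.5)
```

Here the nearly collinear point `(1, 0.1)` is removed, while the pronounced peak at `(3, 2)` survives. Efficient implementations typically keep the effective areas in a priority queue instead of recomputing them in every pass.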

GeoSpatial Integration for Talend

Use drag & drop with the new components

GeoSpatial components are fully integrated in the Talend ETL process

GeoSpatial Integration for Talend Real-time Big Data

• Use Big Data capabilities of the Talend platform for geodata processing

• Create and deploy Spark code with geo-routines on Hadoop/Spark clusters

Parallel execution in a Spark cluster to process Big Data streams coming from sensors, social media etc.

San Francisco Municipal Transportation Agency

• All vehicles send radio signals every minute

• Metro (MUNI), buses, car- and bicycle-sharing

• Data are stored in a Hadoop data lake

• Talend Big Data Platform + GeoSpatial Plugin for Talend

• Currently: mainly data analysis

• Vision: real-time data services for other departments, the public & companies

Case Study San Francisco

Contact

Dr. Wassilios Kazakos

Disy Informationssysteme GmbH

Tel. +49 721 16006-000

[email protected]

You can download and try the plug-in for free here: www.disy.net/spatial-etl

Intelligent ETL-Processes for Geo-Data

Thank you for your attention