Study: #Big Data in #Austria

Big Data – Challenges and Potentials

Mario Meir-Huber and Martin Köhler

European Data Economy Workshop, Semantics 2015, 15.09.2015, Vienna, Austria



Page 2: Study: #Big Data in #Austria

Study „#BigData in #Austria“

Project duration: 1.11.2013 – 30.04.2014

Project partners:
• IDC Central Europe GmbH
• AIT Austrian Institute of Technology, Mobility Department

Contact persons:
• Mario Meir-Huber, IDC (Teradata)
• Martin Köhler, AIT

Content:
• State of the art in Big Data
• Market analysis
• Best practice for Big Data projects

Download (in German):
• FFG „Studies of ICT of the future“: https://www.ffg.at/studien-aus-ikt-der-zukunft

#Big Data in #Austria was funded under the „ICT of the future“ funding programme of the Austrian Research Promotion Agency (FFG) and the Austrian Ministry for Transport, Innovation and Technology (BMVIT).

Page 3: Study: #Big Data in #Austria


Data-intensive science

© IDC. Visit us at IDC.com and follow us on Twitter: @IDC. Visit the project: http://bigdataaustria.wordpress.com

Enormous data archives are at hand

Various data sources

Often available in real-time

Investigating huge data volumes and driving research and industry

Science is moving increasingly from hypothesis-driven to data-driven discoveries

Correlation vs. Causality

Page 4: Study: #Big Data in #Austria

Big Data Definition


“Big Data” is a term encompassing the use of techniques to capture, process, analyse and visualize potentially large datasets in a reasonable timeframe not accessible to standard IT technologies. By extension, the platform, tools and software used for this purpose are collectively called “Big Data technologies”.
(NESSI White Paper, December 2012)

Four characteristics:
• Volume: In recent years the amount of generated data has increased enormously
• Velocity: Analysing more data in shorter time frames
• Variety: Huge diversity of data formats (arbitrary -> relational -> free text)
• Value: Extracting value (knowledge)

Hardware and software technologies for managing and analysing huge amounts of data

Or simply said: IF DATA IS PART OF THE PROBLEM

Page 5: Study: #Big Data in #Austria

Big Data Dimensions

• Legal dimension: copyright, privacy
• Social dimension: user behaviour, collaboration, social implications
• Economic dimension: business models, benchmarking, pricing
• Technological dimension: scalable data processing, signal processing, statistics, linguistics, HCI/visualization
• Application dimension: electronic archiving, decision support, industry solutions

Page 6: Study: #Big Data in #Austria

Big Data Technology Stack

[Figure: Big Data Technology Stack. Four layers with the keywords shown for each; axis labels: Data, Value.]

• Big Data Utilization: domain expertise, asking the right question, reporting & dashboards, alerting & recommendations, business intelligence, text analysis and search
• Big Data Analytics: data science, transform question to algorithm, machine learning, analysis, integration, query, performance, transform, warehousing
• Big Data Platforms: Hadoop ecosystem, data ingestion and processing, efficiency, trust, workload, governance, tools, platform, programming, parallel
• Big Data Management: data centers, scalable data storage, IaaS, cloud, virtualization, network, compute, storage, DBMS, NoSQL

Cross-cutting labels: management, security, privacy, governance

Page 7: Study: #Big Data in #Austria

Big Data Management

Technologies for the efficient management of huge amounts of data
• Storage and management of data
• Provisioning and management of the infrastructure

Cloud resources, (internal) data centers, storage

Page 8: Study: #Big Data in #Austria

Big Data Platforms

Technologies for (massively) parallel execution of data analytics on huge amounts of data
• Provisioning of parallelized and scalable execution systems
• Real-time integration of sensor data

Massively parallel programming: programming models for data-intensive applications (e.g. MapReduce)

High-level query languages: scripting languages and abstraction of low-level data-intensive query languages

Streaming: real-time processing of (sensor) data (which cannot be stored)

Ad-hoc queries: real-time access to huge amounts of data (query optimization, SQL vs. MapReduce)

Examples shown: Google Pregel, Apache Drill
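
The MapReduce programming model named on this slide can be illustrated with a small single-process sketch. This is not taken from the study; the function names (map_phase, shuffle, reduce_phase, word_count) and the word-count task are assumptions chosen for illustration, and a real platform would distribute the three phases across many machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values of one key into a single result."""
    return key, sum(values)


def word_count(documents):
    # Each document could be mapped on a different node; here it runs in a loop.
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = word_count(["big data in Austria", "Big Data platforms"])
```

Because map_phase looks at one document at a time and reduce_phase at one key at a time, both steps can in principle run in parallel, which is what makes the model attractive for data-intensive applications.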

Page 9: Study: #Big Data in #Austria

Big Data Analytics

Technologies for extracting information/knowledge from huge amounts of data
• Pattern recognition
• Pattern matching
• ...

Page 10: Study: #Big Data in #Austria

Big Data Utilization

Technologies for extracting value
• Strengthening the market situation of an organization
• Technologies for (simplified) utilization of data

Business Intelligence: provisioning of efficient indicators based on data (reporting, KPIs, audit, …)

Knowledge Management: management and representation of knowledge (ontologies, Linked Data, knowledge management systems)

Decision Support: supporting decision making; incorporates data management, modelling, innovative and interactive user interfaces

Visualization: interactive visualization of complex information and networks on different levels of abstraction (Visual Analytics)

Page 11: Study: #Big Data in #Austria

Traditional versus Data-intensive Approach

Traditional Approach
• Apply schema on write
• Heavily dependent on IT
• Steps: determine list of questions, design solution, collect structured data, ask questions from list, detect additional questions
• Single query engine: SQL

Hadoop Approach
• Apply schema on read
• Support range of access patterns to data stored in HDFS: polymorphic access
• Iterate over structure; transform and analyze
• Right engine, right job: batch, interactive, real-time, in-memory
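
The difference between applying a schema on write and on read can be sketched in a few lines. The example below is an illustration only: the sensor records, the REQUIRED_FIELDS schema and both function names are hypothetical and do not come from the study or from any Hadoop API.

```python
import json

# Hypothetical raw records, as they might arrive from different sources.
RAW_LINES = [
    '{"sensor": "a1", "temp": 21.5}',
    '{"sensor": "b7", "temp": "n/a", "humidity": 40}',
]

# Hypothetical fixed schema used by the traditional approach.
REQUIRED_FIELDS = {"sensor": str, "temp": float}


def write_with_schema(line, table):
    """Schema on write: validate before storing and reject what does not fit."""
    record = json.loads(line)
    for field, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(record.get(field), expected_type):
            return False
    table.append({field: record[field] for field in REQUIRED_FIELDS})
    return True


def read_with_schema(raw_store, field, cast):
    """Schema on read: keep everything raw and interpret it at query time."""
    for line in raw_store:
        value = json.loads(line).get(field)
        try:
            yield cast(value)
        except (TypeError, ValueError):
            continue  # this record does not fit the question being asked


table = []
accepted = [write_with_schema(line, table) for line in RAW_LINES]
temps = list(read_with_schema(RAW_LINES, "temp", float))
humidity = list(read_with_schema(RAW_LINES, "humidity", float))
```

With schema on write, the second record is rejected and its humidity value is lost. With schema on read, the raw store still holds it, so a question about humidity that nobody planned for can be answered later.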

Page 12: Study: #Big Data in #Austria

Technical and scientific challenges

Visual Analytics
• Combine the strengths of human and electronic data processing

Big Data Analytics
• Techniques making use of the complete data set, instead of sampling

Real-time analytics, (cross-)stream processing
• Expect real-time or near real-time responses from the systems

Content Validation
• Validating the vast amount of information in content networks, trust
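
As a rough illustration of the stream-processing challenge, the sketch below computes a running mean over the most recent values of a stream without storing the full history. The class name SlidingWindowMean and the sample values are assumptions made for this example; production systems would use a dedicated stream-processing engine.

```python
from collections import deque


class SlidingWindowMean:
    """Mean of the last `size` values; older values are discarded."""

    def __init__(self, size):
        self.window = deque(maxlen=size)
        self.total = 0.0

    def update(self, value):
        # When the window is full, the oldest value is about to be evicted.
        if len(self.window) == self.window.maxlen:
            self.total -= self.window[0]
        self.window.append(value)
        self.total += value
        return self.total / len(self.window)


monitor = SlidingWindowMean(size=3)
means = [monitor.update(v) for v in [10.0, 20.0, 30.0, 40.0]]
```

Each update does a constant amount of work and memory stays bounded by the window size, which is the property that allows near real-time responses on data that cannot be stored.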

[Figure labels: datacenters; distributed storage (IaaS, NoSQL); parallel stream processing, MapReduce extensions; use cases and enterprise services (scientific data, life sciences, business reporting)]

Page 13: Study: #Big Data in #Austria

Market analysis

State of the art in methods and tools
• ~50 Big Data toolkits

Analysis of Austrian market participants
• ~60 Austrian and international companies
• Industry analysis

Tertiary education
• Overview of Big Data topics in courses of study
• Research overview

Open data portals and data sets

Page 14: Study: #Big Data in #Austria

Global market
• IDC expects the global market to grow from USD 9.8 billion in 2012 to USD 32.4 billion in 2017
• Yearly growth rate: 27%

Austrian market 2013:
• ~ EUR 23 million
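
The global figures above are consistent with each other, which can be checked with the standard compound annual growth rate formula. The helper name cagr is an assumption for this sketch; the input values are the ones stated on the slide.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values `years` apart."""
    return (end_value / start_value) ** (1 / years) - 1


# Slide values: USD 9.8 billion (2012) to USD 32.4 billion (2017).
rate = cagr(9.8, 32.4, 2017 - 2012)

# Cross-check: compounding the stated 27% over five years from the 2012 value.
projected_2017 = 9.8 * 1.27 ** 5
```

The computed rate rounds to 27%, and compounding 27% over five years lands close to the forecast of USD 32.4 billion.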

Page 15: Study: #Big Data in #Austria

Code of practice for big data projects

Support and orientation for the implementation of big data projects

Reference projects
• Medicine
• Mobility
• Earth observation
• Crisis and disaster management
• Trade

Process model

Maturity model

Reference architecture

Page 16: Study: #Big Data in #Austria

Code of practice for big data projects

„We will soon have a huge skills shortage for data-related jobs.“
Neelie Kroes (ICT 2013, Nov. 7, Vilnius)

„Data Scientist: The Sexiest Job of the 21st Century“
http://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century/ar/1

Page 17: Study: #Big Data in #Austria

Code of practice for big data projects


Page 18: Study: #Big Data in #Austria

Recommendations and implications

„Data is a commodity – competence is the key“


Page 19: Study: #Big Data in #Austria

Added Value

Market Leadership

Location attractiveness

Enhance competences

Visibility

Objectives

Competence

Enable data access

Legislation

Provide infrastructure

Current status

Focus, create and provide competences

Secure competences for the long-term

Establish holistic institution

Establish (international) legal certainty

Establish general framework for data markets

Incentives for Open Data

Enhance funding for SMEs

Steps

Page 20: Study: #Big Data in #Austria

?

Mario Meir-Huber

Martin Köhler