DITAS project presentation v1.0
TRANSCRIPT
DITAS Project
Introduction
Context
Data-Intensive Applications
On Cloud
• Easy to implement scalable solutions
• Data management is entirely delegated to the cloud providers
• Robust solutions
• Latency could be significant
• No control over data management (security/privacy compliance checking)
On the Edge
• Total control over the data management
• Data latency reduced
• Higher investments
• No scalability
Mixing the two
Cloud vs. On premise, compared on:
• Scalability
• Reliability
• Latency
• Security
• Compliance
DITAS
DITAS Objectives
1. To improve the productivity of developers
2. To enhance data management in mixed cloud/fog environments
3. To improve the execution of data-intensive applications through data movement strategies
4. To provide an innovative execution environment for data-intensive applications
5. To maximize the impact on the market of developers and adopters of data-intensive applications
What is a DIA?
We call an application data-intensive if data is its primary challenge: the quantity of data, the complexity of data, or the speed at which it is changing [5]
Two perspectives:
Data production
Data consumption
Our position
Source [5]
Our case studies
Characteristic | Industry 4.0 | e-Health
Data size | Depending on the sensors attached to the machine, it generates 100 MB – 600 MB per day | Gigabytes if characters, terabytes if images
Complexity of analysis | TBD | Taxonomy problems and source heterogeneity increase the analysis complexity
Heterogeneity of data | Low | High
Number of data sources | Between 1 and 3 | Several, depending upon the domain; on average 5 to 7 sources
Distribution of data | Not distributed | Depending upon the localization of the data sources. For multicentric clinical studies the sources may be located in different, geographically distinct institutions
Timeliness of processing | Needs can vary from near real-time to hours depending on the service/application | No online queries are required, therefore the response time is not an issue for complex analysis
Our model
Store
Acquire
Consume
Dismiss
Help you with your daily data management tasks
Our model
Acquire
• Data sources
• Data format
• Transmission protocols
Our model
Store
• Where to store
• DFS with IoT
• Security/privacy
Our model
Consume
• Data analytics / operational processes
• Lambda/Kappa architecture
Our model
Dismiss
• Delete
• Freezing
Our model
Data movement is covered in all the phases: Acquire, Store, Consume, Dismiss
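The four phases of the model (Acquire, Store, Consume, Dismiss) can be read as a simple data lifecycle. The sketch below is only an illustration of that reading: the transition rules, the `DataItem` class and the `run_lifecycle` helper are assumptions made for the example and are not part of DITAS.

```python
from enum import Enum


class Phase(Enum):
    ACQUIRE = "acquire"
    STORE = "store"
    CONSUME = "consume"
    DISMISS = "dismiss"


# Assumed ordering (not stated on the slides): data is acquired, stored,
# consumed any number of times, and finally dismissed.
ALLOWED = {
    Phase.ACQUIRE: {Phase.STORE},
    Phase.STORE: {Phase.CONSUME, Phase.DISMISS},
    Phase.CONSUME: {Phase.CONSUME, Phase.DISMISS},
    Phase.DISMISS: set(),
}


class DataItem:
    """Hypothetical data item that tracks its lifecycle phase."""

    def __init__(self, name):
        self.name = name
        self.phase = Phase.ACQUIRE
        self.history = [Phase.ACQUIRE]

    def advance(self, target):
        # Reject transitions that the assumed lifecycle does not allow.
        if target not in ALLOWED[self.phase]:
            raise ValueError(
                f"cannot go from {self.phase.name} to {target.name}"
            )
        self.phase = target
        self.history.append(target)


def run_lifecycle(name):
    """Walk one item through all four phases and return the phase names."""
    item = DataItem(name)
    for phase in (Phase.STORE, Phase.CONSUME, Phase.DISMISS):
        item.advance(phase)
    return [p.name for p in item.history]
```

Calling `run_lifecycle("sensor-batch")` walks a single item through the four phases in order; trying to consume an item that has not been stored raises a `ValueError` under these assumed rules.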
About the results
Expected results
DITAS-SDK
To design and implement data-intensive applications
DITAS Execution Environment (DITAS-EE)
To run data-intensive applications where data movement works behind the scenes
Data and computation movement techniques
To satisfy the developer requirements
Strategies for application improvement
To ensure that data movement strategies are working properly
Solution at a glance
DITAS SDK
Device and cloud manager
Identity manager
Application deployment manager
DITAS Execution Environment (multiple instances)
DITAS as a service…
DITAS developer
PaaS (design) (execution)
• What the developer uses
• Supports the design of DIA
• Enables the data movement
IaaS
• Hides the complexity of the infrastructure
• Where the application is deployed
IaaS in DITAS
IaaS
Common Accessible Framework
Virtual access to data in the federated store
Data in the cloud store
PaaS
PaaS in DITAS (Design) - SDK
Data Utility definition
Data Security and Privacy
Virtual Data Container Definition
Application Profiling and Deployment strategies
PaaS
PaaS in DITAS (Execution Environment – DITAS-EE)
Data movement strategies
Data analytics
Execution Engine
Data movement enactor
Data monitoring
About the work plan
Milestones
MS1 (M6): General architecture design and case studies refinement
MS2 (M12): Detailed component architecture
MS3 (M18): First release
MS4 (M30): Final release and mature market analysis
MS5 (M36): Final validation, joint sustainability, and established sustainability body
WP1, WP2, WP3, WP4
WP1, WP2, WP3, WP4, WP5
WP2, WP3, WP4, WP5
WP5
Iteration 1, Iteration 2
How DITAS can help your workflow
Phase 1 – system knowledge
Device profiling
• How to access
• What it is able to do
Data profiling
• Which data are consumed/produced
• Characteristics of the data
IaaS
Phase 2 – development
• Developer defines the data-processing pipeline (process)
  • Data are seen as VDCs (Virtual Data Containers)
  • Agnostic of the platform (unless data does not exist)
• Developer defines the QoS
  • At which granularity (process/function)
DITAS SDK
IaaS
PaaS
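The development phase has the developer define a data-processing pipeline and attach QoS either to the whole process or to single functions. The following is a minimal sketch of that idea only. The `QoS`, `Step` and `Pipeline` classes and the `max_latency_ms` field are hypothetical names chosen for the example; they are not the DITAS SDK API, and the rule that function-level QoS overrides process-level QoS is an assumption.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class QoS:
    """Hypothetical QoS requirement; the single field is illustrative."""
    max_latency_ms: float


@dataclass
class Step:
    name: str
    func: Callable[[list], list]
    qos: Optional[QoS] = None  # function-level QoS, if any


@dataclass
class Pipeline:
    steps: List[Step] = field(default_factory=list)
    qos: Optional[QoS] = None  # process-level QoS, used as the default

    def effective_qos(self, step: Step) -> Optional[QoS]:
        # Assumption: function-level QoS overrides the process-level one.
        return step.qos if step.qos is not None else self.qos

    def run(self, data: list) -> list:
        # Apply each step in order; each step sees the previous output.
        for step in self.steps:
            data = step.func(data)
        return data


pipeline = Pipeline(
    steps=[
        Step("filter", lambda rows: [r for r in rows if r >= 0]),
        Step("scale", lambda rows: [r * 2 for r in rows], qos=QoS(50.0)),
    ],
    qos=QoS(500.0),
)
```

Here the "filter" step inherits the process-level requirement, while "scale" carries its own stricter one, which is one way to read the "process/function" granularity mentioned on the slide.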
Phase 3 – deployment
• Based on the system knowledge, functions are properly deployed
  • Balance between cost/benefits of deployment
• Questions:
  • Do all the nodes know the complete process?
  • Do we need to have a central node?
Application deployment manager
DITAS Execution Environment (multiple instances)
Phase 4 – execution (adaptive)
• When data processing is in place, data and computation can be moved
  • Move data among devices
  • Move computation among devices
• Monitoring
  • Centralized/distributed
Application deployment manager
DITAS Execution Environment (multiple instances)
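The choice between moving data and moving computation during adaptive execution can be illustrated with a toy cost comparison. This is a sketch under stated assumptions, not the DITAS data movement strategy: the cost model (transfer time plus run time), the function names and all parameters are hypothetical.

```python
def transfer_seconds(size_mb: float, bandwidth_mbps: float) -> float:
    """Time to send size_mb megabytes over a bandwidth_mbps megabit/s link."""
    return size_mb * 8 / bandwidth_mbps


def choose_movement(
    data_mb: float,
    code_mb: float,
    link_mbps: float,
    run_s_at_compute_node: float,
    run_s_at_data_node: float,
):
    """Compare two options and return (decision, estimated_seconds).

    Option A: ship the data to the compute node and run there.
    Option B: ship the computation to the node holding the data.
    """
    move_data = transfer_seconds(data_mb, link_mbps) + run_s_at_compute_node
    move_computation = (
        transfer_seconds(code_mb, link_mbps) + run_s_at_data_node
    )
    if move_data <= move_computation:
        return "move data", move_data
    return "move computation", move_computation


# Illustrative numbers only: 600 MB of data, a 5 MB function,
# a 100 Mbit/s link, and a slower device on the data side.
decision, seconds = choose_movement(600, 5, 100, 2.0, 10.0)
```

With these illustrative numbers, shipping the small function to the data is estimated to be cheaper than shipping 600 MB across the link, even though the data-side device runs it more slowly. A real strategy would also have to weigh the security, privacy and QoS constraints listed earlier in the deck.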