Near real-time big-data processing for data-driven applications
TRANSCRIPT

Jānis Kampars, Jānis Grabis
Institute of Information Technology
Riga Technical University, Riga, Latvia
Outline
– Background and objectives
– Conceptual model
– Architecture and technologies
– Sample use case
– Conclusion
Background
– Development of context-aware adaptive applications
– Context as business process execution driver
– FP7 project Capability Driven Development (CaaS)
Objective
To develop a platform for context-dependent adaptation of data-driven applications:
– Externalized context processing and adaptation logic
– Model-driven and horizontally scalable
Conceptual Model
[UML class diagram (AutoScale): classes Entity, Entity Relation, Context Provider, Measurable Property, Dimension, Value Schema, Archiving Specification, Context Element, Context Calculation, Context Element Range, Adjustment, and Adjustment Trigger, connected by relationships such as defines, relates, measures, uses, calculates, triggers, and takes value from.]
Key Elements
– Context elements: data affecting process execution
– Entities: data items characterizing the domain
– Context providers and measurable properties: capturing of the physical context
– Adjustments: adaptive actions taken in response to context change
Overview of Architecture

[Architecture diagram: the platform (CDP) consists of a Kafka proxy cluster (Proxy 1…Proxy N), a Kafka cluster (Kafka 1…Kafka N), a Spark cluster (Spark 1…Spark N) running the MP archiving job, the CE calculation job and the adjustment triggering job, a Cassandra cluster (Cassandra 1…Cassandra N), and the ASAPCS core + UI. Raw measurable-property (MP) data flows from the data-driven system through the Kafka proxies into Kafka; the Spark jobs aggregate and archive MP data, calculate context element (CE) data, and trigger adjustments; the adjustment engine then performs the triggered adjustment (Adjustment 1…Adjustment N) on the data-driven system.]
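The data flow through the three Spark jobs can be illustrated with a minimal Python sketch. This is not the actual Spark implementation; plain functions stand in for the jobs, and all names and thresholds are invented for illustration:

```python
# Hypothetical stand-ins for the three Spark jobs in the pipeline.
# A real deployment would consume Kafka topics and persist to Cassandra.

def archive_mp(raw_readings):
    """MP archiving job: aggregate raw measurable-property readings per disk."""
    grouped = {}
    for disk, value in raw_readings:
        grouped.setdefault(disk, []).append(value)
    return {disk: sum(vals) / len(vals) for disk, vals in grouped.items()}

def calculate_ce(aggregated, threshold=50):
    """CE calculation job: derive a context element (disk health) per disk."""
    return {disk: ("unhealthy" if avg > threshold else "healthy")
            for disk, avg in aggregated.items()}

def trigger_adjustments(context_elements):
    """Adjustment triggering job: emit an adjustment for each unhealthy disk."""
    return [("replicate-data", disk)
            for disk, state in context_elements.items() if state == "unhealthy"]

raw = [("disk-1", 10), ("disk-1", 20), ("disk-2", 80), ("disk-2", 90)]
adjustments = trigger_adjustments(calculate_ce(archive_mp(raw)))
print(adjustments)  # [('replicate-data', 'disk-2')]
```

The chain mirrors the numbered flow in the diagram: raw MP data is aggregated, turned into CE data, and finally into adjustment triggers for the adjustment engine.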
Spark integration
Spark jobs:
– Aggregation and archiving of measurable properties
– Context element calculation
– Adjustment triggering
Jobs are created according to the entity model.

[Diagram: the entity model drives the generation of computations (measurable properties, context elements and adjustment triggers), which are deployed in Docker containers.]
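The model-to-job generation step can be sketched as follows; the entity model structure and field names are invented for illustration only:

```python
# Hypothetical entity model fragment; all field names are invented.
entity_model = {
    "measurable_properties": ["write_errors", "read_errors", "temperature"],
    "context_elements": ["disk_health"],
    "adjustment_triggers": ["replicate_data"],
}

def generate_jobs(model):
    """Derive the three kinds of Spark job specs from the entity model."""
    jobs = []
    jobs += [{"type": "mp_archiving", "target": mp}
             for mp in model["measurable_properties"]]
    jobs += [{"type": "ce_calculation", "target": ce}
             for ce in model["context_elements"]]
    jobs += [{"type": "adjustment_triggering", "target": t}
             for t in model["adjustment_triggers"]]
    return jobs

print(len(generate_jobs(entity_model)))  # 5
```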
Docker Integration
Adjustments are placed and executed in dedicated Docker containers.

[Diagram: the entity model and the adjustment specification together define the Docker container for an adjustment.]
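An adjustment running inside its container has to react to the triggering context element. A minimal, hypothetical entrypoint might look like this; the JSON payload shape and action names are invented for illustration:

```python
import json

# Hypothetical entrypoint for an adjustment container: it receives the
# triggering context-element data as a JSON message and decides what to do.
def handle_trigger(payload_json):
    payload = json.loads(payload_json)
    if payload["context_element"] == "disk_health" and payload["value"] == "unhealthy":
        return {"action": "replicate-data", "target": payload["entity"]}
    return {"action": "none", "target": payload["entity"]}

msg = '{"context_element": "disk_health", "value": "unhealthy", "entity": "disk-2"}'
print(handle_trigger(msg))  # {'action': 'replicate-data', 'target': 'disk-2'}
```

Keeping this logic in its own container matches the externalization goal: adjustment logic can be updated and scaled independently of the data-driven application.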
Data storage problem
– Data is stored on disks located on data nodes
– Data centers belong to specific geographic regions
– Disk health is measured by write errors, read errors, temperature and bad sectors
– Data center region safety is measured by natural hazards, security incidents or terrorist attacks
Additional data replication is required to deal with these security risks.
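For this use case, the disk-health context element could combine the four measurable properties into one value. The following sketch is illustrative only; the weights and thresholds are invented, as in practice the calculation is defined in the entity model:

```python
# Illustrative context-element calculation for the data-storage use case.
# Weights and thresholds are invented for this example.
def disk_health(write_errors, read_errors, temperature, bad_sectors):
    """Combine the four measurable properties into a health classification."""
    score = write_errors * 2 + read_errors * 2 + bad_sectors * 5
    if temperature > 55:          # degrees Celsius
        score += 10
    return "unhealthy" if score >= 20 else "healthy"

print(disk_health(write_errors=1, read_errors=0, temperature=40, bad_sectors=0))  # healthy
print(disk_health(write_errors=3, read_errors=2, temperature=60, bad_sectors=1))  # unhealthy
```

An "unhealthy" value would then trigger the replication adjustment described above.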
Model-based Infrastructure Management
– Identification of context-dependent variations in the data-driven application
– Specification of potential context providers
– Definition of relevant entities and measurable properties
– Creation of context elements and their calculations
– Implementation of adjustments associated with the defined context elements
– Deployment of the solution
– Operation (context data integration and execution of adjustments)
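The specification steps of this process can be sketched as a single solution model fragment; every identifier below is invented for illustration, and the real specification lives in the platform's entity model:

```python
# Hypothetical solution specification; all identifiers are invented.
solution = {
    "context_providers": ["disk_monitoring_agent"],        # potential context providers
    "entities": ["disk"],                                   # relevant entities
    "measurable_properties": ["write_errors", "temperature"],
    "context_elements": {"disk_health": "score(write_errors, temperature)"},
    "adjustments": {"replicate_data": ["disk_health"]},     # adjustment -> triggering CEs
}

# A deployment-time sanity check: every adjustment must reference a
# defined context element.
def validate(spec):
    defined = set(spec["context_elements"])
    return all(ce in defined
               for ces in spec["adjustments"].values() for ce in ces)

print(validate(solution))  # True
```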