Parallel Complex Event Processing Karol Grzegorczyk 03-06-2013

Upload: karol-grzegorczyk

Post on 29-Jan-2018


TRANSCRIPT

Page 1: Parallel Complex Event Processing

Parallel Complex Event Processing

Karol Grzegorczyk, 03-06-2013

Page 2: Parallel Complex Event Processing

Big Data classification

[http://en.wikipedia.org/wiki/File:3_states_of_data.jpg]

Page 3: Parallel Complex Event Processing

Event-driven architecture

Page 4: Parallel Complex Event Processing

Complex Event Processing solutions

● Open Source:

– Esper

– Drools Fusion

– Storm

– WSO2 Complex Event Processor

● Proprietary software:

– Oracle Complex Event Processing

– StreamBase Complex Event Processing

– Informatica RulePoint

– TIBCO Complex Event Processing

Page 5: Parallel Complex Event Processing

Esper

● Two editions:

― Open source library

― Enterprise server based on Jetty

● The core component of Esper is a CEP engine.

● The CEP engine works like a database turned upside down: instead of storing data and running queries against it, it stores queries and runs the data through them.

● Expressions are defined in the Event Processing Language (EPL)

― Declarative domain-specific language

― Similar to SQL, but it uses views rather than tables and events instead of records (rows)

― Views are reused among EPL statements for efficiency!

select * from OrderEvent.win:length(5)

Page 6: Parallel Complex Event Processing

Streams

● A complex event can be built from several data streams.

select * from AlertEvent as a, NewsEvent as n

where a.symbol = n.symbol

● Esper defines two types of data streams:

― Filter-based event stream

select * from OrderEvent(itemType='shirt')

― Pattern-based event stream

select * from pattern [

OrderEvent(itemType='shirt') -> OrderEvent(itemType='trousers')]

● It is possible to join filter-based and pattern-based streams!

● Events can be forwarded to other streams using the INSERT INTO keywords.

● It is also possible to update an event (using the UPDATE keyword) before it is applied to any selecting statements
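The followed-by operator (->) in the pattern example above can be illustrated outside Esper. This is a minimal plain-Python sketch, not Esper code; the function name and the dictionary-shaped events are assumptions made for illustration. It mirrors a pattern written without the every keyword: wait for the first matching shirt order, then for the first trousers order after it, then stop.

```python
def first_followed_by(events, is_first, is_second):
    """Return the first (a, b) pair where b arrives after a, or None.

    Hypothetical helper: it waits for the first event accepted by
    is_first, then for the first later event accepted by is_second.
    """
    first = None
    for event in events:
        if first is None:
            if is_first(event):
                first = event
        elif is_second(event):
            return first, event
    return None


if __name__ == "__main__":
    orders = [
        {"id": 1, "itemType": "trousers"},
        {"id": 2, "itemType": "shirt"},
        {"id": 3, "itemType": "trousers"},
    ]
    match = first_followed_by(
        orders,
        lambda e: e["itemType"] == "shirt",
        lambda e: e["itemType"] == "trousers",
    )
    print(match)
```

The trousers order that arrives before any shirt order is ignored, because the second half of the pattern only starts listening once the first half has matched.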

Page 7: Parallel Complex Event Processing

Views

● Events are derived from streams (both filter- and pattern-based) by views

● The default view encloses all events that have arrived on the stream since the statement was added to the engine.

● View types:

– Data windows (e.g. length, time)

– Named windows

– Extension views (sorted window, ranked window, time-order view)

– Standard views (unique, grouped, size, lastevent)

– Statistics view (univariate, regression, correlation)

Page 8: Parallel Complex Event Processing

Esper processing

● Update listeners and subscriber objects are associated with EPL statements

● By default, listeners and subscribers are notified when a new event that matches the EPL query arrives (insert stream)

● In addition, listeners and subscribers can be notified when an event that matches the EPL query is removed from the stream because it falls outside the limit of the window in use (remove stream)
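The insert and remove streams can be pictured with a length window. The sketch below is plain Python rather than the Esper listener API; the class name LengthWindow and its push method are hypothetical. Each arrival reports the new event and any event that the arrival pushed out of the window.

```python
from collections import deque


class LengthWindow:
    """Keeps the last `size` events, in the spirit of win:length(size)."""

    def __init__(self, size):
        self.size = size
        self.events = deque()

    def push(self, event):
        """Add an event; return (new_events, old_events) for this arrival."""
        self.events.append(event)
        removed = []
        if len(self.events) > self.size:
            # The oldest event leaves the window: this is the remove stream.
            removed.append(self.events.popleft())
        return [event], removed


if __name__ == "__main__":
    window = LengthWindow(size=2)
    for order_id in ("o1", "o2", "o3"):
        new, old = window.push(order_id)
        print("insert stream:", new, "remove stream:", old)
```

A listener that only cares about arrivals would read the first list; one that also tracks expiry would read the second.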

Page 9: Parallel Complex Event Processing

[Esper Reference]

Page 10: Parallel Complex Event Processing

Filtering

Esper provides two types of filtering:

● Stream-level filtering

select * from OrderEvent(type= 'shirt')

● Post-data-window filtering

select * from OrderEvent where type = 'shirt'

Page 11: Parallel Complex Event Processing

[Esper Reference]

Page 12: Parallel Complex Event Processing

[Esper Reference]

Page 13: Parallel Complex Event Processing

Stream-level filtering vs post-data-window filtering

select * from OrderEvent(type= 'shirt')

vs

select * from OrderEvent where type = 'shirt'

The first form is preferred, but sometimes post-data-window filtering is what is needed:

Take the last one hundred orders and calculate the average price of the trousers among them.

select avg(price) from OrderEvent.win:length(100) where type = 'trousers'
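The two filter positions give different answers on the same data. A small plain-Python sketch of that difference, assuming dictionary-shaped events and hypothetical function names (this is an illustration of the semantics, not Esper code):

```python
from collections import deque


def avg_stream_level(events, item_type, size):
    """Filter first, then window: the last `size` events of the given type."""
    window = deque(maxlen=size)
    for event in events:
        if event["type"] == item_type:
            window.append(event["price"])
    return sum(window) / len(window) if window else None


def avg_post_window(events, item_type, size):
    """Window first, then filter: matching events among the last `size` of all."""
    window = deque(events, maxlen=size)
    prices = [e["price"] for e in window if e["type"] == item_type]
    return sum(prices) / len(prices) if prices else None


if __name__ == "__main__":
    orders = [
        {"type": "trousers", "price": 10},
        {"type": "shirt", "price": 5},
        {"type": "trousers", "price": 20},
        {"type": "shirt", "price": 7},
        {"type": "trousers", "price": 30},
    ]
    print(avg_stream_level(orders, "trousers", 3))
    print(avg_post_window(orders, "trousers", 3))
```

With stream-level filtering the window holds only trousers orders; with post-data-window filtering the window holds orders of every type and the filter is applied to whatever is currently inside it.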

Page 14: Parallel Complex Event Processing

Data Windows

● Basic windows:

― Length window (win:length)

― Length batch window (win:length_batch)

― Time window (win:time)

― Time batch window (win:time_batch)

● Advanced time windows

― Externally-timed window (win:ext_timed)

― Externally-timed batch window (win:ext_timed_batch)

― Time-Length combination batch window (win:time_length_batch)

― Time-Accumulating window (win:time_accum)

― Keep-All window (win:keepall)

― First Length (win:firstlength)

― First Time (win:firsttime)
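To make the sliding versus batch distinction concrete, here is a plain-Python sketch of a time window and a length batch window. Class and method names are hypothetical, timestamps are passed in by the caller (closer in spirit to the externally-timed window), and the exact boundary handling is an assumption rather than Esper's documented behaviour.

```python
from collections import deque


class TimeWindow:
    """Sliding window keeping events younger than `span` seconds."""

    def __init__(self, span):
        self.span = span
        self.events = deque()  # (timestamp, event) pairs, oldest first

    def push(self, timestamp, event):
        """Add an event and return the events that expired as a result."""
        self.events.append((timestamp, event))
        expired = []
        # Assumption: an event expires once it is at least `span` old.
        while self.events and self.events[0][0] <= timestamp - self.span:
            expired.append(self.events.popleft()[1])
        return expired


class LengthBatchWindow:
    """Collects events and releases them in batches of `size`."""

    def __init__(self, size):
        self.size = size
        self.pending = []

    def push(self, event):
        """Return a full batch when one is complete, otherwise None."""
        self.pending.append(event)
        if len(self.pending) == self.size:
            batch, self.pending = self.pending, []
            return batch
        return None
```

A sliding window reacts to every arrival, while a batch window stays silent until a whole batch is ready.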

Page 15: Parallel Complex Event Processing

[Esper Reference]

Page 16: Parallel Complex Event Processing

[Esper Reference]

Page 17: Parallel Complex Event Processing

Scaling Esper

● According to the documentation, Esper exceeds 500 000 events/s on dual-CPU 2 GHz Intel-based hardware, with average engine latency below 3 microseconds (below 10 microseconds with more than 99% predictability), on a VWAP benchmark with 1000 statements registered in the system. This tops out at 70 Mbit/s at 85% CPU usage.

● Parallel processing

– Within one machine

- Context partitions

– With multiple machines

- Partitioned stream

- Partition by use case

Page 18: Parallel Complex Event Processing

Context

● Context partition – basic level for locking

● By default single context partition

● Context types:

― Keyed Segmented

― Hash Segmented

― Category Segmented

― Non-overlapping context

― Overlapping context

● Nesting context

Page 19: Parallel Complex Event Processing

Keyed Segmented Context

create context ByCustomerAndAccount

partition by custId and account from BankTxn

context ByCustomerAndAccount

select custId, account, sum(amount) from BankTxn

Implicit grouping in the select statement.
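The implicit grouping can be pictured as one independent aggregate per key combination. A minimal plain-Python sketch, assuming dictionary-shaped events; the class KeyedSegmentedSum is hypothetical and stands in for the engine's per-partition state, it is not Esper code.

```python
from collections import defaultdict


class KeyedSegmentedSum:
    """One running sum per distinct key, like one context partition per key."""

    def __init__(self, key_fields, value_field):
        self.key_fields = key_fields
        self.value_field = value_field
        self.partitions = defaultdict(float)

    def on_event(self, event):
        """Route the event to its partition and return (key, running_sum)."""
        key = tuple(event[field] for field in self.key_fields)
        self.partitions[key] += event[self.value_field]
        return key, self.partitions[key]


if __name__ == "__main__":
    ctx = KeyedSegmentedSum(("custId", "account"), "amount")
    for txn in (
        {"custId": "c1", "account": "a1", "amount": 100},
        {"custId": "c1", "account": "a2", "amount": 20},
        {"custId": "c1", "account": "a1", "amount": 50},
    ):
        print(ctx.on_event(txn))
```

Because every (custId, account) pair has its own partition, the sum never mixes customers or accounts even though the statement has no group by clause.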

Page 20: Parallel Complex Event Processing

Hash Segmented Context

Assigns events to context partitions based on result of a hash function and modulo operation

create context SegmentedByCustomerHash

coalesce by hash_code(custId) from BankTxn granularity 16 preallocate

context SegmentedByCustomerHash

select custId, account, sum(amount) from BankTxn group by custId, account

No implicit grouping in the select statement!
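Why the explicit group by is needed becomes visible once several customers hash into the same partition. The sketch below is plain Python, not Esper code: it uses CRC32 from the standard library as a stand-in hash (an assumption; the statement above uses hash_code), and the function names are hypothetical.

```python
import zlib
from collections import defaultdict


def partition_for(cust_id, granularity=16):
    """Hash the key and take it modulo the number of partitions."""
    return zlib.crc32(str(cust_id).encode("utf-8")) % granularity


def sums_by_partition(events, granularity=16):
    """Return {partition: {(custId, account): sum(amount)}}."""
    partitions = defaultdict(lambda: defaultdict(float))
    for event in events:
        partition = partition_for(event["custId"], granularity)
        # A partition may hold many customers, so the grouping is explicit.
        group = (event["custId"], event["account"])
        partitions[partition][group] += event["amount"]
    return {p: dict(groups) for p, groups in partitions.items()}
```

With a granularity of 16 there are at most 16 partitions regardless of how many customers exist, which bounds the number of locks while still spreading the work.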

Page 21: Parallel Complex Event Processing

Category Segmented Context

Assigns events to context partitions based on the values of one or more event properties, using a predicate expression(s) to define context partition membership.

create context CategoryByTemp

group temp < 65 as cold,

group temp between 65 and 85 as normal,

group temp > 85 as large

from SensorEvent

context CategoryByTemp

select context.label, count(*) from SensorEvent
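The category assignment above amounts to evaluating the predicates in turn and counting per label. A plain-Python sketch under that reading (function names are hypothetical, and between is taken to include both bounds):

```python
from collections import Counter


def temp_label(temp):
    """Map a temperature to the label of the category it falls into."""
    if temp < 65:
        return "cold"
    if 65 <= temp <= 85:
        return "normal"
    return "large"  # temp > 85, label taken from the statement above


def count_by_label(temps):
    """Count readings per category, like count(*) per context.label."""
    return Counter(temp_label(t) for t in temps)


if __name__ == "__main__":
    print(count_by_label([60, 65, 85, 90, 70]))
```

The three predicates here do not overlap, so each reading lands in exactly one category.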

Page 22: Parallel Complex Event Processing

Non-overlapping context

A non-overlapping context is created when the start condition is met and ended when the end condition is met. There is always either one or zero context partitions.

create context NineToFive start (0, 9, *, *, *) end (0, 17, *, *, *)

context NineToFive select * from TrafficEvent(speed >= 100)
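The effect of the NineToFive context on the statement can be sketched in plain Python as a time-of-day gate in front of the filter. The function names and the "at" field are hypothetical, and treating 17:00 itself as already outside the context is an assumption.

```python
from datetime import datetime, time


def nine_to_five_active(moment):
    """True while the single context partition exists (09:00 up to 17:00)."""
    return time(9, 0) <= moment.time() < time(17, 0)


def fast_traffic_in_context(events):
    """Keep events with speed >= 100 that arrive while the context is active."""
    return [
        e for e in events
        if nine_to_five_active(e["at"]) and e["speed"] >= 100
    ]


if __name__ == "__main__":
    sample = [
        {"at": datetime(2013, 6, 3, 8, 30), "speed": 120},
        {"at": datetime(2013, 6, 3, 10, 0), "speed": 120},
        {"at": datetime(2013, 6, 3, 11, 0), "speed": 80},
    ]
    print(fast_traffic_in_context(sample))
```

Outside the start and end conditions there is no context partition, so matching events are simply not processed by the statement.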

Page 23: Parallel Complex Event Processing

Overlapping context

This context initiates a new context partition when an initiating condition occurs, and terminates one or more context partitions when the terminating condition occurs.

create context CtxTrainEnter

initiated by TrainEnterEvent as te

terminated after 5 minutes

context CtxTrainEnter

select t1 from pattern [

t1=TrainEnterEvent -> timer:interval(5 min) and not TrainLeaveEvent(trainId = context.te.trainId)]
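Reading the example as "report trains that enter and do not leave within 5 minutes", the overlapping partitions can be sketched in plain Python. This is an interpretation, not Esper code: the function name, the tuple-shaped events and the use of event timestamps as the only clock are all assumptions.

```python
def trains_not_left(events, timeout=300):
    """Return ids of trains with no leave event within `timeout` seconds.

    `events` is a time-ordered list of (timestamp_seconds, kind, train_id),
    where kind is "enter" or "leave". Time only advances when an event
    arrives, so partitions still open after the last event are not reported.
    """
    open_partitions = {}  # train_id -> enter timestamp; several can overlap
    late = []
    for timestamp, kind, train_id in events:
        # Terminate partitions whose 5 minutes passed without a leave event.
        for tid, entered in list(open_partitions.items()):
            if timestamp - entered >= timeout:
                late.append(tid)
                del open_partitions[tid]
        if kind == "enter":
            open_partitions[train_id] = timestamp  # initiate a new partition
        elif kind == "leave":
            open_partitions.pop(train_id, None)
    return late


if __name__ == "__main__":
    log = [
        (0, "enter", "A"),
        (60, "enter", "B"),
        (120, "leave", "A"),
        (400, "enter", "C"),
    ]
    print(trains_not_left(log))
```

Between seconds 60 and 120 the partitions for trains A and B are open at the same time, which is what distinguishes this context type from the non-overlapping one.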

Page 24: Parallel Complex Event Processing

Context nesting

With nested contexts, the context declared first controls the lifecycle of the context(s) declared thereafter.

create context NineToFiveSegmented

context NineToFive start (0, 9, *, *, *) end (0, 17, *, *, *),

context SegmentedByCustomer partition by custId from BankTxn

context NineToFiveSegmented

select custId, account, sum(amount) from BankTxn group by account

Page 25: Parallel Complex Event Processing

Partitioning without context declaration

Grouped data window std:groupwin()

What is the difference between:

select avg(price) from OrderEvent.std:groupwin(itemType).win:length(10)

And

select avg(price) from OrderEvent.win:length(10) group by itemType

?
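One way to see the difference: with std:groupwin each itemType gets its own window of 10 events, whereas with group by there is a single window of the last 10 events of any type, and the average is computed per type over whatever that window holds. A plain-Python sketch of both, with hypothetical function names and events given as (itemType, price) tuples:

```python
from collections import defaultdict, deque


def avg_groupwin(events, size):
    """One length window per itemType: last `size` events of each type."""
    windows = defaultdict(lambda: deque(maxlen=size))
    for item_type, price in events:
        windows[item_type].append(price)
    return {t: sum(w) / len(w) for t, w in windows.items()}


def avg_group_by(events, size):
    """One shared length window: last `size` events overall, averaged by type."""
    window = deque(events, maxlen=size)
    groups = defaultdict(list)
    for item_type, price in window:
        groups[item_type].append(price)
    return {t: sum(p) / len(p) for t, p in groups.items()}


if __name__ == "__main__":
    orders = [("shirt", 10), ("shirt", 20), ("trousers", 40), ("shirt", 30)]
    print(avg_groupwin(orders, 2))
    print(avg_group_by(orders, 2))
```

In the second form a rarely ordered item type can drop out of the result entirely once its events have left the shared window.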

Page 26: Parallel Complex Event Processing

Parallel processing on multiple machines

● Partitioned stream

● Partition by use case
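The two strategies can be sketched as two routing functions placed in front of the engines. This is a plain-Python illustration under assumptions: node names, the "type" field, CRC32 as the hash and both function names are hypothetical, and how events actually reach the machines is left out.

```python
import zlib


def route_by_key(event, key_field, nodes):
    """Partitioned stream: events with the same key always reach the same node."""
    index = zlib.crc32(str(event[key_field]).encode("utf-8")) % len(nodes)
    return nodes[index]


def route_by_use_case(event, node_by_event_type):
    """Partition by use case: each node hosts the statements for some event types."""
    return node_by_event_type[event["type"]]


if __name__ == "__main__":
    nodes = ["node-a", "node-b"]
    txn = {"type": "BankTxn", "custId": "c1"}
    print(route_by_key(txn, "custId", nodes))
    print(route_by_use_case(txn, {"BankTxn": "node-a", "SensorEvent": "node-b"}))
```

Partitioning the stream suits statements whose state is local to one key; partitioning by use case suits statements that need to see every event of a given type.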

Page 27: Parallel Complex Event Processing

[Esper Enterprise Edition Reference]

Page 28: Parallel Complex Event Processing

Thank you