A Formalization of Complex Event Stream Processing
DESCRIPTION
Information systems in general, and business processes in particular, generate a wealth of information in the form of event traces or logs. The analysis of these logs, either offline or in real time, can be put to numerous uses: computing various statistics, or detecting anomalous patterns or violations of some form of contract. However, current solutions for Complex Event Processing (CEP) generally offer only a restricted set of predefined queries on traces, and otherwise require a user to write procedural code to compute custom queries. This presentation introduces a formal and declarative language for the manipulation of event traces.
TRANSCRIPT
A Formalization of Complex Event Stream Processing
SYLVAIN HALLÉ, SIMON VARVARESSOS
Context
The execution of information systems produces events
Events
An event is an element e taken from some set E, called the event type
Primitive types:
Booleans: $\mathbb{B}$
Numbers: $\mathbb{R}$ (e.g. 234, π)
Strings: $\mathbb{S}$ (e.g. abc)
Composite types:
Functions: $X \rightarrow Y$
Sets: $2^X$
A sample log
A file (or stream) of events
[10:24:31] INFO Game starts
[10:24:33] WARN Lemming into Blocker
...
[10:25:01] DEBG Lemming into Floater, id: 32, x: 320, y: 67 ; id: 31, x: 450, y: 43 ; id: 23, x: 229, y: 40 ; ...
...
Each event has one or more data elements
Actual (physical) format not relevant for us
Searching the log
Select AVG(closingPrice)
From ClosingStockPrices
Where stockSymbol = 'MSFT'
for (t = ST; t < ST+50; t += 5) {
  WindowIs(ClosingStockPrices, t - 4, t);
}
Problems
Formal languages (e.g. logic, automata) focus on event ordering; not so good at performing computations over events
Complex Event Processing often reduces to a thin layer over custom procedural code
Goal: provide a formal and non-procedural framework for the processing of event streams
Traces
An event trace (or event stream) is a potentially infinite sequence of events of a given type:
2 0 6 34 9 . . .
Traces are symbolically denoted by:
e = e0 e1 e2 e3 ...
The set of all traces of type T is denoted as:
T*
Processors
A processor is a function that takes 0 or more event traces as input, and returns 0 or 1 event trace as output
(Diagrams: a 1 : 1 processor and a 2 : 1 processor)
Composition
A high-level event trace can be produced by composing ("piping") together one or more processors from lower-level traces
Processor algebra
Goal: come up with a "toolbox" of basic processors sufficient to perform various computations over traces
A few useful functions
Identity function: returns an event if given one, or t if passed the empty event ε
$\iota_t(x) = \begin{cases} t & \text{if } x = \varepsilon \\ x & \text{otherwise} \end{cases}$
Wrap function: $+(x) = \{x\}$
Peel function: $-(\{x\}) = x$
Path function $/\pi$: returns the subtree at the end of path π
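The four functions above can be sketched in Python as follows. This is only an illustration under stated assumptions: the empty event ε is represented by `None`, events handled by the path function are assumed to be nested dictionaries, and π is assumed to be a `/`-separated list of keys. The names `identity`, `wrap`, `peel` and `path` are hypothetical and do not come from the presentation.

```python
from typing import Any, Callable

EMPTY = None  # assumption: stands in for the empty event ε


def identity(t: Any) -> Callable[[Any], Any]:
    """Build ι_t: returns x if x is an event, t if x is the empty event."""
    def iota(x: Any) -> Any:
        return t if x is EMPTY else x
    return iota


def wrap(x: Any) -> frozenset:
    """Wrap function: x -> {x}."""
    return frozenset([x])


def peel(s: frozenset) -> Any:
    """Peel function: {x} -> x (only defined on singleton sets)."""
    (x,) = s
    return x


def path(pi: str) -> Callable[[dict], Any]:
    """Path function /π: follow the keys of π and return the subtree reached."""
    keys = [k for k in pi.split("/") if k]

    def follow(event: dict) -> Any:
        node = event
        for k in keys:
            node = node[k]
        return node
    return follow
```

For instance, `path("lemming/x")` applied to `{"lemming": {"id": 32, "x": 320}}` returns `320`, and `peel(wrap(3))` returns `3`.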
Semantics
Processors can be defined formally by describing how their output trace is created from their input trace(s)
$e_0, \dots, e_n : \varphi(x_0, \dots, x_n)$
Input trace(s): $e_0, \dots, e_n$
Symbolic variables: $x_i$ refers to the i-th trace on the left
Constants as processors
Any element t of type T can be lifted as a 0 : 1 processor producing the infinite trace t t t ...
The constant processor t: $e : t = t\ t\ t \dots$
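A constant processor maps naturally onto an infinite iterator. The sketch below assumes traces are modelled as Python iterators; the name `constant` is hypothetical.

```python
from itertools import islice, repeat


def constant(t):
    """0:1 processor: takes no input trace and produces t t t ... forever."""
    return repeat(t)


# Only a finite prefix of the infinite trace can be displayed.
print(list(islice(constant(7), 4)))  # [7, 7, 7, 7]
```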
Input/output
0 : 1 processors can be used to produce an event trace out of an external source (e.g. standard input, a file, etc.)
Similarly, 1 : 0 processors can be used to send an event trace to an external destination
Mutator
Returns t, but only as many times as the number of events received so far
i.e. "mutates" input events into t
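Unlike the constant processor, the mutator is driven by its input: it emits one t per event received. A minimal sketch, assuming traces are Python iterables (the name `mutator` is hypothetical):

```python
def mutator(t, trace):
    """1:1 processor: replace every incoming event by t."""
    for _ in trace:
        yield t


print(list(mutator(0, [2, 1, 5])))  # [0, 0, 0]
```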
Functions as processors
Any n-ary function f defined on individual events can be lifted to an n : 1 processor on traces, by applying it successively to n-tuples
Example: + applied to the traces 2 0 6 ... and 3 8 1 ... adds the events found at matching positions (2 + 3 = 5, 0 + 8 = 8, 6 + 1 = 7)
$e_0, e_1 : x_0 + x_1 = e_{0,0} + e_{1,0},\ e_{0,1} + e_{1,1},\ e_{0,2} + e_{1,2},\ \dots$
where $e_{i,j}$ is the j-th event of trace $e_i$
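Lifting a function to traces amounts to applying it to the events found at the same position in each input trace. The sketch below assumes traces are Python iterables; `lift` is a hypothetical name.

```python
from operator import add


def lift(f, *traces):
    """n:1 processor: apply f to the n-tuple of events at each position."""
    for events in zip(*traces):
        yield f(*events)


print(list(lift(add, [2, 0, 6], [3, 8, 1])))  # [5, 8, 7]
```

Because `zip` stops at the shortest input, the output ends as soon as one of the input traces runs out of events.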
Freeze
Returns the first event received, upon every event received
Example: input a b b ... produces a a a ...
For an input trace e, the output trace is $e_0\ e_0\ e_0 \dots$
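A possible rendering of freeze, assuming traces are Python iterables (the name `freeze` is hypothetical):

```python
def freeze(trace):
    """1:1 processor: output the first event once per event received."""
    it = iter(trace)
    try:
        first = next(it)
    except StopIteration:
        return  # empty input trace: nothing to output
    yield first
    for _ in it:
        yield first


print(list(freeze("abb")))  # ['a', 'a', 'a']
```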
Delay
Returns the input trace, starting from its n-th event
For an input trace e, the output trace is $e_n\ e_{n+1}\ e_{n+2} \dots$
(Figure: the delay processor applied to the trace a b c ...)
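The delay can be sketched with `itertools.islice`, counting events from 0 as in the formula (so a delay of n skips the events numbered 0 to n-1). The name `delay` is hypothetical.

```python
from itertools import islice


def delay(n, trace):
    """1:1 processor: output e_n, e_(n+1), ... (events are numbered from 0)."""
    return islice(trace, n, None)


print(list(delay(2, "abcde")))  # ['c', 'd', 'e']
```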
Decimate
Returns every n-th event of the input trace
Example with n = 2: input a b c ... produces a c ...
$e : \Psi^n x = e_0\ e_n\ e_{2n} \dots$
$(e : \Psi^n x)_i = (e : x)_{ni}$
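Decimation keeps the events numbered 0, n, 2n, and so on. A minimal sketch using `itertools.islice` with a step (the name `decimate` is hypothetical):

```python
from itertools import islice


def decimate(n, trace):
    """1:1 processor: output e_0, e_n, e_2n, ..."""
    return islice(trace, 0, None, n)


print(list(decimate(2, "abcde")))  # ['a', 'c', 'e']
```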
COMPLEX PROCESSORS
Window
Simulates the application of a "sliding window" to a trace
Takes as arguments: another processor φ and a window width n
Returns the result of φ after processing events 0 to n-1...
Then the result of (a new instance of) φ that processes events 1 to n...
...and so on
Notation: $\Upsilon^n \varphi$
Window
Example: execution of the processor $\Upsilon^2\,{+}{+}$ (cumulative sum over a window of width 2) on the trace 2 1 5 0
Window 2 1: output 3
Window 1 5: output 6
Window 5 0: output 5
Output trace: 3 6 5
Window
The window processor can take any processor as an argument...
...i.e. the sliding window can be applied to anything.
Formally:
$(e : \Upsilon^n \varphi)_i = (e^i : \varphi)_{n-1}$
where $e^i$ is the trace e starting at its i-th event
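The sliding window can be sketched as follows. This is an illustration, not the prototype's implementation: a processor is assumed to be a Python function from an iterable to an iterator that outputs one event per input event, and a fresh instance of it is run on each window of n events. The names `window` and `cumulative_sum` are hypothetical.

```python
from collections import deque
from itertools import accumulate, islice


def window(phi, n, trace):
    """Run a new instance of phi on each window of n events, keep its n-th output."""
    buf = deque(maxlen=n)  # holds the n most recent events
    for event in trace:
        buf.append(event)
        if len(buf) == n:
            out = list(islice(phi(list(buf)), n))
            if len(out) == n:
                yield out[n - 1]


def cumulative_sum(trace):
    """Running sum of the events seen so far."""
    return accumulate(trace)


print(list(window(cumulative_sum, 2, [2, 1, 5, 0])))  # [3, 6, 5]
```

Nothing is output until the first n events have arrived, which matches the description of the first result being produced after events 0 to n-1.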
Filter
Discards events from an input trace based on a selection criterion
Takes as argument another processor φ
Evaluates φ on the trace that starts at event 0; returns that event if the first event returned by φ is ⊤
Same process on the trace that starts at event 1...
...and so on
Notation: $\Phi \varphi$
Filter
Example: execution of the processor $\Phi\,(\in 2\mathbb{N})$ (keep even numbers) on the trace 2 1 5 0
Output trace: 2 0
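Filter
The filter can take any processor as an argument...
...including a processor that requires multiple input events before outputting something
Formally:
$e : \Phi\varphi = \Phi(e, \varphi),\ e^1 : \Phi\varphi$
$\Phi(e, \varphi) = \begin{cases} e_0 & \text{if } (e : \varphi)_0 = \top \\ \text{no event} & \text{otherwise} \end{cases}$
where $e^1$ is the trace e without its first event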
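The filter's definition can be sketched directly: for each position, evaluate the condition on the trace starting there and keep the event if the first output is true. For simplicity this sketch assumes a finite trace held in memory, whereas a streaming version would need buffering. The names `filter_trace` and `is_even` are hypothetical.

```python
def filter_trace(phi, trace):
    """Keep event i if phi, run on the trace starting at i, first outputs True."""
    events = list(trace)  # assumption: finite trace
    for i, event in enumerate(events):
        first = next(iter(phi(events[i:])), None)
        if first is True:
            yield event


def is_even(trace):
    """1:1 processor outputting True for even numbers, False otherwise."""
    for x in trace:
        yield x % 2 == 0


print(list(filter_trace(is_even, [2, 1, 5, 0])))  # [2, 0]
```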
Spawn
Cumulative combination of a processor's output for every suffix of a trace
Creates one new instance of processor φ upon every new input event
Feeds each input event to all existing instances of φ
Combines the value returned by each instance using function f
...and outputs it
Notation: $\Sigma_f \varphi$
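Spawn
Example: execution of the processor $\Sigma_{+}\, x$ on the trace 2 1 5 0
One instance of x is created per input event: the first receives 2 1 5 0, the second 1 5 0, the third 5 0 ...
The values returned by the instances are combined with +
Output trace: 2 3 8 8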
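Spawn
Formally:
$e : \Sigma_f \varphi = (e^0 : \varphi)_0,\ f\big((e^0 : \varphi)_0, (e^1 : \varphi)_0\big),\ f\big((e^0 : \varphi)_0, (e^1 : \varphi)_0, (e^2 : \varphi)_0\big),\ \dots$
where $e^i$ is the trace e starting at its i-th event
Turns out to be a powerful device; depending on φ and f, can provide many useful processors...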
Spawn
Count events: $\Sigma_{+}\, 1 = \#$
Cumulative sum: $\Sigma_{+} = {+}{+}$
Set of all events: $\Sigma_{\cup}\, {+} = \cup$
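The three derived processors can be reproduced with a small sketch of spawn. It rests on assumptions that are not in the slides: the trace is finite, φ outputs one event per input event, and the value taken from each instance is its first output, which is the reading that gives 2 3 8 8 on the trace 2 1 5 0. The name `spawn` and the helper lambdas are hypothetical.

```python
def spawn(f, phi, trace):
    """Start one instance of phi per input event; combine their first outputs with f."""
    events = list(trace)  # assumption: finite trace
    firsts = []
    for i in range(len(events)):
        # Instance i processes the trace starting at event i.
        firsts.append(next(iter(phi(events[i:]))))
        yield f(firsts)


one = lambda tr: (1 for _ in tr)                     # constant 1, driven by input
wrap_each = lambda tr: (frozenset([x]) for x in tr)  # wrap function, lifted
union = lambda sets: frozenset().union(*sets)

print(list(spawn(sum, one, [2, 1, 5, 0])))   # count events: [1, 2, 3, 4]
print(list(spawn(sum, iter, [2, 1, 5, 0])))  # cumulative sum: [2, 3, 8, 8]
print(list(spawn(union, wrap_each, [2, 1]))) # sets of events seen so far
```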
Composition
These processors can be freely composed
Compute the statistical moment of order n
(Diagram: the input raised to the power n is summed by $\Sigma_{+}$, then divided (÷) by the event count $\Sigma_{+}\, 1 = \#$)
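One way to read this composition is as a running moment: the cumulative sum of the n-th powers divided by the number of events seen so far. The sketch below follows that reading and assumes traces are Python iterables; `moment` is a hypothetical name.

```python
from itertools import accumulate, tee


def moment(n, trace):
    """Running statistical moment of order n: sum of x^n divided by event count."""
    a, b = tee(trace)                       # duplicate the input trace
    sums = accumulate(x ** n for x in a)    # cumulative sum of the n-th powers
    counts = accumulate(1 for _ in b)       # event count
    for s, c in zip(sums, counts):
        yield s / c


print(list(moment(1, [2, 4])))  # [2.0, 3.0]
```

The explicit `tee` corresponds to the duplication of the input trace that the processor framework is said to handle implicitly.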
Composition
These processors can be freely composed
Return sum of two successive events, only if it is greater than 5
(Diagram: the cumulative sum ++ inside a window $\Upsilon^2$, followed by a filter Φ with condition > 5)
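This pipeline can be sketched with two generators piped together, assuming traces are Python iterables. The names `pair_sums` and `greater_than` are hypothetical, and the window of width 2 is written out by hand rather than through a generic window processor.

```python
def pair_sums(trace):
    """Sum of each pair of successive events (a sliding window of width 2)."""
    it = iter(trace)
    try:
        prev = next(it)
    except StopIteration:
        return
    for x in it:
        yield prev + x
        prev = x


def greater_than(threshold, trace):
    """Filter: keep only the events strictly greater than threshold."""
    return (s for s in trace if s > threshold)


# Sums of successive pairs are 3, 6, 5; only 6 exceeds 5.
print(list(greater_than(5, pair_sums([2, 1, 5, 0]))))  # [6]
```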
Linear Temporal Logic
Operators for Linear Temporal Logic can also be defined
Eventually: $\Sigma_{\vee}(\varphi) = \mathbf{F}\,\varphi$
Next: $\mathbf{X}\,\varphi$
...etc (see the paper)
All together now
Count pairs of successive events that are more than one standard deviation from the mean
(Diagram, built up over several slides, combining the processors E(X), −, σ, ÷, > 1, X, ∧, Φ and #)
Advantages
No imperative constructs
No restrictions on what can be piped to what (modulo type compatibility)
Streaming operation: outputs produced as inputs are being consumed
Implicit handling of buffering, duplication, etc.
Demo!
Prototype implementation in Java
In this example, handles 100 events/sec.
Go see it on YouTube: http://goo.gl/QoS8Dy
GAME OVER
YES NO
QUESTIONS ?