The Raindrop Engine:Continuous Query Processing
Elke A. Rundensteiner
Database Systems Research Lab, WPI
2003
2
Monitoring Applications
Monitor troop movements during combat and warn when soldiers veer off course
Send alert when patient’s vital signs begin to deteriorate
Monitor incoming news feeds to see stories on “Iraq”
Scour network traffic logs looking for intruders
3
Properties of Monitoring Applications
Queries and monitors run continuously, possibly unending.
Applications have varying service preferences:
  Patient monitoring wants only the freshest data
  Remote sensors have limited memory
  A news service wants maximal throughput
Taking 60 seconds to process vital signs and sound an alert may be too long.
4
Properties of Streaming Data
Possibly never-ending stream of data
Unpredictable arrival patterns:
  Network congestion
  Weather (for external sensors)
  Sensor moves out of range
5
DBMS Approach to Continuous Queries
Insert each new tuple into the database and encode queries as triggers [MWA03]
Problems:
  High overhead for inserts [CCC02]
  Triggers do not scale well [CCC02]
  Uses static optimization and execution strategies that cannot adapt to unpredictable streams
  System is under-utilized if data streams arrive slowly
  No means to input application service requirements
6
New Class of Query Systems
CQ systems emerged recently (Aurora, STREAM, NiagaraCQ, Telegraph, et al.)
They generally work as follows:
  The system subscribes to some streams
  End users issue continuous queries against the streams
  The system returns the results to the user as a stream
All CQ systems use some adaptive techniques to cope with unpredictable streams.
7
Overview of Adaptive Techniques in CQ Systems
Research Work — Technique(s) — Goal
Aurora [CCC02] — Load shedding; batch tuple processing to reduce context switching — Maintain high quality of service
STREAM [MWA03] — Adaptive scheduling algorithm (Chain) [BBM03]; load shedding — Minimize memory requirements during periods of bursty arrival
NiagaraCQ [CDT00] — Generate near-optimal query plans for multiple queries — Efficiently share computation between multiple queries; highly scalable system; maximize output rate
Eddies [AH00] (Telegraph) — Dynamically route tuples among joins — Keep the system constantly busy; improve throughput
XJoin [UF00] — Break the join into 3 stages and make use of memory and disk storage — Keep the join and system running at full capacity at all times
[UF01] — Schedule the streams with the highest rate — Maximize throughput to clients
Tukwila [IFF99] [UF98] — Reorganize query plans on the fly by using synchronization packets to tell operators to finish up their current work — Improve ill-performing query plans
8
CAPE Runtime Engine
Components: Query Plan Generator, Operator Configurator, QoS Inspector, Operator Scheduler, Plan Migrator, Execution Engine, Storage Manager, Stream Receiver, Distribution Manager, Stream / Query Registration, GUI.
Stream providers feed data into the engine; end users register queries and receive result streams.
The WPI Stream Project: Raindrop
9
Topics Studied in the Raindrop Project:
  Bringing XML into a stream engine
  Scalable query operators (punctuations)
  Cooperative plan optimization
  Adaptive operator scheduling
  On-line query plan migration
  Distributed plan execution
PART I: XQueries on XML Streams (Automaton Meets Algebra)
Based on CIKM’03
Joint work with Hong Su and Jinhui Jian
11
What’s Special for XML Stream Processing?
<Biditems>
 <book year="2001">
  <title>Dream Catcher</title>
  <author><last>King</last><first>S.</first></author>
  <publisher>Bt Bound</publisher>
  <price> 30 </price>
 </book>
 …
Tokens arrive one by one along a timeline: <biditems> <book> <title> Dream Catcher </title> …
Pattern retrieval + filtering + restructuring:
FOR $b in stream(biditems.xml)//book
LET $p := $b/price, $t := $b/title
WHERE $p < 20
RETURN <Inexpensive> $t </Inexpensive>
A token is not a direct counterpart of a tuple.
[Diagram: element tree of one book — year 2001, title, author (last King, first S.), publisher Bt Bound, price 30.]
Pattern Retrieval on Token Streams
12
Two Computation Paradigms
Automata-based [yfilter02, xscan01, xsm02, xsq03, xpush03…]
Algebraic [niagara00, …]
The Raindrop framework intends to integrate both paradigms into one.
13
Automata-Based Paradigm
FOR $b in stream(biditems.xml)//book
LET $p := $b/price, $t := $b/title
WHERE $p < 20
RETURN <Inexpensive> $t </Inexpensive>
[Diagram: automaton — state 1 steps on "book" to state 2 (matching //book); state 2 steps on "price" to state 3 (//book/price) and on "title" to state 4 (//book/title).]
Auxiliary structures for:
1. Buffering data
2. Filtering
3. Restructuring
14
Algebraic Computation
FOR $b in stream(biditems.xml)//book
LET $p := $b/price, $t := $b/title
WHERE $p < 20
RETURN <Inexpensive> $t </Inexpensive>

Logical plan:
  Navigate $b, /price -> $p
  Navigate $b, /title -> $t
  Select $p < 30
  Tagger

Rewrite by “pushing down selection” — rewritten logical plan:
  Navigate //$b
  Navigate $b, /price -> $p
  Select $p < 30
  Navigate $b, /title -> $t
  Tagger

Choose low-level implementation alternatives — physical plan:
  Navigate-Index $b, /price -> $p
  Select $p < 30
  Navigate-Scan $b, /title -> $t
  Tagger

[Diagram: input element tree (book: title, author(last, first), publisher, price) and element streams <book>…</book>, <title>…</title> feeding the plans.]
15
Observations
Each paradigm has deficiencies; the two paradigms complement each other.

Automata paradigm:
  Good for pattern retrieval on tokens
  Needs patches for filtering and restructuring
  Presents all details at the same low level
  Little studied as a query processing paradigm
Algebra paradigm:
  Does not support token inputs
  Good for filtering and restructuring
  Supports multiple descriptive levels (declarative -> procedural)
  Well studied as a query processing paradigm
16
How to Integrate Two Paradigms
17
How to Integrate the Two Models? Design choices:
  Extend the algebraic paradigm to support automata?
  Extend the automata paradigm to support algebra?
  Come up with a completely new paradigm?
We extend the algebraic paradigm to support automata:
  Practical — reuse and extend existing algebraic query processing engines
  Natural — present the details of automata computation at a low level, and the semantics of automata computation (target patterns) at a high level
18
Raindrop: Four-Level Framework
Semantics-focused Plan
Stream Logical Plan
Stream Physical Plan
Stream Execution Plan
Abstraction level: from high (declarative) down to low (procedural)
19
Level I: Semantics-focused Plan [Rainbow-ZPR02]
Expresses the query semantics regardless of stored or stream input sources.
Reuses existing techniques for stored XML processing:
  Query parser
  Initial plan constructor
  Rewriting optimization (decorrelation, selection push down, …)
20
Example Semantics-focused Plan
FOR $b in stream(biditems.xml)//book
LET $p := $b/price, $t := $b/title
WHERE $p < 20
RETURN <Inexpensive> $t </Inexpensive>

Operator chain:
  NavUnnest $S1, //book -> $b
  NavNest $b, /price/text() -> $p
  NavNest $b, /title -> $t
  Select $p < 30
  Tagger “Inexpensive”, $t -> $r

[Diagram: the input document flowing through the chain, binding $S1 to <Biditems>…</Biditems>, $b to each <book>…</book>, $p to 30, and $t to “Dream Catcher”.]
21
Level II: Stream Logical Plan
Extends the semantics-focused plan to accommodate tokenized stream inputs:
  New input data format: contextualized tokens
  New operators: StreamSource, Nav, ExtractUnnest, ExtractNest, StructuralJoin
  New rewrite rules: push-into-automata
22
One Uniform Algebraic View
XML data stream -> token-based plan (automata plan) -> tuple stream -> tuple-based plan -> query answer
Together these form one algebraic stream logical plan.
23
Modeling the Automata in the Algebraic Plan: Black Box [XScan01] vs. White Box
FOR $b in stream(biditems.xml)//book
LET $p := $b/price, $t := $b/title
WHERE $p < 20
RETURN <Inexpensive> $t </Inexpensive>

Black box: a single XScan operator computing $b := //book, $p := $b/price, $t := $b/title.
White box:
  Navigate $S1, //book -> $b
  Navigate $b, /price -> $p
  Navigate $b, /title -> $t
  ExtractNest $b, $p
  ExtractNest $b, $t
  StructuralJoin $b
26
Example Uniform Algebraic Plan
FOR $b in stream(biditems.xml)//book
LET $p := $b/price, $t := $b/title
WHERE $p < 30
RETURN <Inexpensive> $t </Inexpensive>

Token-based part (automata plan):
  Navigate $S1, //book -> $b
  Navigate $b, /price -> $p
  Navigate $b, /title -> $t
Tuple-based part:
  ExtractNest $b, $p
  ExtractNest $b, $t
  StructuralJoin $b
  Select $p < 30
  Tagger “Inexpensive”, $t -> $r
27
From Semantics-focused Plan to Stream Logical Plan
Semantics-focused plan:
  NavUnnest $S1, //book -> $b
  NavNest $b, /price/text() -> $p
  NavNest $b, /title -> $t
  Select $p < 30
  Tagger “Inexpensive”, $t -> $r
Apply “push into automata” — stream logical plan:
  Nav $S1, //book -> $b
  Nav $b, /price/text() -> $p
  Nav $b, /title -> $t
  ExtractNest $b, $p
  ExtractNest $b, $t
  StructuralJoin $b
  Select $p < 30
  Tagger “Inexpensive”, $t -> $r
28
Level III: Stream Physical Plan
For each stream logical operator, defines how to generate outputs from given inputs.
Multiple physical implementations may be provided for a single logical operator.
Automata details of some physical implementations are exposed at this level: Nav, ExtractNest, ExtractUnnest, StructuralJoin.
29
One Implementation of Extract / Structural Join
Token input: <biditems> <book> <title> Dream Catcher </title> … </book> …
[Diagram: automaton (state 1 steps on "book" to state 2; state 2 steps on "title" to state 3 and on "price" to state 4) drives Nav ., //book -> $b; Nav $b, /title -> $t; Nav $b, /price -> $p. ExtractNest $b, $t and ExtractNest $b, $p buffer the <title>…</title> and <price>…</price> elements, and SJoin //book combines them per book.]
30
Level IV: Stream Execution Plan
Describes the coordination between operators — when to fetch inputs:
  when the input operator generates one output tuple,
  when the input operator generates a batch,
  when a time period has elapsed, …
The potentially unstable arrival rates of streams make a fixed scheduling strategy unsuitable: a scheduled stream whose data is delayed may stall the engine, while bursty data on an unscheduled stream may cause overflow.
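The three fetch policies above can be sketched as small predicates; this is an illustrative Python sketch (the function and parameter names are ours, not CAPE's API):

```python
import time

# Sketch of three coordination policies: fetch when one tuple is ready,
# when a batch has accumulated, or when a time period has elapsed.
# All names here are illustrative, not taken from the CAPE engine.

def per_tuple(queue, state):
    """Fetch as soon as the input operator has produced one tuple."""
    return len(queue) >= 1

def per_batch(queue, state, batch_size=8):
    """Fetch only once a batch has accumulated (reduces context switching)."""
    return len(queue) >= batch_size

def per_period(queue, state, period=0.05):
    """Fetch whenever the configured time period has elapsed."""
    now = time.monotonic()
    if now - state.get("last_fetch", 0.0) >= period:
        state["last_fetch"] = now
        return True
    return False
```

A scheduler would evaluate the chosen predicate before letting a downstream operator pull from its input queue.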
31
Raindrop: Four-Level Framework (Recap)
Semantics-focused Plan — expresses the semantics of the query regardless of input sources
Stream Logical Plan — accommodates tokenized input streams
Stream Physical Plan — describes how operators manipulate given data
Stream Execution Plan — decides the coordination among operators
32
Optimization Opportunities
33
Optimization Opportunities
Semantics-focused Plan — general rewriting (e.g., selection push down)
Stream Logical Plan — break-linear-navigation rewriting
Stream Physical Plan — choosing physical implementations
Stream Execution Plan — choosing execution strategies
34
From Semantics-focused to Stream Logical Plan: In or Out?
XML data stream -> token-based plan (automata plan) -> tuple stream -> tuple-based plan -> query answer
Pattern retrieval can stay in the semantics-focused (tuple-based) part, or be pushed into the automata via the “push into automata” rewrite.
35
Plan Alternatives
Alternatives range from keeping pattern retrieval out of the automata to pushing it fully in:
Out:
  NavUnnest $S1, //book -> $b
  NavNest $b, /price -> $p
  NavNest $b, /title -> $t
  Select $p < 30
  Tagger “Inexpensive”, $t -> $r
Mixed:
  Nav $S1, //book -> $b
  ExtractNest $S1, $b
  Navigate /price
  Select price < 30
  Navigate book/title
  Tagger
In:
  Nav $S1, //book -> $b
  Nav $b, /price -> $p
  Nav $b, /title -> $t
  ExtractNest $b, $p
  ExtractNest $b, $t
  SJoin //book
  Select price < 30
  Tagger
36
Experimentation Results
[Chart: execution time (ms, 25,000–50,000) on an 85 MB XML stream as the selectivity of the Select varies from 0.1 to 0.9, for plans with 1 to 5 Navs pushed into the automata.]
37
Contributions Thus Far
Combined the automata-based and algebra-based paradigms into one uniform algebraic paradigm.
Provided four layers in the algebraic paradigm: query semantics are expressed at the high layers, while automata computation on streams is hidden at the low layers.
Supported optimization in an iterative manner (from high abstraction level down to low abstraction level).
Illustrated the enriched optimization opportunities by experiments.
38
On-Going Issues To Be Tackled
Exploit XML schema constraints for query optimization
Costing / query optimization of plans
On-the-fly migration into/out of the automaton
Physical implementation strategies of operators
Load shedding from an automaton
PART II: On-line Query Plan Migration
Joint work with Yali Zhu
40
Motivation for Migration
An initially good query plan may become less effective over time due to:
  Changes in stream data distributions (selectivity)
  Changes in data arrival rates (operator overload)
  Registration of new queries / de-registration of existing queries
  Changes in the resources available to query evaluation
  Changes in quality-of-service requirements
41
A Simple Motivating Example
The old plan answers query ABCD by joining AB and CD.
Register a new query BC: keeping the old plan would compute BC separately alongside AB and CD.
Optimized plan: answer ABCD via joins AD and BC, so the BC join is shared with the new query.
Migration: move the running plan from the (AB, CD) shape to the (AD, BC) shape on the fly.
42
On-line Plan Optimization
Detection of Suboptimality in Plan Query Optimization via Plan Rewriting On-line Migration of Subplan
43
Related Work
“Efficient Mid-query Re-optimization of Sub-optimal Query Execution Plans”, Kabra & DeWitt, SIGMOD ’98 (only re-optimizes the part of the query that has not started executing yet)
“On Reconfiguring Query Execution Plans in Distributed Object-Relational DBMS”, K. Ng, ICPADS 1998 (plan cloning)
“Continuously Adaptive Continuous Queries over Streams”, S. Madden, J. Hellerstein, UC Berkeley, ACM SIGMOD 2002
44
Focus on Dynamic Migration
Given a better plan or sub-plan, dynamically migrate the running plan to the given plan while guaranteeing correctness of results:
  No missing results
  No duplicate results
  No incorrect results
45
Join Algorithm: Stateful Operator
Symmetric NLJ. For each new A tuple:
  1. Purge State B using the time-based window constraint W.
  2. Join the new tuple with the tuples remaining in State B.
  3. Output the results to the output queue.
  4. Put the new tuple into State A.
(Symmetrically for each new B tuple.)
[Diagram: node AB with Input Queues A and B, States A and B, and Output Queue AB, showing tuples a1, b1, a2, b2, a3 flowing through and producing a1b1, a1b2, a2b1, a2b2, a3b2.]
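The purge-join-insert loop above can be sketched as follows; this is an illustrative Python sketch under our own naming, not CAPE's actual join implementation:

```python
from collections import deque

class SymmetricWindowJoin:
    """Sketch of a symmetric NLJ over two timestamped streams
    with a shared time-based window constraint W."""

    def __init__(self, window, predicate=lambda a, b: True):
        self.window = window
        self.predicate = predicate
        self.states = {"A": deque(), "B": deque()}  # (timestamp, tuple)

    def insert(self, side, ts, tup):
        """Process one new tuple from side 'A' or 'B'; return join results."""
        other = "B" if side == "A" else "A"
        # 1. Purge the opposite state using the window constraint W.
        while self.states[other] and self.states[other][0][0] < ts - self.window:
            self.states[other].popleft()
        # 2. Join the new tuple with everything left in the opposite state.
        out = [(tup, o) if side == "A" else (o, tup)
               for (_, o) in self.states[other] if self.predicate(tup, o)]
        # 3. Put the new tuple into its own state for future matches.
        self.states[side].append((ts, tup))
        return out

# Example with W = 3 time units:
j = SymmetricWindowJoin(window=3)
j.insert("A", 0, "a1")         # no match yet
j.insert("B", 1, "b1")         # joins with a1
j.insert("A", 5, "a2")         # b1 (ts = 1) is purged, since 1 < 5 - 3
```

Because states only grow within the window, tuples older than W time units are guaranteed never to join again and can be discarded.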
46
So what’s the problem with migration?
Old states in the old plan still need to join with future incoming tuples, so they cannot be discarded.
New tuples arrive randomly and continuously in a streaming system.
[Diagram: old query plan — join AB (States A, B) feeding join ABC (States AB, C), with tuples a1, a2, b1, b2, c1–c3 and intermediate results a1b1, a2b1, a2b2 in flight; new query plan — join BC (States B, C) feeding join ABC (States BC, A).]
47
Box Concept
Migration unit = box. An old box contains an old plan or sub-plan; a new box contains a new plan or sub-plan.
Two equivalent boxes:
  Have the same input queues
  Have the same output queues
  Contain semantically equivalent sub-plans
  Can be migrated from one to the other
48
But How?
We propose two migration strategies — the moving state strategy and the parallel track strategy — compare them via cost models, and evaluate them experimentally.
49
Parallel Track Strategy
The new plan and the old plan co-exist during migration: they run in parallel and share input and output queues.
Window constraints are used to eventually time out the old states; at that point the migration stage is over, the old plan is discarded, and only the new plan runs.
50
A Running Example
[Diagram: old plan (AB then ABC) and new plan (BC then ABC) running in parallel over shared input queues A, B, C and output queue ABC, with window W = 3. As tuples a1–a3, b1, b2, c1, c2 arrive, the old plan produces a1b1c1, a2b1c1, a3b1c1 while the new plan produces a2b2c2 and a3b2c2, and the old states gradually time out.]
51
Pros and Cons
Pros: we don’t need to halt the system to migrate, so there is low delay in generating results.
Cons: overhead — in the old plan, an all-new tuple pair is discarded only at the last node.
52
Moving State Strategy
First, freeze the inputs and “drain out” the old plan.
Then, establish and connect the new plan.
Next, move all old states over to the new states.
Lastly, direct all new input data to the new plan only, and resume processing.
53
Moving State Strategy
State matching: compare the states of the old and new plans.
State moving: if two states match, move the state across.
State re-computation: if there is no match, recompute the state.
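The match/move/recompute steps can be sketched as follows; this is an illustrative Python sketch (state representation and names are ours), where a state is keyed by the set of streams it covers:

```python
# Sketch of the three moving-state steps: match states by the streams
# they cover, move matching states, and recompute the rest by joining
# the base states they are composed of. Representation is illustrative.

def nl_join(left, right):
    """Simple nested-loop join on lists of tuples of stream values."""
    return [l + r for l in left for r in right]

def migrate_states(old_states, new_state_keys):
    new_states = {}
    for key in new_state_keys:
        if key in old_states:
            # State matching succeeded: move the state as-is.
            new_states[key] = old_states[key]
        else:
            # No match (e.g. state BC in the new plan): recompute it
            # from the base states it is composed of.
            parts = [old_states[(s,)] for s in key]
            acc = parts[0]
            for p in parts[1:]:
                acc = nl_join(acc, p)
            new_states[key] = acc
    return new_states

old = {("A",): [("a1",), ("a2",)], ("B",): [("b1",)],
       ("C",): [("c1",)], ("A", "B"): [("a1", "b1")]}
new = migrate_states(old, [("A",), ("B",), ("C",), ("B", "C")])
```

Here states A, B, and C are moved unchanged, while state BC is recomputed by joining the moved B and C states, mirroring the running example.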
54
Abstract Description
[Diagram: old plan (AB then ABC) with in-flight tuples a1, a2, b1, c1 and intermediate results a1b1, a2b1; new plan (BC then ABC).]
States in the old plan: A, B, C, AB.
States required by the new plan: A, B, C, BC.
55
Intermediate States Sharing
We can share an intermediate state such as BC only if:
  the inputs for both plans are exactly the same, and
  tuples arriving at the same state have passed exactly the same predicates.
Both conditions must hold for any sharing to be possible.
[Diagram: plan ABCD built as ABC (from A and BC) joined with D, versus plan ABCD built as BC joined with AD — state BC appears in both plans and could be shared.]
56
Moving State Strategy
[Diagram: moving-state migration from the old plan (AB then ABC) to the new plan (BC then ABC), W = 3. States A and B are moved; state BC is recomputed (b1c1, b2c2); draining the old plan and resuming on the new plan produces a3b1c1, a2b2c2, a3b2c2 on output queue ABC.]
57
Why Two Pointers Are Needed
Two nodes sharing the same state may see different contents. Each node therefore keeps two pointers per associated state:
  First — points to the first tuple of the state visible to that node
  Last — points to the last tuple of the state visible to that node
[Example: shared State B holds b1, b2, b3, b4; node AB sees b1–b2, while node BC sees b2–b4.]
58
Comparing the Two Migration Algorithms
The algorithms distribute work differently between the old and new plans. (N: the tuple arrives after the migration start time; O: the tuple arrives before it. New/Old: which query plan computes the result.)

A B C   Algorithm 1   Algorithm 2
O O O   Old           Old
O O N   Old           New
O N O   Old           New
N O O   Old           New
N N O   Old           New
N O N   Old           New
O N N   Old           New
N N N   New           New
59
Which algorithm performs better — which is faster, which saves more cost?
The new query plan should outperform the old one. In algorithm 1 the old plan handles 7 of the 8 cases; in algorithm 2 the new plan handles 7 of the 8 cases, which seems like a win.
However, algorithm 2 needs extra cost to re-compute intermediate states. So cost models are needed!
60
Cost Model Assumptions
All joins are binary NL joins.
Assume the statistics of each node in the query plan are already known: input arrival rates and join node selectivities.
We compute the processing power needed during the period in which migration happens — not the real power the system uses. The two differ because of resource limitations in a real system.
61
Cost Model Assumptions (cont.) Assume tuple processing time is the same
for tuples of different sizes. Assume when migration starts, the old query
plan has passed its start-up stage and fully running States are at their max size, controlled by
window constraints. Window constraints
Time-based. Same over all streams in a join.
62
Running Example Revisit
Old query plan: join AB (States A, B) feeding join ABC (State AB, State C).
New query plan: join BC (States B, C) feeding join ABC (State BC, State A).
[Diagram: tuples a1, a2, b1, b2, c1–c3 and intermediate results a1b1, a2b1, a2b2 in flight.]
63
Symbol definitions:
λa, λb, λc: average arrival rates (tuples per time unit) on streams A, B, and C.
σAB, σABC_old: the selectivities of nodes AB and ABC in the old query plan.
σBC, σABC_new: the selectivities of nodes BC and ABC in the new query plan.
W: the time-based window constraint over all joins, for example 5 time units.
t: any t time units after migration has started.
Cj: cost of joining one pair of tuples, including the cost of accessing the 2 tuples, comparing their values, and so on.
Cs: cost of inserting/deleting a tuple to/from a state.
Ct: cost of accessing a tuple, for example checking a tuple’s timestamp.
|input|: size of the state for one input of a node. For example, |A| represents the size of State A in node AB, and |A| = λa·W.
64
Cost Model for Algorithm I — old plan part
Cost for node AB in the old plan (states start full):
  C_AB = cost of purge + cost of insert + cost of join
       = [Cs·(λa/λb)·λb + Cs·(λb/λa)·λa + Cs·λb + Cs·λa + Cj·(λa|B| + λb|A|)]·t
       = [2·Cj·λa·λb·W + 2·Cs·(λa + λb)]·t                    --- formula (1)
Output rate of node AB:
  λAB = (λa|B| + λb|A|)·σAB = 2·λa·λb·W·σAB                   --- formula (2)
  N_AB = 2·λa·λb·W·σAB·t
Apply the same formulas to each node in the old query; summing the per-node costs gives the total cost.
[Diagram: join AB (States A, B) feeding join ABC (State AB, State C).]
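Formulas (1) and (2) can be written out directly as functions; this is a sketch using the symbol table above (function and parameter names are ours):

```python
# Formulas (1) and (2) from the cost model, as plain functions.
# lambda_a / lambda_b = arrival rates, w = window, sigma_ab = node AB's
# selectivity, cj / cs = per-tuple join / state-maintenance costs,
# t = time units since migration started.

def cost_ab(lambda_a, lambda_b, w, cj, cs, t):
    """Formula (1): purge + insert + join cost of node AB over t units."""
    return (2 * cj * lambda_a * lambda_b * w + 2 * cs * (lambda_a + lambda_b)) * t

def output_rate_ab(lambda_a, lambda_b, w, sigma_ab):
    """Formula (2): output rate of node AB with full states (|A| = lambda_a * W)."""
    return 2 * lambda_a * lambda_b * w * sigma_ab
```

For example, with λa = λb = 1, W = 5, Cj = Cs = 1 and t = 10, formula (1) gives (10 + 4)·10 = 140 units of work.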
65
Cost Model for Algorithm I — new plan part
Cost for node BC in the new plan (its state starts empty, so no purge is needed while t ≤ W):
  C_BC = join cost + insert cost = t²·λb·λc·Cj + (λb·t + λc·t)·Cs,  for t ≤ W   --- formula (3)
In the i-th time unit after the migration start, the output rate is:
  λi = (2i − 1)·λb·λc·σBC
The total number of tuples generated by any time t (t ≤ W) is:
  N_BC = Σ_{i=1..t} λi = t²·λb·λc·σBC                                            --- formula (4)
The same formulas apply to each node in the new plan.
[Diagram: join BC (States B, C) feeding join ABC (State BC, State A).]
66
Cost Model for Algorithm II
Extra cost is needed to compute the new states; in our running example, state BC. The extra cost is:
  C_stateBC = Cj·|B|·|C| = Cj·(λb·W)·(λc·W) = W²·λb·λc·Cj
Formulas (1) and (2) still give the cost of each node in the old plan, because all states are full while the new states are recomputed.
Adding the two together gives the total cost of algorithm II.
67
Analysis on Cost Models
Several parameters control the performance of the two migration algorithms Arriving rate Join node selectivity Window size Time
Costs may not be linear by time As for algorithm I, new plan part
Total migration time largely depends on window size
Design experiments by varying those parameters.
68
[Chart: migration time (ms, 0–14,000) vs. global window size (ms, 0–5,000) for T_MS (moving state) and T_PT (parallel track).]
69
[Chart: output rate (tuples/sec, 0–2,500) vs. time (ms, 0–180,000) for MS, PT, the new plan, and the old plan.]
70
Some remaining challenges
Alternate Migration Strategies Selection of Box Sizes Dynamic Optimization and Migration Comparison Study to Eddies
PART III: Adaptive Scheduler Selection Framework
Joint work with Brad Pielech and Tim Sutherland
72
Idea
We propose a novel adaptive selection of scheduling strategies.
Observations:
1. The scheduling algorithm has a large impact on system behavior.
2. Utilizing a single scheduling algorithm to execute a continuous query is not sufficient, because all scheduling algorithms have inherent flaws or tradeoffs.
Hypothesis: adaptively choosing the next scheduling strategy can leverage the strengths and weaknesses of each and outperform any single strategy.
73
Continuous Query Issues
The arrival rate of data is unpredictable, and the volume of data may be extremely high.
Different domains may have different service requirements.
A scheduling strategy such as Round Robin or FIFO is designed to resolve one particular scheduling problem (minimize memory, maximize throughput, etc.).
What happens if we have multiple problems to solve?
74
Scheduling Example
σ = tuples output / tuples input; C = operator processing cost in time units.
Pipeline: Stream -> Operator 1 (σ = 0.9, C = 1) -> Operator 2 (σ = 0.1, C = 0.25) -> Operator 3 (σ = 1, C = 0.75) -> end user.
• Operator 2 is the quickest and most selective; Operator 1 is the slowest and least selective.
• Every 1 time unit, a new tuple arrives in Q1, starting at t0.
• When told to run, an operator processes at most 1 tuple; any extra is left in its queue for a later run.
• An operator takes C time units to process its input, regardless of the size of the input.
• An operator’s output size = σ × number of input tuples (if an operator inputs 1 tuple and σ = 0.9, it outputs 0.9 tuples).
• Assume zero time for context switches and that all tuples are the same size.
75
Scheduling Example II
Two scheduling strategies:
1) FIFO: start at the leaf and process the newest tuple until completion — 1, 2, 3, 1, 2, 3, …
2) Greedy: schedule the operator with the most tuples in its input buffer — 1, 1, 1, 2, 1, 1, 1, 2, …, 3
We compare throughput and total queue sizes for both algorithms over the first few time units on the pipeline above.
76
Scheduling Example: FIFO
FIFO: start at the leaf and process the newest tuple until completion.
[Table: queue sizes and tuples output per time step (t = 0 through 4) for the three operators under FIFO.]
• FIFO’s queue size grows very quickly: it spends 1 time unit processing Operator 1, then 1 time unit processing Operators 2 and 3; during these 2 time units, 2 tuples arrive in Q1.
• FIFO outputs 0.09 tuples to the end user every 2 time units.
77
Scheduling Example: Greedy
Greedy: schedule the operator with the largest input queue.
[Table: queue sizes and tuples output per time step for the three operators under Greedy.]
• Greedy’s queue sizes grow at a slower rate because Operator 1 is run more often, but tuples remain queued for long periods in Operators 2 and 3 until their queue sizes become larger than Operator 1’s.
• Greedy finally outputs a tuple at about t = 16. At t = 16, Greedy outputs 1 tuple; by then, FIFO has output 0.72 (0.09 × 8) tuples.
78
Scheduling Example Wrap-up
FIFO:
+ Outputs tuples at regular intervals
− Q1 grows very quickly
− Output rate is low (0.045 tuples / unit)
− Does not utilize operators fully: per run, O1 processes 1 tuple, O2 receives 0.9, O3 receives 0.09 (the maximum is 1 tuple)
Greedy:
+ Queue sizes grow less quickly than FIFO’s
+ Output rate is higher: 0.0625 tuples / unit
+ Utilizes operators more fully: each operator runs with 1 tuple each time
− Some tuples stay in the system for a long time
− Long delay before any tuples are output
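The FIFO/Greedy trade-off above can be reproduced with a tiny simulation. This is a simplified sketch of the slide's model (fractional tuples stand in for selectivity, and exact Greedy numbers depend on tie-breaking and arrival timing, so only the qualitative behavior should be read from it):

```python
# Three-operator chain from the example: Op1 (sigma=0.9, C=1),
# Op2 (sigma=0.1, C=0.25), Op3 (sigma=1, C=0.75); one tuple arrives
# at queue 1 per time unit starting at t=0.

SIGMA = [0.9, 0.1, 1.0]
COST = [1.0, 0.25, 0.75]

def simulate(policy, horizon):
    q = [0.0, 0.0, 0.0]            # input queues of Op1..Op3
    out, first_out = 0.0, None     # end-user output and first-output time
    t, next_arrival, step = 0.0, 0.0, 0
    while t < horizon:
        while next_arrival <= t:   # one new tuple per time unit on queue 1
            q[0] += 1.0
            next_arrival += 1.0
        i = policy(q, step)
        step += 1
        amount = min(1.0, q[i])    # an operator processes at most 1 tuple
        q[i] -= amount
        produced = SIGMA[i] * amount
        if i < 2:
            q[i + 1] += produced
        elif produced > 0:
            out += produced
            first_out = t + COST[i] if first_out is None else first_out
        t += COST[i]               # a run costs C time units regardless

    return out, first_out

fifo = lambda q, step: step % 3           # 1, 2, 3, 1, 2, 3, ...
greedy = lambda q, step: q.index(max(q))  # operator with the fullest queue

fifo_out, fifo_first = simulate(fifo, 20)
greedy_out, greedy_first = simulate(greedy, 20)
```

Under this model FIFO delivers its first 0.09-tuple result at t = 2 and about 0.045 tuples per unit thereafter, while Greedy keeps running Operators 1 and 2 and starves the output operator for much longer, matching the qualitative picture on the slides.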
79
So? What is our point?
A single scheduling strategy is NOT sufficient when dealing with varying input stream rates, data volumes, and service requirements!
80
New Adaptive Framework
In response to this need, we propose a novel technique that selects between several scheduling strategies based on current system conditions and quality-of-service requirements.
81
Adaptive Framework
• Choosing among multiple scheduling strategies can leverage the strengths of each strategy and minimize the use of a strategy when it would not perform well.
• Allowing a user to input service requirements means the CQ system can adapt to the user’s needs, rather than to a static set of needs fixed by the CQ system.
82
Quality of Service Preferences
Each application can specify its service requirements as a weighted list of behaviors to be maximized or minimized as desired.
Assumptions / restrictions:
  One (global) preference set at a time
  Preferences can change during execution
  Only relative behavior can be specified
83
Service Preferences II
Input three parameters:
  Metric: any statistic that is calculated by the system
  Quantifier: maximize or minimize the given metric
  Weight: the relative weight / importance of this metric (the weights sum to exactly 1)

Metric        Quantifier   Weight
Queue Sizes   Minimize     0.40
Throughput    Maximize     0.60

Currently supported metrics: 1. Throughput  2. Queue Sizes  3. Delay
84
Adaptive Selection Overview
Given a table of service preferences and a set of candidate scheduling algorithms:
1. Initially run all candidate algorithms once to gather statistics about their performance
2. Assign a score to each algorithm based on how well it has met the preferences relative to the other algorithms (score formulas on the next slides)
3. Choose the scheduling algorithm that will best meet the preferences based on how the algorithms have performed thus far
4. Run that algorithm for a period of time, recording statistics
5. Repeat steps 2–4 until the query or the streams end
85
Adaptive Formulas
  z_i = decay^time · (S_i − H̄) / (max(H) − min(H)) + 0.5
  s_score = Σ_{i=0}^{I} z_i · w_i
where z_i is the normalized statistic for preference i, S_i the current value of that statistic, I the number of preferences, H its historical category (with mean H̄), and decay the decay factor.
“A scheduler’s score is computed by summing, over the statistics defined by the user, the normalized statistic score times the weight of the statistic.”
86
Choosing Next Scheduling Strategy
Scheduler's Score of 4 Algorithms
Algorithm 1
Algorithm 2
Algorithm 3
Algorithm 4
Once the schedulers scores have been calculated, the next strategy has to be chosen.
• Should explore all strategies initially such that it can learn how each will perform
• Periodically should rotate strategies because a strategy that did poorly before, could be viable now
• Remember, the score for the last ran algorithm is not updated, only the other candidates have their score updated
Roulette Wheel [MIT99]
•Chooses next algorithm with a probability equivalent to its score
•Assign an initial score to each strategy such that each will have a chance to run
•Favors the better scoring algorithms, but will still pick others.
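Roulette-wheel selection is easy to sketch; this illustrative Python version (names are ours) picks the next strategy with probability proportional to its score:

```python
import random

# Roulette-wheel selection: spin a point on [0, total score) and walk
# the cumulative score until it is covered, so better-scoring strategies
# are favored but poorly-scoring ones still get occasional runs.

def roulette(scores, rng=random):
    total = sum(scores.values())
    spin = rng.uniform(0.0, total)
    acc = 0.0
    for name, score in scores.items():
        acc += score
        if spin <= acc:
            return name
    return name  # guard against floating-point rounding at the edge

rng = random.Random(42)
scores = {"FIFO": 3.0, "Greedy": 1.0}
picks = [roulette(scores, rng) for _ in range(1000)]
```

With a 3:1 score ratio, "FIFO" is chosen about three times as often as "Greedy" over many draws, while a zero-scoring strategy is never chosen until its score is seeded above zero.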
87
Experiments
Three parameters: 1. number of streams, 2. arrival pattern, 3. number of service preferences.
88
Experiment Setup
Five different query plans with Select and window-join operators.
Incoming streams use simulated data with a Poisson arrival pattern; the mean arrival time is altered to control burstiness.
We want to show that the adaptive strategy meets the preferences better than any single algorithm — if not, the technique is not worthwhile.
89
Single Stream Result (2 Requirements)
Overall algorithm performance: 70% output rate, 30% average tuple delay.
[Chart: weighted performance score (0–1.6) vs. time (10–290) for Adaptive, PTT, and MTIQ.]
The adaptive strategy performs as well as PTT in this environment.
90
Single Stream Result (3 Requirements)
Overall algorithm performance: 20% memory, 60% output rate, 20% tuple delay.
[Chart: weighted performance score (0–2.5) vs. time (10–290) for Adaptive, PTT, and MTIQ.]
91
Multi Stream Result (2 Requirements)
[Charts: weighted performance score vs. time (10–290) for Adaptive, PTT, and MTIQ under two preference sets — 100% delay, and 70% memory / 30% delay.]
The adaptive framework performs as well as, if not better than, both individual scheduling algorithms under differing service requirements.
92
Multi Stream Result (3 Requirements)
[Charts: weighted performance score vs. time for Adaptive, PTT, and MTIQ under four preference sets — 60% memory / 20% output rate / 20% tuple delay; 60% output rate / 20% memory / 20% tuple delay; 60% tuple delay / 20% memory / 20% output rate; and 33% each of tuple delay, memory, and output rate.]
93
Related Work Comparison
Research Work — Comparison
Aurora [CCC02] — More complex QoS model; makes use of alternate adaptive techniques
STREAM [MWA03] — Only meets the memory requirement
NiagaraCQ [CDT00] — Only adapts prior to query execution; concerned more with generating optimal query plans
Eddies [AH00] (Telegraph) — Finer-grained adaptive strategy; we look to incorporate it in the future
XJoin [UF00] — Finer-grained technique; we look to incorporate it in the future
[UF01] — Focuses only on maximizing rate; uses a single adaptive strategy
Tukwila [IFF99] [UF98] — Reorganizes plans on the fly; we look to incorporate this in the future
94
Conclusions
Identified a gap in existing CQ research and proposed a novel adaptive technique to address the problem; the technique draws on genetic algorithms and AI research, altering the scheduling algorithm based on how well the execution is meeting the service preferences.
The adaptive strategy is showing promising experimental results:
  It never performs worse than any single strategy
  It often performs as well as the best strategy, and often outperforms it
  It adapts to varying user environments without manual changes of scheduling strategy
95
Overall: many interesting problems arise in this new stream context, and there is room for lots of fun research.
96
Email: [email protected]