2007 1
MIDDLEWARE SYSTEMS RESEARCH GROUP
Denial of Service in Content-based Publish/Subscribe Systems
M.A.Sc. Candidate: Alex Wun
Thesis Supervisor: Hans-Arno Jacobsen
Department of Electrical and Computer Engineering
Department of Computer Science
University of Toronto
v0.4
2007 2
Background Context of Thesis Work
• PADRES middleware platform
  • Content-based Publish/Subscribe (CPS)
  • Originally inspired by distributed dashboard and job scheduling requirements
  • Increasingly motivated by enterprise application integration
• Need to investigate different facets of security for CPS systems
  • Security is among the top concerns in many application scenarios
2007 3
Contributions of Thesis Work
• DoS Characteristics: Attack Taxonomy, Attack Experiments
• DoS Resilience: Commonality Model, Matching Algorithm
• DoS Prevention: Policy Model, Policy Framework
2007 4
Content-based Publish/Subscribe
[Diagram: Publishers (P) and Subscribers (S) connected through a Broker Network that stores subscriptions as filters (functions)]
• Publication (tuple): [(event,prescription), (patientID,123), (age,63), (drug,X) …]
• Subscriptions (Boolean functions):
  • [(event=prescription), (age>50)]
  • [(event=prescription), (drug=Y)]
• “Matching”: evaluating a publication against the stored subscriptions
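The matching step on this slide can be illustrated with a minimal sketch. It assumes that a subscription is a conjunction of (attribute, operator, value) predicates and that a publication is a flat attribute map; the function name `matches` and the labels `S1` and `S2` are hypothetical and are not taken from PADRES.

```python
import operator

# Comparison operators used in the slide's example subscriptions.
OPS = {"=": operator.eq, ">": operator.gt, "<": operator.lt}

def matches(subscription, publication):
    """Return True if every predicate of the subscription holds for the publication."""
    for attr, op, value in subscription:
        if attr not in publication:
            return False  # a missing attribute cannot satisfy the predicate
        if not OPS[op](publication[attr], value):
            return False
    return True

# Publication and subscriptions from the slide; S1/S2 are illustrative labels.
publication = {"event": "prescription", "patientID": 123, "age": 63, "drug": "X"}
subscriptions = {
    "S1": [("event", "=", "prescription"), ("age", ">", 50)],
    "S2": [("event", "=", "prescription"), ("drug", "=", "Y")],
}

matched = [name for name, sub in subscriptions.items() if matches(sub, publication)]
print(matched)  # ['S1']
```

Only the first subscription matches, because the publication carries drug X rather than drug Y.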
2007 5
Matching Performance Optimizations
• Often based on exploiting similarities (overlap) between subscriptions
  • Avoid unnecessary subscription and predicate evaluations
• Can we abstract these optimizations?
  • Formalize content-based Matching Plans (order of subscription and predicate evaluations)
  • Quantify performance of existing optimizations
  • Discover future potential optimizations
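As a rough illustration of how overlap between subscriptions avoids predicate evaluations, the sketch below counts evaluations with and without reusing the result of a shared predicate. The subscriptions, the publication, and the function `match_all` are illustrative assumptions, not the optimization used in any particular system.

```python
import operator

OPS = {"=": operator.eq, ">": operator.gt}

def match_all(subscriptions, publication, share_results):
    """Match every subscription and count how many predicates were evaluated."""
    cache = {}          # predicate -> result, reused only when share_results is True
    evaluations = 0
    matched = []
    for name, predicates in subscriptions.items():
        satisfied = True
        for predicate in predicates:
            if share_results and predicate in cache:
                result = cache[predicate]
            else:
                attr, op, value = predicate
                result = OPS[op](publication[attr], value)
                evaluations += 1
                cache[predicate] = result
            if not result:
                satisfied = False
                break  # conjunction already failed; skip remaining predicates
        if satisfied:
            matched.append(name)
    return matched, evaluations

publication = {"event": "prescription", "age": 63, "drug": "X"}
subscriptions = {
    "S1": (("event", "=", "prescription"), ("age", ">", 50)),
    "S2": (("event", "=", "prescription"), ("drug", "=", "Y")),
}

naive = match_all(subscriptions, publication, share_results=False)
shared = match_all(subscriptions, publication, share_results=True)
print(naive)   # (['S1'], 4)
print(shared)  # (['S1'], 3)
```

Both runs return the same matches; the shared run evaluates the common predicate (event=prescription) once instead of twice.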
2007 6
Commonality Model
For a subscription set {S_1, …, S_m}, a commonality C is expressed as either:
• a Disjunctive Commonality Expression, or
• a Conjunctive Commonality Expression
[Equations: two expressions over S_1 … S_m and C; operators lost in extraction]
A set of commonality expressions is a subscription topology.
• Per-Link Matching
• DNF Subscriptions

• Shared predicates
• Clustering on subscription classes or attributes
• “Pruning” strategies (e.g., number of attributes)
2007 7
Example: Link-Group Topology
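[Equations: commonality expressions over S_1 … S_m with a link L and a commonality C, followed by probability terms over S_1 … S_m and L; layout and operators lost in extraction]
Depth-first algorithm to determine the probabilistically optimal matching plan [Greiner2006] in O(N ln N)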
8
Example: Link-Group Topology
[Figure: two plots, Low Selectivity and High Selectivity]
9
Example: Cluster Topology
• Dramatic scalability effects of clustering in CPS
• Observed trend depends on the proportion of commonalities, not the number of predicates
[Figure: two plots, Simulation and Experimental (in PADRES)]
2007 10
Extended Implication Relationships
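• Between subscriptions: S_1 and S_2, with S_1 = [(a op 3), (b op 4)] and S_2 = [(a op 0)]
• Between predicates: (a op 9) and (a op 3)
• Between commonalities: C_1 and C_2, with C_1 = (tuples op 3) and C_2 = (tuples op 5)
[op: comparison operator lost in extraction; the implication arrows between each pair were lost as well]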
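A small sketch of implication checking between predicates and between subscriptions follows. It assumes every predicate has the form attribute > constant, which is an assumption made here for illustration (the comparison operators on the slide are not legible in this transcript). The function names are hypothetical.

```python
def predicate_implies(p, q):
    """(attr > c_p) implies (attr > c_q) when both use the same attribute and c_p >= c_q."""
    (attr_p, c_p), (attr_q, c_q) = p, q
    return attr_p == attr_q and c_p >= c_q

def subscription_implies(s1, s2):
    """Sufficient check for conjunctive subscriptions: s1 implies s2 if every
    predicate of s2 is implied by at least one predicate of s1."""
    return all(any(predicate_implies(p, q) for p in s1) for q in s2)

# Assumed '>' versions of the slide's operands.
S1 = [("a", 3), ("b", 4)]   # (a > 3) and (b > 4)
S2 = [("a", 0)]             # (a > 0)

print(predicate_implies(("a", 9), ("a", 3)))  # True
print(predicate_implies(("a", 3), ("a", 9)))  # False
print(subscription_implies(S1, S2))           # True
print(subscription_implies(S2, S1))           # False
```

The subscription check is deliberately conservative: it only reports implications it can justify predicate by predicate.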
2007 11
Simple Implication Expressions
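[Equations: implication expressions relating a predicate (a op 0), subscriptions S_1, S_2, S_3, and a list L over attribute a; predicate chains (a op 0), (a op 3), (a op 5) and (a op 9), (a op 1), (a op 2); operators lost in extraction]
• Mixed operator lists, such as (a op 10), (a op 10), (a op 3) with differing operators, are currently not supported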
2007 12
Matching Engine Architecture
[Diagram: matching engine data structures]
• Predicate pool
• Subscription pool
• All predicates index
• Shared pred. index (conj. comm.)
• Subscription index
• Overlay links (disj. comm.)
• Legend: Map; Sorted List (Map); Node elements
2007 13
Matching Engine Architecture
[Diagram: a node element and its implication lists]
• Node element states: True, False, D.C.
• Node element types: Subscription, Predicate, Overlay link, (conj. comm.), (DNF subs)
• Implication lists link node elements to other node elements
2007 14
Subscription Insertion: Predicate Insertion
[Diagram: matching engine data structures (predicate pool, conj. comm., subscription pool, all predicates index, shared pred. index (conj. comm.), subscription index, overlay links (disj. comm.)) during insertion]
• Unknown predicate priorities default to head of list
2007 15
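Subscription Insertion: Implication List Update
[Diagram: predicates on attribute a with operator >, sorted by value (3 4 5 6 7 8 9), shown three times to illustrate P’s True -> True list, X_i’s False -> False list, and P’s False -> False list for an inserted predicate P]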
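One way such an implication list update could be derived is sketched below for predicates of the form a > v kept in a sorted list. This is an assumption about the mechanism, not the implementation from the slides; `insert_gt_predicate` is a hypothetical helper that returns full lists, whereas a real engine might only link neighbouring predicates.

```python
import bisect

def insert_gt_predicate(sorted_values, v):
    """Insert predicate (a > v) into a sorted list of constants and return
    (true_to_true, false_to_false) for the inserted predicate.

    If (a > v) is True, every (a > c) with c < v is True as well.
    If (a > v) is False, every (a > c) with c > v is False as well.
    """
    i = bisect.bisect_left(sorted_values, v)
    if i == len(sorted_values) or sorted_values[i] != v:
        sorted_values.insert(i, v)
    true_to_true = sorted_values[:i]
    false_to_false = sorted_values[i + 1:]
    return true_to_true, false_to_false

values = [3, 4, 5, 7, 8, 9]          # existing predicates a > 3, a > 4, ...
t2t, f2f = insert_gt_predicate(values, 6)
print(values)  # [3, 4, 5, 6, 7, 8, 9]
print(t2t)     # [3, 4, 5]
print(f2f)     # [7, 8, 9]
```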
2007 16
Performance Experiments
• Generated subscription workloads from ~50 to ~200,000 predicates
  • {5,10,15,20} Avg. Predicates x {10,100,1000,10000} Subscriptions
• 4 different subscription topologies
  • Low/High clustering (5/200 classes)
  • Low/High sharing (subscription overlap)
• Randomly generated and matched 100 publications
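The workload sizes quoted above follow from multiplying the two parameter sets. The short sketch below enumerates the grid, assuming (this is not stated on the slide) that every combination of size and topology was run.

```python
from itertools import product

avg_predicates = [5, 10, 15, 20]
subscription_counts = [10, 100, 1000, 10000]

# Four topologies: each combination of clustering level and sharing level.
topologies = list(product(["low clustering", "high clustering"],
                          ["low sharing", "high sharing"]))

# Approximate total predicate count per workload.
sizes = [p * s for p, s in product(avg_predicates, subscription_counts)]

# Assumed full cross product of workload sizes and topologies.
configs = list(product(avg_predicates, subscription_counts, topologies))

print(min(sizes), max(sizes))  # 50 200000
print(len(topologies))         # 4
print(len(configs))            # 64
```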
17
[Figure: 2x2 grid of plots; columns Low Sharing and High Sharing, rows High Cluster and Low Cluster]
18
[Figure: 2x2 grid of plots; columns Low Sharing and High Sharing, rows High Cluster and Low Cluster]
2007 19
Cross-cluster Attributes
S_1 = [(class = c_1), (a_1 op 5), …]
S_2 = [(class = c_2), (a_1 op 10), …]
S_3 = [(class = c_1), (a_1 op 2), …]
[op: comparison operator lost in extraction]
2007 20
Cross-cluster Attributes
S_1 = [(class = c_1), (a_1^{c_1} op 5), …]
S_2 = [(class = c_2), (a_1^{c_2} op 10), …]
S_3 = [(class = c_1), (a_1^{c_1} op 2), …]
[op: comparison operator lost in extraction]
21
[Figure: 2x2 grid of plots; columns Low Sharing and High Sharing, rows High Cluster and Low Cluster]
22
[Figure: 2x2 grid of plots; columns Low Sharing and High Sharing, rows High Cluster and Low Cluster]
2007 23
Conclusions
• Model captures many existing and potential optimization techniques
• Implication list approach significantly reduces the number of predicate evaluations in all workloads
  • Superior for expensive predicates
• Implementation trade-off:
  • Control cascade overhead/usage
  • Cluster/Index implication lists as well
  • Optimize iteration over marked nodes
  • Additional clustering/indexing beyond only event class
• Future work
  • Additional conjunctive/disjunctive commonalities, implication relationships?
  • Implication relationships relevant to message distribution?
  • Rule-based implementation of implication/commonality algorithm?
Thank You – Questions?
2007 24
*** Extra Slides ***
25
High clustering, High sharing
26
Low clustering, High sharing
27
Low clustering, Low sharing
28
High clustering, Low sharing
2007 29
Publication Matching: Commonality Phase
[Diagram: matching engine data structures (predicate pool, subscription pool, all predicates index, shared pred. index (conj. comm.), subscription index, overlay links (disj. comm.))]
• Termination Condition (TC): all overlay links have been decided
• Iterate and evaluate while TC is false
2007 30
Publication Matching: Implication Cascade
[Diagram: node elements with True / False / D.C. implication lists]
• If not already determined, evaluate
• Cascade and mark
• “Advanced” implications are handled with a method call triggered by a state change (e.g., a predicate becomes true and calls countTruePredicate() on subscriptions)
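The evaluate, cascade, and mark steps might look roughly like the sketch below. The `Node` class, its list names, and `cascade_and_mark` are hypothetical; the D.C. state and the counting callbacks mentioned on the slide are left out for brevity.

```python
class Node:
    """A node element with an undetermined state and two implication lists."""
    def __init__(self, name):
        self.name = name
        self.state = None          # None = not yet determined
        self.true_to_true = []     # nodes that become True when this node is True
        self.false_to_false = []   # nodes that become False when this node is False

def cascade_and_mark(node, value):
    """Set a node's state and propagate it along the matching implication list.
    Returns the names of all nodes marked by this call."""
    marked = []
    stack = [(node, value)]
    while stack:
        current, state = stack.pop()
        if current.state is not None:
            continue               # already determined; nothing to do
        current.state = state
        marked.append(current.name)
        following = current.true_to_true if state else current.false_to_false
        stack.extend((n, state) for n in following)
    return marked

# Assumed '>' predicates on one attribute: a > 0, a > 3, a > 9.
gt0, gt3, gt9 = Node("a>0"), Node("a>3"), Node("a>9")
gt9.true_to_true = [gt3]
gt3.true_to_true = [gt0]
gt0.false_to_false = [gt3]
gt3.false_to_false = [gt9]

a = 5
marked = cascade_and_mark(gt3, a > 3)   # evaluate one predicate, then cascade
print(marked)      # ['a>3', 'a>0']
print(gt9.state)   # None
```

With a = 5, evaluating (a > 3) also determines (a > 0) without evaluating it, while (a > 9) remains undetermined and still has to be evaluated.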
2007 31
Publication Matching: Subscription Phase
[Diagram: matching engine data structures (predicate pool, subscription pool, all predicates index, shared pred. index (conj. comm.), subscription index, overlay links (disj. comm.))]
• Iterate and evaluate while TC is false
  + Cascade and Mark
  + Cascade and Count
2007 32
Publication Matching: Cleanup Phase
• There is no cleanup phase
  • A counter (Vm) is incremented at the start of each publication matching phase
  • All determined results are versioned (Vd)
  • A determined result is stale if Vd < Vm
• To avoid overflow, reset counter every:
  • 64-bit counter ~= 16x10^18 pubs
  • @ 1000 pub/s ~ 16x10^15 s
  • ~32x10^6 s/year ~ 0.5x10^9 years
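The versioning scheme can be sketched as follows. The class and method names are hypothetical; only the Vm/Vd comparison comes from the slide. The last lines redo the overflow estimate with the exact value of 2^64 instead of the slide's rounded 16x10^18.

```python
class Node:
    """Holds a determined result together with the version it was determined in (Vd)."""
    def __init__(self):
        self.value = None
        self.vd = 0

class MatchingEngine:
    def __init__(self):
        self.vm = 0  # matching version counter (Vm)

    def start_publication(self):
        # Incrementing Vm invalidates every earlier result at once: no cleanup pass.
        self.vm += 1

    def determine(self, node, value):
        node.value = value
        node.vd = self.vm

    def lookup(self, node):
        """Return the node's result, or None if it is stale (Vd < Vm)."""
        if node.vd < self.vm:
            return None
        return node.value

engine = MatchingEngine()
node = Node()

engine.start_publication()
engine.determine(node, True)
fresh = engine.lookup(node)      # True: determined during this publication

engine.start_publication()
stale = engine.lookup(node)      # None: result belongs to the previous publication

# Overflow estimate: 64-bit counter at 1000 pub/s, ~32x10^6 s/year.
years_to_overflow = 2**64 / 1000 / 32e6
print(fresh, stale)
print(f"{years_to_overflow:.2e} years")
```

The exact figure comes out slightly above the slide's 0.5x10^9 years because 2^64 is closer to 18x10^18 than to 16x10^18.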
2007 33
Publication Matching: Sorted Lists
• Commonality/predicate lists sorted by (p+1/N)
  • p is the predicate selectivity
  • N is the number of subscriptions sharing the predicate
• Subscriptions sorted by (1-p)n
  • p is average predicate selectivity
  • n is number of predicates
• Predicate hash sorted by predicate value
• Commonality/predicate/subscription sorting is meant to be extendable with different priority equations
  • Include predicate cost, length of implication lists, etc.
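An extendable priority ordering could be sketched as a sort with a pluggable key function. The default key below reads the slide's expression as p + 1/N and sorts in ascending order; both the reading and the sort direction are assumptions, and the names `PredicateStats`, `default_priority`, and `sort_predicates` are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class PredicateStats:
    name: str
    selectivity: float   # p
    sharing: int         # N, number of subscriptions sharing the predicate

def default_priority(stats):
    """Priority from the slide, read as p + 1/N (assumed: lower value is evaluated first)."""
    return stats.selectivity + 1.0 / stats.sharing

def sort_predicates(stats_list, priority=default_priority):
    """Order predicates by a pluggable priority equation."""
    return sorted(stats_list, key=priority)

stats = [
    PredicateStats("A", selectivity=0.5, sharing=10),    # 0.5 + 0.1  = 0.6
    PredicateStats("B", selectivity=0.1, sharing=1),     # 0.1 + 1.0  = 1.1
    PredicateStats("C", selectivity=0.2, sharing=100),   # 0.2 + 0.01 = 0.21
]

order = [s.name for s in sort_predicates(stats)]
print(order)  # ['C', 'A', 'B']

# A different priority equation can be swapped in, e.g. sharing only.
by_sharing = [s.name for s in sort_predicates(stats, priority=lambda s: -s.sharing)]
print(by_sharing)  # ['C', 'A', 'B']
```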
2007 34
[Figure: 2x2 grid of plots; columns Low Sharing and High Sharing, rows High Cluster and Low Cluster]
2007 35
[Figure: 2x2 grid of plots; columns Low Sharing and High Sharing, rows High Cluster and Low Cluster]
2007 36
[Diagram: Databases and Content-based Publish/Subscribe as Inverse Problems]
• Databases: storing data
  • Tables of DB rows (tuples)
  • Query (Boolean function)
  • Query Plans
• Content-based Publish/Subscribe: storing functions
  • Subscriptions (Boolean functions)
  • Publication (tuple)
  • Matching Plans?
• Scalable Performance