[undergraduate thesis] final defense presentation on cloud publish/subscribe model for top-k...
TRANSCRIPT
1
Cloud based publish/subscribe model for Top-k matching over continuous data streams
Author: Y.S. Horawalavithana, 10002103
Supervisor: Dr. D.N. Ranasinghe
Undergraduate Thesis Defense, January 23, 2015
UNIVERSITY OF COLOMBO SCHOOL OF COMPUTING
SCS 4001: INDIVIDUAL PROJECT
2
Overview
• Motivation
• Target
• Design & Architecture
• Related work
• Dynamic Diversification
• Incremental Top-k
• Implementation
• Evaluation
• Conclusion
• Future work
3
Motivation: "Big Filter"
1.Motivation 2.Target 3.Design & Architecture 4.Related work 5.Dynamic Diversification 6.Incremental Top-k 7.Implementation 8.Evaluation 9.Conclusion 10.Future Work
4
Boolean publish/subscribe
Drawbacks
• A subscriber may be either overloaded with publications or receive too few publications
• It is impossible to compare different matching publications, as ranking functions are not defined
• Partial matching between subscriptions and publications is not supported
5
Top-k publish/subscribe
• Expressive stateful query processing systems
• User-defined parameter k restricts the delivered publications
Pub/Sub Matching
• Top-k pub/sub scoring or ranking
Pub/Sub Indexing
• Indexing to support personalized subscriptions
• Indexing to support continuous Top-k publications retrieval
6
Target
1. How can we define an efficient scoring algorithm that takes both query-independent and query-dependent score metrics into account (relevance, freshness and diversity)?
2. How can we adapt the indexing data structures used in state-of-the-art publish/subscribe systems to support Top-k matching queries under
   a) a large subscription volume,
   b) a high event rate, and
   c) a variety of subscribable attributes?
7
Scope
• Optimize the Top-k heuristic for a specific domain: e-commerce with buyers & sellers
• Subscriptions & publications follow a pre-defined data structure
• The number of incoming publications is modeled as a Poisson random variable
• Retrieve Top-k publications against subscriptions, not the reverse
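The Poisson assumption on incoming publications can be simulated with exponentially distributed gaps between arrivals. The sketch below is illustrative only: the function name `arrival_times` and the millisecond unit are assumptions, not taken from the thesis.

```python
import random

def arrival_times(n, mean_delay_ms, seed=0):
    """Return n increasing arrival timestamps (ms) of a Poisson process.

    A Poisson process with rate 1/mean_delay_ms has exponentially
    distributed inter-arrival gaps with mean mean_delay_ms.
    """
    rng = random.Random(seed)
    t, times = 0.0, []
    for _ in range(n):
        t += rng.expovariate(1.0 / mean_delay_ms)  # gap to next publication
        times.append(t)
    return times

# Hypothetical usage: 10 publications with a 5000 ms mean delay
timestamps = arrival_times(10, 5000.0)
```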
8
Design & Architecture
[Figure: architecture diagram. Components shown: Publication Stream; Personalized Subscriptions; Subscription Store; Subscription Indexing; Relevance Matching; Matching Publication Store (publications with relevance scores); Publication Indexing; continuous Top-k Diversity (relevancy, dissimilarity) over a sliding window; Expire Publication Store; Top-k Notification Store; Event Delivery of notifications.]
9
Related work: General Top-k publish/subscribe
Compared on: subscription, timing policy, diversity/scoring metric, subscription indexing method, incremental publication indexing, architecture
• PrefSIENA (Drosou, ACM DEBS 2009): preferential subscription; sliding window; relevancy + MAXMIN diversity; subscription covering; centralized message-brokers
• RRPS (Lu, ICCSA 2009): normal; continuous; QoS; centralized
• DaZaLaPs (Pripužić, IS 2012): normal; sliding window; relevancy; grid based; P2P
• Top-k pub/sub (Shraer [Google], VLDB 2014): normal; continuous; relevancy + freshness; tree based; TAAT & DAAT; centralized
• Our model: personalized subscription space; sliding window; MAXDIVREL diversity; inverted-list based; hashing based; cloud based
10
Sliding window Top-k computation
[Figure: a matching publication stream p1, p2, ..., p10, ... shown at three window positions, with publications marked as expired, active or Top-k. The Top-2 results shown are (p5, p1), (p5, p6) and (p5, p9). Jumping step (h): h = 1, h = 3.]
Top-k notifications delivery
• On-demand
• Pro-active
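A minimal sketch of count-based sliding-window Top-k with a jumping step h. The window size `w`, the example scores and the function name `sliding_topk` are assumptions made for illustration; the slide does not specify them.

```python
import heapq

def sliding_topk(stream, w, h, k):
    """Return [(window_start, [top-k ids])] for a count-based sliding window.

    stream: list of (publication_id, score) in arrival order.
    w: window size, h: jumping step, k: results per window.
    """
    results = []
    for start in range(0, max(len(stream) - w, 0) + 1, h):
        window = stream[start:start + w]            # active publications
        top = heapq.nlargest(k, window, key=lambda p: p[1])
        results.append((start, [pid for pid, _ in top]))
    return results

# Hypothetical scores for p1..p10
stream = [("p1", 0.6), ("p2", 0.2), ("p3", 0.1), ("p4", 0.3), ("p5", 0.9),
          ("p6", 0.7), ("p7", 0.4), ("p8", 0.5), ("p9", 0.8), ("p10", 0.1)]
top2_per_window = sliding_topk(stream, w=5, h=1, k=2)
```

A larger h re-evaluates less often: with h = 3 the same stream yields two windows instead of six, at the cost of coarser updates.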
11
Relevancy: Personalized Subscription space
[Figure: a subscriber's personalized subscription space. Subscription nodes carry preference weights: Carrier = AT&T (0.4), Brand = HTC (0.3), Storage (0.7), Carrier = Verizon (0.5), Storage 32GB (0.2), Storage 32GB (0.6), Brand = HTC (0.3). Scores shown: 1.75, 1.3, 2.3, 2.52.]
12
Relevancy: Personalized Subscription space
[Figure: the subscription space (Carrier = Verizon, Storage 32GB, Carrier = AT&T, Storage, Brand = HTC; values shown: 2, 2.5, 1.75, 1.3, 2.3) alongside an incoming item described by Carrier = Verizon, Color = White, OS = Android, Storage = 16GB, Brand = HTC.]
13
Subscription Indexing: Modified opIndex
• Based on inverted lists (posting lists)
• Two-level partitioning
  - Attribute posting list
  - Operator posting list
• Locates satisfying subscription tuples
• Relevancy score
  - By satisfying relations
  - By satisfying subscription tuples
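The two-level partitioning can be pictured as a nested inverted list: attribute first, then operator, then a posting list of subscription predicates. The sketch below is a simplified illustration, not the opIndex algorithm or the thesis's modification of it. The class name, the supported operators and the "sum of satisfied preference weights" score are all assumptions.

```python
from collections import defaultdict
import operator

OPS = {"=": operator.eq, "<=": operator.le, ">=": operator.ge}

class SubscriptionIndex:
    def __init__(self):
        # attribute -> operator -> posting list of (sub_id, value, weight)
        self.index = defaultdict(lambda: defaultdict(list))

    def add(self, sub_id, predicates):
        """predicates: iterable of (attribute, operator, value, weight)."""
        for attr, op, value, weight in predicates:
            self.index[attr][op].append((sub_id, value, weight))

    def match(self, publication):
        """Return {sub_id: summed weight of predicates the publication satisfies}.

        Partial matches are kept: a subscription scores on every predicate
        it satisfies, even if others fail.
        """
        scores = defaultdict(float)
        for attr, pub_value in publication.items():
            for op, postings in self.index.get(attr, {}).items():
                for sub_id, value, weight in postings:
                    if OPS[op](pub_value, value):
                        scores[sub_id] += weight
        return dict(scores)

# Hypothetical subscriptions and publication
idx = SubscriptionIndex()
idx.add("s1", [("carrier", "=", "Verizon", 0.5), ("storage_gb", ">=", 32, 0.2)])
idx.add("s2", [("brand", "=", "HTC", 0.3), ("storage_gb", ">=", 16, 0.7)])
scores = idx.match({"carrier": "Verizon", "brand": "HTC", "storage_gb": 16})
```

Only the posting lists for attributes present in the publication are scanned, which is the point of partitioning by attribute before operator.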
14
Freshness
• When the window becomes larger, older publications may prevent newer publications from entering the Top-k results
• Lease relevancy scores? But then the scores have to be re-calculated
• Forward decaying!
• Fresh-relevancy score = relevancy score × freshness score
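One common formulation of forward decay weights an item with timestamp t_i by g(t_i - L), measured forward from a fixed landmark L, and normalizes by g(t - L) at query time t. Because the normalizer is the same for every item, the relative order of scores never changes after arrival, so nothing has to be re-calculated. The sketch below assumes an exponential g and a hypothetical decay rate `lam`; the thesis may use a different decay function.

```python
import math

def forward_weight(t_i, landmark, lam=0.001):
    """Un-normalized forward-decay weight g(t_i - L), with g(n) = exp(lam * n)."""
    return math.exp(lam * (t_i - landmark))

def fresh_relevancy(relevancy, t_i, landmark, lam=0.001):
    """Static score fixed at arrival: relevancy x forward-decay weight."""
    return relevancy * forward_weight(t_i, landmark, lam)

def freshness(t_i, now, landmark, lam=0.001):
    """Normalized freshness in (0, 1]: g(t_i - L) / g(now - L)."""
    return forward_weight(t_i, landmark, lam) / forward_weight(now, landmark, lam)

# Hypothetical example: a newer, slightly less relevant publication
# overtakes an older one without any score being recomputed.
old_score = fresh_relevancy(0.8, t_i=0, landmark=0)
new_score = fresh_relevancy(0.7, t_i=500, landmark=0)
```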
15
Diversity: Top-k representative set
A method to retrieve Top-k publications from the matching publications
[Figure: representative Top-k. One panel shows the drawback (without diversity); the other shows what we want (with diversity).]
16
MAX* k-diversity problem
Find:
$$S^* = \operatorname*{argmax}_{S \subseteq P,\ |S| = k} f(S, d)$$
where
1. $P = \{p_1, \dots, p_n\}$
2. $k \le n$
3. $d$: a distance metric
4. $f$: a diversity function
17
Proposed: MAXDIVREL k-diversity problem
$$S^* = \operatorname*{argmax}_{S \subseteq P} f(S, d, r) = \operatorname*{argmax}_{S \subseteq P} \frac{g(S, d, r)}{h(S, d, r)}$$
• $g(S, d, r)$: maximize the dis-similarity & relevancy in $S$
• $h(S, d, r)$: minimize the dis-similarity & relevancy in $P - S$
where
1. $P = \{p_1, \dots, p_n\}$
2. $d$: a distance metric
3. $r$: a relevance metric
4. $f$: a diversity function
18
Formal Definition: MAXDIVREL k-diversity
[Equation: g(S, d, r) is an argmax of a sum, normalized by 1/|S|, over pairs p_i, p_j ∈ S with i ≠ j, of a ratio combining r(p_i), r(p_j) and d(p_i, p_j), where independence holds.]
[Equation: h(S, d, r) is an argmin of a sum, normalized by 1/|P − S|, over p_i ∈ S and p_j ∈ P − S, of a ratio combining r(p_i), r(p_j) and d(p_i, p_j), where dominance holds.]
where
1. P = {p1, …, pn}
2. d: a distance metric
3. r: a relevance metric
Independence condition: no two publications in S are neighbors of each other
Dominance condition: every publication in P − S has at least one neighbor in S
19
NP-Hardness: Minimum independent-dominating set
$$\mathrm{Neighborhood}(p_i) = \{\, p_j \mid d(p_i, p_j) \le \alpha,\ p_i \ne p_j \,\}$$
[Figure: publication space with publications p1 to p5 and radius α, mapped to a graph model with vertices v1 to v5. Four example vertex subsets are shown: three labeled "independent, dominating" and one labeled "dominating, not independent".]
20
NAÏVE Greedy
[Equation: greedy selection rule, an argmax over publications p_i that combines r(p_i) with a sum over neighbors p_j ∈ N(p_i) of r(p_j) × d(p_i, p_j).]
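The greedy idea can be sketched as follows. This is a simplified stand-in, not the slide's scoring rule: it ranks candidates by relevancy alone and treats two publications as neighbors when their distance is at most a threshold `alpha`. All names are illustrative assumptions.

```python
def greedy_representatives(pubs, relevancy, dist, alpha, k=None):
    """Scan publications by descending relevancy and keep each one that is
    not within distance alpha of an already selected publication."""
    selected = []
    for p in sorted(pubs, key=lambda q: relevancy[q], reverse=True):
        if all(dist(p, s) > alpha for s in selected):
            selected.append(p)
    return selected if k is None else selected[:k]

def is_independent(S, dist, alpha):
    """No two selected publications are neighbors."""
    return all(dist(a, b) > alpha for i, a in enumerate(S) for b in S[i + 1:])

def is_dominating(S, pubs, dist, alpha):
    """Every unselected publication has a selected neighbor."""
    return all(p in S or any(dist(p, s) <= alpha for s in S) for p in pubs)

# Hypothetical 1-D publication space
pos = {"p1": 0.0, "p2": 0.1, "p3": 0.5, "p4": 0.55, "p5": 1.0}
rel = {"p1": 0.9, "p2": 0.4, "p3": 0.3, "p4": 0.8, "p5": 0.6}
d = lambda a, b: abs(pos[a] - pos[b])
reps = greedy_representatives(list(pos), rel, d, alpha=0.2)
```

Without a cap k the result is a maximal independent set, which is therefore also dominating. Truncating to k keeps independence but may lose dominance. Each call recomputes neighborhoods over all publications, the cost that an incremental index is meant to avoid.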
21
Handling streaming publications
[Figure: publication space p1 to p5 with radius α and its graph model v1 to v5; a new publication p6 arrives and adds vertex v6 to the graph.]
Continuity Requirements
1. Durability: an item selected as diversified in one window may still be selected in the next window if it has not expired and the other valid items in that window fail to compete with it.
2. Order: the publication stream follows chronological order. We avoid selecting an item j as diverse later when we have already selected an item i that is not older than j.
22
MAXDIVREL continuous k-diversity
[Figure: matching publication stream p1, p2, p3, p4, ... with the ith window and the (i+1)th window; MAXDIVREL k-diversity is applied to each window to produce that window's diversified result set.]
Requirements: Independence, Dominance, Durability, Order
Straightforward solution: apply the naïve greedy method at each instance
Proposal: an incremental index mechanism that avoids the curse of re-calculating neighborhoods
23
Locality Sensitive Hashing (LSH)
Simple idea: if two points are close together, then after a "projection" operation these two points will remain close together
24
LSH Analysis
For any given points
• Hash function h is ( , ) sensitive
• Ideally we need
  - to be large
  - to be small
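For reference, the textbook definition of a sensitive hash family is given below. The symbols d_1, d_2, p_1, p_2 follow the standard convention and are an assumption here; the slide's own symbols did not survive extraction and may differ.

```latex
\mathcal{H} \text{ is } (d_1, d_2, p_1, p_2)\text{-sensitive if, for all points } x, y:
\begin{aligned}
d(x, y) \le d_1 &\;\Rightarrow\; \Pr_{h \in \mathcal{H}}[h(x) = h(y)] \ge p_1,\\
d(x, y) \ge d_2 &\;\Rightarrow\; \Pr_{h \in \mathcal{H}}[h(x) = h(y)] \le p_2,
\end{aligned}
\qquad d_1 < d_2,\quad p_1 > p_2 .
```

In this convention a useful family has p_1 close to 1 and p_2 close to 0, so that near points collide often and far points rarely.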
25
LSH in MAXDIVREL: Publications as categorical data
26
LSH in MAXDIVREL: Characteristic Matrix
27
LSH in MAXDIVREL: Minhashing
No publications any more! A signature represents each publication.
Technique
• Randomly permute the rows of the characteristic matrix m times
• For each permutation and each publication's column, take the number of the first row, in the permuted order, in which the column has a 1
[Figure: first permutation of rows of the characteristic matrix]
Advantage: reduces the dimensions to a small minhash signature
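The permutation technique can be sketched directly. The matrix layout (rows are attribute values, columns are publications) and the function name are assumptions, and the sketch assumes every column contains at least one 1.

```python
import random

def minhash_signatures(char_matrix, m, seed=0):
    """char_matrix[row][col] is 1 if publication `col` has attribute value `row`.

    Returns an m x n_cols signature matrix: entry [i][col] is the position,
    under the i-th random row permutation, of the first row holding a 1.
    """
    rng = random.Random(seed)
    n_rows, n_cols = len(char_matrix), len(char_matrix[0])
    signature = []
    for _ in range(m):
        order = list(range(n_rows))
        rng.shuffle(order)                      # one random permutation of rows
        signature.append([
            next(pos for pos, row in enumerate(order)
                 if char_matrix[row][col] == 1)
            for col in range(n_cols)
        ])
    return signature

# Hypothetical matrix: columns 0 and 1 are identical, column 2 is disjoint
matrix = [[1, 1, 0],
          [0, 0, 1],
          [1, 1, 0],
          [0, 0, 1],
          [1, 1, 0]]
sig = minhash_signatures(matrix, m=10)
```

For any one permutation, two columns receive the same minhash value with probability equal to their Jaccard similarity, which is why the short signature can stand in for the full column.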
28
LSH in MAXDIVREL: Signature Matrix
Fast minhashing
• Select m random hash functions to model the effect of m random permutations
• Mathematically proven only when the number of rows is a prime
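The prime condition has a simple reading: when the number of rows n is prime and a is not a multiple of n, the map x -> (a*x + b) mod n is a bijection on the row numbers 0..n-1, so it behaves as a genuine permutation. The sketch below uses that linear hash family, which is an assumption about the kind of hash function intended; such functions cover only a small subset of all permutations, so this approximates true random permutations.

```python
import random

def is_prime(n):
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))

def fast_minhash(columns, n_rows, m, seed=0):
    """columns: one set of row indices (the rows holding a 1) per publication.

    Replaces m explicit permutations by m hash functions
    h(x) = (a*x + b) mod n_rows. Returns an m x len(columns) matrix.
    """
    if not is_prime(n_rows):
        raise ValueError("n_rows must be prime for h to permute the rows")
    rng = random.Random(seed)
    coeffs = [(rng.randrange(1, n_rows), rng.randrange(n_rows)) for _ in range(m)]
    return [[min((a * x + b) % n_rows for x in col) for col in columns]
            for a, b in coeffs]

# Hypothetical columns over 5 rows (5 is prime)
cols = [{0, 2, 4}, {0, 2, 4}, {1, 3}]
fast_sig = fast_minhash(cols, n_rows=5, m=8)
```

If the real row count is not prime, one option is to pad the characteristic matrix with all-zero rows up to the next prime.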
29
LSH in MAXDIVREL: LSH Buckets
• Take r-sized signature vectors from the m-sized minhash signature
• Map them into L hash tables, each with an arbitrary number b of buckets
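One way to realize this step is banding: cut each m-value signature into L = m // r consecutive vectors of r values and hash each vector into its own table. The consecutive split and the use of Python's built-in `hash` are assumptions; the slide does not say how the r-sized vectors are chosen.

```python
from collections import defaultdict

def lsh_buckets(signature, r, b):
    """Return one bucket id in 0..b-1 per hash table for one publication.

    signature: list of m minhash values, split into L = m // r bands.
    """
    L = len(signature) // r
    return [hash(tuple(signature[i * r:(i + 1) * r])) % b for i in range(L)]

def build_tables(signatures, r, b):
    """signatures: {pub_id: [m minhash values]}.

    Returns a list of L dicts, each mapping bucket id -> [pub_id, ...].
    """
    m = len(next(iter(signatures.values())))
    tables = [defaultdict(list) for _ in range(m // r)]
    for pub_id, sig in signatures.items():
        for t, bucket in enumerate(lsh_buckets(sig, r, b)):
            tables[t][bucket].append(pub_id)
    return tables

# Hypothetical signatures with m = 6, r = 3, so L = 2 tables
sigs = {"x": [1, 2, 3, 4, 5, 6], "y": [1, 2, 3, 9, 9, 9], "z": [7, 7, 7, 8, 8, 8]}
tables = build_tables(sigs, r=3, b=1000)
```

Publications x and y agree on the first band, so they land in the same bucket of the first table even though their second bands differ.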
31
LSH in MAXDIVREL: Analysis
For two vectors x, y
For publications x & y
• At a particular hash table
  - x & y map into the same bucket:
  - x & y do not map into the same bucket:
• At L hash tables
  - x & y do not map into the same bucket in any table:
  - x & y map into the same bucket in at least one table: 1 − (the probability above)
True near neighbors are unlikely to be unlucky in all the projections
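Under the standard minhash analysis, and assuming each table hashes a vector of r independent minhash values, these probabilities have closed forms. Let s be the Jaccard similarity of x and y, which equals the probability that a single minhash agrees. Then x and y share a bucket in one table with probability s^r, miss each other in one table with probability 1 - s^r, miss in all L tables with probability (1 - s^r)^L, and meet in at least one table with probability 1 - (1 - s^r)^L. The symbol s and the function names are assumptions for illustration, and bucket collisions between unequal vectors are ignored.

```python
def collide_one_table(s, r):
    """P(x, y share a bucket in one table) = s^r."""
    return s ** r

def collide_any_table(s, r, L):
    """P(x, y share a bucket in at least one of L tables) = 1 - (1 - s^r)^L."""
    return 1.0 - (1.0 - s ** r) ** L

# Hypothetical parameters: r = 5 values per vector, L = 20 tables
near = collide_any_table(0.8, r=5, L=20)   # similar pair
far = collide_any_table(0.3, r=5, L=20)    # dissimilar pair
```

Raising r makes each table stricter (fewer false positives), while raising L gives near neighbors more chances to meet (fewer false negatives).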
32
LSH in MAXDIVREL: Batch-wise Top-k computation
• Bucket "winner": the publication with the highest relevancy score in its bucket
• The winner is dominant and represents its bucket neighborhood
• Top-k: the k "winners" that have a majority of votes
• The k winners are independent
[Figure: publications labeled A to H in the ith window]
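A minimal sketch of batch-wise voting: every non-empty bucket in every hash table votes for its most relevant member, and the k publications with the most votes are returned. The tie-break by relevancy and all names are assumptions.

```python
from collections import Counter

def topk_by_votes(tables, relevancy, k):
    """tables: list of {bucket_id: [pub_id, ...]}, one dict per hash table."""
    votes = Counter()
    for table in tables:
        for members in table.values():
            if members:
                winner = max(members, key=lambda p: relevancy[p])
                votes[winner] += 1          # one vote per bucket
    # Rank by votes, then by relevancy (tie-break is an assumption)
    ranked = sorted(votes, key=lambda p: (votes[p], relevancy[p]), reverse=True)
    return ranked[:k]

# Hypothetical buckets over three hash tables
example_tables = [
    {0: ["A", "B"], 1: ["C"]},
    {0: ["A"], 1: ["B", "C"]},
    {0: ["A", "B", "C"]},
]
example_relevancy = {"A": 0.9, "B": 0.5, "C": 0.7}
top2 = topk_by_votes(example_tables, example_relevancy, k=2)
```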
33
LSH in MAXDIVREL: Incremental Top-k computation
[Figure: processing pipeline for a new publication: characteristic vector → Characteristic Matrix → generate the minhash signature → Signature Matrix → map the signature into L hash tables → update the "winner" at each bucket the signature maps into → vote for the Top-k]
34
LSH in MAXDIVREL: When new publication F arrives…
• Only buckets will vote
• Follow continuity requirements
  - Durability
  - Order
[Figure: publications labeled A to H across the ith window and the (i+1)th window]
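The incremental idea can be sketched as follows: when a publication arrives, only the buckets it maps into are touched, and votes move only where the bucket winner changes. This is a simplified illustration. Expiry of old publications and the durability and order requirements are deliberately left out, and all names are assumptions.

```python
from collections import Counter

class IncrementalVotes:
    def __init__(self, n_tables):
        # Per hash table: bucket id -> (relevancy, pub_id) of current winner
        self.winner = [dict() for _ in range(n_tables)]
        self.votes = Counter()

    def add(self, pub_id, relevancy, buckets):
        """buckets[t] is the bucket of pub_id in hash table t."""
        for t, bucket in enumerate(buckets):
            current = self.winner[t].get(bucket)
            if current is None or relevancy > current[0]:
                if current is not None:
                    self.votes[current[1]] -= 1   # old winner loses this vote
                self.winner[t][bucket] = (relevancy, pub_id)
                self.votes[pub_id] += 1

    def topk(self, k):
        return [p for p, v in self.votes.most_common(k) if v > 0]

# Hypothetical arrivals with two hash tables
iv = IncrementalVotes(n_tables=2)
iv.add("A", 0.5, [0, 1])
iv.add("B", 0.9, [0, 2])
iv.add("C", 0.7, [0, 1])
```

Each arrival costs one lookup per hash table, independent of how many publications are already in the window.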
35
Implementation
36
Cloud service modules
[Figures: cloud service diagrams. Sources: Amazon Kinesis, Amazon ElastiCache]
37
Evaluation: Dataset
Publication stream
• Amazon online marketplace data, as available on 17th to 19th November 2014
Zipfian subscriptions
• Normalized preferences
• N - number of elements in distribution
• k - rank of element
• s - value of exponent
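With N, k and s as defined here, the standard Zipf probability mass function is f(k; s, N) = (1 / k^s) / sum_{n=1}^{N} (1 / n^s). The sketch below computes it and draws ranks from it. The function names and parameter values are illustrative assumptions and do not reproduce the thesis's subscription generator.

```python
import random

def zipf_pmf(k, s, N):
    """Probability of rank k under Zipf's law with exponent s over N elements."""
    norm = sum(1.0 / n ** s for n in range(1, N + 1))
    return (1.0 / k ** s) / norm

def sample_ranks(s, N, size, seed=0):
    """Draw `size` ranks in 1..N with Zipfian probabilities."""
    rng = random.Random(seed)
    weights = [zipf_pmf(k, s, N) for k in range(1, N + 1)]
    return rng.choices(range(1, N + 1), weights=weights, k=size)

# Hypothetical usage: preferences over 50 attribute values, exponent 1.0
ranks = sample_ranks(s=1.0, N=50, size=1000)
```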
38
Evaluation: Methodology
Subscriber Effectiveness
• Quality
• Accuracy
• Resiliency
• Freshness
Performance & Efficiency
• Index construction time
• Top-k matching time
Platform
• Amazon AWS Linux-based micro-node instances, each with 2.3 GHz, 8GB memory
• Algorithms are implemented in Java
39
Subscriber Effectiveness: Quality or natural behavior
Testing the Zipf or power-law hypothesis on the distribution of ranked results (KS test)
i. Fitting power law
ii. Goodness of fit tests
iii. Alternative distributions
Compute 19030 ranked distributions over a 100K publication stream
• Under different subscriber views
• Under different sized sliding window instances
[Figure: sample distribution of ranked votes; axes: log zipf_prob(rank) vs. log(rank)]
N - number of elements in distribution,
k - rank of element
s - value of exponent
40
Subscriber Effectiveness: i. Fitting power law
[Figure: Zipf exponent values]
41
Subscriber Effectiveness: i. Fitting power law
[Figure: illustration of Zipf exponent values convergence]
42
Subscriber Effectiveness: i. Fitting power law
[Figure: Zipf exponent values under different similarity thresholds]
43
Subscriber Effectiveness: ii. Goodness of fit tests
[Figure: p-values of KS test under different subscriber views]
44
Subscriber Effectiveness: iii. Testing alternative distributions
45
Subscriber Effectiveness: Other diversity based methods
• P-dispersion problem: MAXMIN, MAXSUM
• Minimum independent-dominating set problem: MAXDIVREL, DisC
For an even comparison, relevancy is combined into every diversity method to achieve a bi-criteria objective
[Figure: average Zipf law exponent in comparison with other methods]
46
Subscriber Effectiveness: Other diversity based methods
• P-dispersion problem: MAXMIN, MAXSUM
• Minimum independent-dominating set problem: MAXDIVREL, DisC
[Figure: a comparison of average Zipf law exponent with other methods]
47
Subscriber Effectiveness: Accuracy of Top-k results
LSH Index vs. NAÏVE
• Rank probability
• Diversity probability
[Figures: accuracy at similarity threshold = 0.55; accuracy at similarity threshold = 0.85]
48
Subscriber Effectiveness: Resiliency of Top-k results
[Figures: getting Top-k publications (unordered); getting Top-k publications (ordered)]
49
Performance: Subscription index update time
[Figures: index construction time on opIndex vs. modified opIndex; opIndex vs. modified opIndex]
50
Efficiency: Initial matching time at modified opIndex
[Figures: initial matching time under different sizes of subscription spaces; initial matching time under different sizes of publications]
51
Performance & Efficiency: LSH Index
[Figure: BLSH index construction + update time for different numbers of minhash functions]
Number of minhash functions (m) =
How much accuracy do we sacrifice by comparing small minhash signatures?
52
Performance & Efficiency: ILSH vs. BLSH vs. NAÏVE
[Figure: publication stream p1, p2, ..., p8, ... with four window instances each labeled "BLSH or NAÏVE" and one labeled "ILSH"]
53
Performance & Efficiency: BLSH vs. NAÏVE
[Figures: log(ranking time) vs. number of publications with D = 250; log(ranking time) vs. number of publications with D = 500]
54
Performance & Efficiency: ILSH vs. BLSH vs. NAÏVE
[Figures: log(ranking time) vs. number of publications with D = 250; log(ranking time) vs. number of publications with D = 500]
55
Conclusions
• Diversified results produced by MAXDIVREL, which is based on the independent-dominating set problem
  - Exhibit strong natural behavior compared with methods based on the p-dispersion problem
• Relevancy is an important factor to employ in distance based diversity methods
  - It always tends to produce a diverse set of personalized results
• Absolute ranks are sensitive to the preference value, while the deviation among relative ranks stays small
56
Conclusions (Ctd.)
• Locality Sensitive Hashing (LSH) indexing method
  - Produces the MAXDIVREL diverse set of results at 70% average accuracy relative to the naïve method
  - Reduces the matching time very significantly compared with the NAÏVE method
• Further refined by its incremental version
  - For handling streaming publications
  - Avoids the curse of re-computing neighborhoods
• No such k is needed to restrict the delivery of Top publications
  - Given a window size & delivery method, the model can produce the best diverse set of personalized results to represent the set of all matching publications at a given instance
57
Major Contributions
• Dynamic diversification method based on the independent-dominating set problem
  - We introduced a novel diversity definition based on representative neighborhoods, called MAXDIVREL k-diversity, employing relevancy.
• Index based diversification approach to rank results incrementally
  - We proposed a novel hashing based index approach to solve the MAXDIVREL continuous k-diversity problem, based on the Locality Sensitive Hashing (LSH) technique.
• Advanced evaluation method to measure the quality of diverse results
  - First significant attempt to model the natural behavior of diversity methods in the pub/sub community
58
Future work
• Explore other suitable use-cases to apply the proposed model & develop prototype applications, e.g.
  - Personalized newspaper for every Facebook user
  - Diverse set of personalized Twitter trends
  - Social annotation of news stories
• Exploit overlap among diversified results of users who have similar interests
• Employ existing implicit methods to extract human preferences, e.g. click stream analytics
• Develop the LSH based index over a multi-threaded distributed environment
60
Appendix: Freshness
Mean delay between publications = 5000 ms
[Figure: a comparison between relevancy scores after being influenced by freshness]
61
Appendix: NAÏVE Ranking time
[Figure: average naïve Top-k matching time in comparison with size D of publications]
62
Appendix: BLSH Ranking time
[Figure: average BLSH Top-k matching time in comparison with size D of publications]