distributed top-k query processing · 7/18/13 6...

7/18/13


Distributed Data Management Summer Semester 2013

TU Kaiserslautern

Dr.-Ing. Sebastian Michel

[email protected]‐saarland.de

Distributed Data Management, SoSe 2013, S. Michel 1

(DISTRIBUTED) TOP-K QUERY PROCESSING

Lecture 11


Top-k Rankings



Rankings and Top-k Queries

•  With humans in the loop, it is essential to present large amounts of information in ranked form.

•  Ranking focuses attention on the essence of the data: the top-ranked pieces according to some criteria.

•  E.g., query results in Web search

•  Goal: compute the top-k results efficiently, without computing and sorting the full result.


Overview of Today's Lecture

•  Top-k Queries: Foundations/Model

•  Threshold Algorithms
    – With or without random accesses
    – Approximate variant with guarantees

•  Distributed Algorithms
    – Three Phase Uniform Threshold algorithm
    – Optimizations



Computational Model: Index Lists

•  We assume data is stored in so-called index lists, one per attribute.

•  Each list stores, for every item, the item identifier (id) and the item's score with respect to the list's attribute.

•  Inside each list, entries are sorted by score in descending order.

Id    Score
d12   0.83
d81   0.81
d43   0.68
d18   0.62
…     …

Computational Model: Aggregation

•  Given m such lists.

•  The task is to efficiently compute the k items with the highest aggregated score (the top-k result).

•  The aggregated score of an item results from applying an aggregation function to the item's scores across the m lists.
    – E.g., summation


Example Application 1

•  Lists with visual features (color, shape) of objects: red, blue, round, rectangular, …

•  Query: Find the top-2 red and round objects

color=red       shape=round
Id  Score       Id  Score
E   0.8         D   0.8
B   0.6         B   0.75
D   0.3         A   0.6
A   0.25        C   0.25
C   0.19        E   0.05

Result:
Id  Σ Score
B   1.35
D   1.10
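As a baseline for the algorithms that follow, the red/round example can be computed by full aggregation and selection. This is only a sketch of the exhaustive approach that top-k algorithms try to avoid; the function name `naive_top_k` and the list-of-pairs representation are assumptions made for illustration.

```python
from collections import defaultdict
import heapq

def naive_top_k(index_lists, k):
    """Baseline: scan every list completely, sum scores per item, keep the k best."""
    totals = defaultdict(float)
    for entries in index_lists:            # one list per attribute
        for item_id, score in entries:     # exhaustive sequential scan
            totals[item_id] += score       # aggregation function: summation
    # nlargest avoids a full sort when k is small
    return heapq.nlargest(k, totals.items(), key=lambda pair: pair[1])

color_red = [("E", 0.8), ("B", 0.6), ("D", 0.3), ("A", 0.25), ("C", 0.19)]
shape_round = [("D", 0.8), ("B", 0.75), ("A", 0.6), ("C", 0.25), ("E", 0.05)]
top2 = naive_top_k([color_red, shape_round], k=2)
print([(item, round(score, 2)) for item, score in top2])
```

The call returns B and D, in that order, matching the result table. Every entry of every list is touched, which is exactly the cost the threshold algorithms aim to reduce.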

More Example Applications

•  Web search engine
    – One index list for each term
    – Query = {term1, term2, …, termM}
    – Aggregation query finds the best matching documents
    – Scores in each list are computed according to TF*IDF (term frequency * inverse document frequency)

•  Access log mining
    – One list per Web server; score = bytes downloaded
    – Find the IP addresses (clients) that caused the largest aggregated amount of bytes downloaded


Example Network Server Logs

(one list per Web server; bytes in kB)

IP           kB    IP           kB    IP           kB    IP           kB
192.168.1.7  31    192.168.1.8  81    192.168.1.4  53    192.168.1.1  29
192.168.1.3  23    192.168.1.3  33    192.168.1.3  21    192.168.1.4  28
192.168.1.4  12    192.168.1.1  12    192.168.1.1  9     192.168.1.5  12

Computational Model: Access Models

•  Sequential Access: Read the content of an index list top-down

•  Random Access: Look up the score of a specific item inside an index list

Id  Score
D   0.8
B   0.75
A   0.6
C   0.25
E   0.05
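The two access models can be illustrated with a small in-memory index list. This is a minimal sketch: the class name `IndexList` and its methods `next` and `lookup` are hypothetical, and returning 0 for an item that is not in the list is an assumption of this sketch.

```python
class IndexList:
    """One index list supporting both access models (hypothetical helper)."""

    def __init__(self, entries):
        # Entries sorted by score, descending, serve sequential access.
        self._sorted = sorted(entries, key=lambda e: e[1], reverse=True)
        # A hash map from id to score serves random access.
        self._by_id = dict(entries)
        self._pos = 0

    def next(self):
        """Sequential access: return the next (id, score) top-down, or None."""
        if self._pos >= len(self._sorted):
            return None
        entry = self._sorted[self._pos]
        self._pos += 1
        return entry

    def lookup(self, item_id):
        """Random access: score of item_id, 0 if the item is not in the list."""
        return self._by_id.get(item_id, 0)

shape_round = IndexList([("D", 0.8), ("B", 0.75), ("A", 0.6), ("C", 0.25), ("E", 0.05)])
first = shape_round.next()          # sequential access starts at the top
score_c = shape_round.lookup("C")   # random access jumps to a specific item
```

Sequential access only ever moves the scan position downwards, whereas random access can reach any entry at the price of a lookup.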


Essence of Top-k Algorithms

•  Compute top-k results without exhaustive access to the index lists.

•  Family of threshold algorithms: compute an early termination point using a score threshold.

•  Require monotonicity of the aggregation function:
    – aggr is monotone if for any two items a and b

      $\left(\forall i:\ \mathrm{score}(i,a) \le \mathrm{score}(i,b)\right) \Rightarrow \mathrm{aggr}(a) \le \mathrm{aggr}(b)$

      where $\mathrm{score}(i,\cdot)$ denotes the score of an item in index list i

Monotonicity Explained

•  Meaning: if item a is below item b in all considered index lists (that is, its score is smaller in each list), its final score cannot be higher than that of b.

•  Assume we have already read from the lists and have the following situation:

[Figure: three index lists with a scan line marking "read until here" and an item marked with a red dot]

•  All items already seen in all three lists have an aggregated score higher than the red-dot item.

History

•  Family of threshold algorithms

•  First by
    – Fagin, 1999
    – Nepal/Ramakrishna, 1999
    – Güntzer/Balke/Kießling, 2001

•  Various versions exist


Fagin's Algorithm (FA)

1.  Read sequentially from each list (round robin) until k distinct items have each been seen in all lists.

Example of a top-1 query:

List 1            List 2            List 3
document score    document score    document score
doc3     17       doc1     9        doc1     19
doc4     12       doc3     7        doc4     15
doc2     11       doc2     2        doc3     12
doc5     4        doc6     1        doc5     5
doc6     2        doc7     1        doc2     2

Fagin's Algorithm

•  Seen in all lists: doc3 with score 17+7+12 = 36
•  Partially seen: doc4 (12+15 = 27), doc1 (9+19 = 28), doc2 (11+2 = 13)

•  Can we stop here already?



Fagin's Algorithm

2.  Look up the missing scores of the partially seen items in the tail of the lists (item not in a list => score 0)
•  Then doc4 (12+0+15 = 27), doc1 (0+9+19 = 28), doc2 (11+2+2 = 15)

•  Done! Why?



Correctness

•  Due to monotonicity: items not seen so far are below doc3 in all lists; hence, their aggregated score cannot be higher.

•  See Ronald Fagin et al.: Optimal aggregation algorithms for middleware. J. Comput. Syst. Sci. 66(4): 614-656 (2003) for an overview of threshold algorithms (FA and the ones presented next).

•  Further reading there: instance optimality
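The two phases of FA can be sketched in a few lines. This is an illustrative sketch rather than a reference implementation: it assumes summation as aggregation function, in-memory lists of (document, score) pairs sorted by descending score, and a score of 0 for documents missing from a list; the function name `fagin_algorithm` is made up for this example.

```python
def fagin_algorithm(lists, k):
    """Sketch of FA: sorted access until k items are seen everywhere, then lookups."""
    m = len(lists)
    lookup = [dict(entries) for entries in lists]    # random-access structure
    seen_in = {}                                     # item -> set of list indices
    depth = 0
    max_depth = max(len(entries) for entries in lists)
    # Phase 1: round-robin sequential reads until k items were seen in all m lists.
    while depth < max_depth:
        for i, entries in enumerate(lists):
            if depth < len(entries):
                seen_in.setdefault(entries[depth][0], set()).add(i)
        depth += 1
        fully_seen = sum(1 for s in seen_in.values() if len(s) == m)
        if fully_seen >= k:
            break
    # Phase 2: random accesses fill in the missing scores (absent item -> 0).
    totals = {d: sum(lk.get(d, 0) for lk in lookup) for d in seen_in}
    ranked = sorted(totals.items(), key=lambda p: p[1], reverse=True)
    return ranked[:k], depth

fa_lists = [
    [("doc3", 17), ("doc4", 12), ("doc2", 11), ("doc5", 4), ("doc6", 2)],
    [("doc1", 9), ("doc3", 7), ("doc2", 2), ("doc6", 1), ("doc7", 1)],
    [("doc1", 19), ("doc4", 15), ("doc3", 12), ("doc5", 5), ("doc2", 2)],
]
result, scan_depth = fagin_algorithm(fa_lists, k=1)
```

On the example lists the scan stops at depth 3, where doc3 has been seen in all three lists, and the lookups confirm doc3 with score 36 as the top-1 result.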


Threshold Algorithm (TA)

•  Read from the index lists in sequential order
•  Immediately look up missing scores (random access)
•  Stop once at least k objects have been seen whose aggregated score is at least as high as the aggregated score at the "scan line" (=τ)


List 1            List 2            List 3
document score    document score    document score
doc3     18       doc1     9        doc1     19
doc4     12       doc3     7        doc4     15
doc2     11       doc2     2        doc3     12
doc5     4        doc6     1        doc5     5
doc6     2        doc7     1        doc2     2

Step 1

•  Start sequential scanning. See doc3 in list 1; look up its scores in lists 2 and 3. Get:

•  doc3: 18+7+12 = 37



Step 2

•  Continue with doc1, seen in list 2.
•  doc1: 0+9+19 = 28
•  doc3: 18+7+12 = 37



Step 3

•  doc3: 18+7+12 = 37
•  doc1: 0+9+19 = 28
•  Scan line scores: 18+9+19 = 46
•  We cannot stop now, why?



Step 4

•  doc3: 18+7+12 = 37
•  doc1: 0+9+19 = 28; doc4: 12+0+15 = 27
•  Scan line scores: 12+9+19 = 40
•  We cannot stop now, why?



Step 5

•  doc3: 18+7+12 = 37; doc1: 0+9+19 = 28; doc4: 12+0+15 = 27

•  Scan line scores: 12+7+19 = 38
•  Still cannot stop!



Step 6

•  doc3: 18+7+12 = 37; doc1: 0+9+19 = 28; doc4: 12+0+15 = 27

•  Scan line scores: 12+7+15 = 34
•  We can stop! The score of doc3 (37) is at least as high as the scan line score (34).
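The six steps above can be replayed with a short sketch of TA. It assumes summation as aggregation function, reads one entry at a time in round-robin order (as in the walkthrough), and treats a document missing from a list as having score 0; the function name `threshold_algorithm` and the access counter are made up for illustration.

```python
def threshold_algorithm(lists, k):
    """Sketch of TA: sorted access plus immediate random lookups, stop at tau."""
    m = len(lists)
    lookup = [dict(entries) for entries in lists]   # random-access structure
    pos = [0] * m                                   # next unread position per list
    high = [None] * m                               # score at the scan line per list
    totals = {}                                     # item -> full aggregated score
    accesses = 0                                    # number of sequential accesses

    def top_k():
        return sorted(totals.items(), key=lambda p: p[1], reverse=True)[:k]

    while any(pos[i] < len(lists[i]) for i in range(m)):
        for i in range(m):
            if pos[i] >= len(lists[i]):
                high[i] = 0          # exhausted list cannot contribute any more
                continue
            item, score = lists[i][pos[i]]
            pos[i] += 1
            accesses += 1
            high[i] = score
            if item not in totals:   # look up the missing scores immediately
                totals[item] = sum(lk.get(item, 0) for lk in lookup)
            if None in high:
                continue             # tau is undefined until each list was read once
            tau = sum(high)          # aggregated score at the scan line
            best = top_k()
            if len(best) == k and best[-1][1] >= tau:
                return best, accesses
    return top_k(), accesses

ta_lists = [
    [("doc3", 18), ("doc4", 12), ("doc2", 11), ("doc5", 4), ("doc6", 2)],
    [("doc1", 9), ("doc3", 7), ("doc2", 2), ("doc6", 1), ("doc7", 1)],
    [("doc1", 19), ("doc4", 15), ("doc3", 12), ("doc5", 5), ("doc2", 2)],
]
ta_result, ta_accesses = threshold_algorithm(ta_lists, k=1)
```

For k=1 the sketch stops after six sequential accesses with doc3 (score 37), when the scan line score has dropped to 34.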



Restriction of Random Accesses

•  Recall numbers:
    – Disk seek: 10,000,000 ns
    – Read 1 MB sequentially from disk: 30,000,000 ns
    – Similarly: network round trip vs. transfer rate

•  Variations of threshold algorithms consider the tradeoff between random and sequential accesses

•  Or prohibit random accesses altogether
•  => No Random Access (NRA) Algorithm

NRA

•  Keep two scores for each item: the score actually seen so far (=worstscore) and an upper bound on its score (=bestscore)

•  bestscore = worstscore + best possible scores in the lists where the item has not been seen yet (high_i for list i)

•  Stop if no item outside the current top-k has a bestscore better than the worstscore of the item currently at rank k (called mink)


Top-1 (k=1) Query

List 1            List 2            List 3
document score    document score    document score
doc3     18       doc1     9        doc1     19
doc4     12       doc3     7        doc4     15
doc2     11       doc2     2        doc3     12
doc5     4        doc6     1        doc5     5
doc6     2        doc7     1        doc2     2

Call the worstscore of the document at rank k the mink threshold.

Bookkeeping after reading doc3 from list 1:

Id    worstscore  bestscore
doc3  18          -

mink = 18


Bookkeeping after reading doc1 from list 2:

Id    worstscore  bestscore
doc3  18          -
doc1  9           -

mink = 18


Bookkeeping after reading doc1 from list 3:

Id    worstscore  bestscore
doc3  18          46
doc1  28          46

mink = 28


Bookkeeping after reading doc4 from list 1:

Id    worstscore  bestscore
doc3  18          46
doc1  28          40
doc4  12          40

mink = 28


Bookkeeping after reading doc3 from list 2:

Id    worstscore  bestscore
doc3  25          44
doc1  28          40
doc4  12          38

mink = 28


Bookkeeping after reading doc4 from list 3:

Id    worstscore  bestscore
doc3  25          40
doc1  28          40
doc4  27          34

mink = 28


Bookkeeping after reading doc2 from list 1 (high_1 drops to 11, so the bestscore of doc1 becomes 28+11 = 39):

Id    worstscore  bestscore
doc3  25          40
doc1  28          39
doc4  27          34
doc2  11          33

mink = 28


Bookkeeping after reading doc2 from list 2 (high_2 drops to 2, so the bestscore of doc4 becomes 27+2 = 29):

Id    worstscore  bestscore
doc3  25          40
doc1  28          39
doc4  27          29
doc2  13          28

mink = 28


Bookkeeping after reading doc3 from list 3:

Id    worstscore  bestscore
doc3  37          37
doc1  28          39
doc4  27          29
doc2  13          25

mink = 37

We cannot stop yet: the bestscore of doc1 (39) is still higher than mink.


Bookkeeping after reading doc5 from list 1 (high_1 drops to 4):

Id    worstscore  bestscore
doc3  37          37
doc1  28          32
doc4  27          29
doc2  13          25
doc5  4           18

mink = 37

We can stop! No item outside the top-1 has a bestscore higher than mink.

NRA Bookkeeping and Correctness

•  Keep all candidate items in memory that still have a chance to get into the final top-k result

•  I.e., we can throw away (prune) all candidates whose best possible score is worse than the worstscore of the rank-k item

•  Observation: worstscore never decreases, while bestscore never increases (the scores at the scan lines go down, because the lists are sorted in decreasing score order)


NRA Pseudocode

top-k := ∅; candidates := ∅; mink := 0;
scan all lists L_i (i = 1..m) in parallel:
    consider item d at position pos_i in L_i;
    E(d) := E(d) ∪ {i};                            // remember for each item where we saw it
    high_i := s_i(q_i,d);                          // maintain scores at scan line
    worstscore(d) := aggr{s_ν(q_ν,d) | ν∈E(d)};
    bestscore(d) := aggr{aggr{s_ν(q_ν,d) | ν∈E(d)}, aggr{high_ν | ν∉E(d)}};
    if worstscore(d) > mink then                   // put it into top-k set
        remove argmin_d'{worstscore(d') | d'∈top-k} from top-k;
        add d to top-k;
        mink := min{worstscore(d') | d'∈top-k};    // update mink
    else if bestscore(d) > mink then               // else keep it as candidate
        candidates := candidates ∪ {d};
    threshold := max{bestscore(d') | d'∈candidates};
    if threshold ≤ mink then exit;

Observation: pruning is often overly conservative (deep scans, high memory consumption)
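A runnable sketch of NRA under simplifying assumptions: summation as aggregation function, one sequential read at a time in round-robin order, and a full recomputation of all worstscores and bestscores after every read instead of incremental bookkeeping. The function name `nra` and the returned access counter are made up for this example. In addition to the seen candidates, the sketch also bounds items that have not been seen at all by the sum of the current scan line scores.

```python
def nra(lists, k):
    """Sketch of NRA: sequential accesses only, worstscore/bestscore bookkeeping."""
    m = len(lists)
    pos = [0] * m                  # next unread position per list
    high = [None] * m              # score at the scan line; None until first read
    seen = {}                      # item -> {list index: score}
    accesses = 0

    def worst(d):                  # sum of the scores actually seen
        return sum(seen[d].values())

    def best(d):                   # worstscore plus high_i of all unseen lists
        return worst(d) + sum(high[i] for i in range(m) if i not in seen[d])

    while any(pos[i] < len(lists[i]) for i in range(m)):
        for i in range(m):
            if pos[i] >= len(lists[i]):
                high[i] = 0        # exhausted list cannot contribute any more
                continue
            item, score = lists[i][pos[i]]
            pos[i] += 1
            accesses += 1
            high[i] = score
            seen.setdefault(item, {})[i] = score
            if None in high:
                continue           # bestscores undefined until each list was read once
            ranked = sorted(seen, key=worst, reverse=True)
            top, rest = ranked[:k], ranked[k:]
            if len(top) < k:
                continue
            min_k = worst(top[-1])
            # best possible score of any candidate or of a completely unseen item
            threshold = max([best(d) for d in rest] + [sum(high)])
            if threshold <= min_k:
                return [(d, worst(d), best(d)) for d in top], accesses
    ranked = sorted(seen, key=worst, reverse=True)[:k]
    return [(d, worst(d), best(d)) for d in ranked], accesses

nra_lists = [
    [("doc3", 18), ("doc4", 12), ("doc2", 11), ("doc5", 4), ("doc6", 2)],
    [("doc1", 9), ("doc3", 7), ("doc2", 2), ("doc6", 1), ("doc7", 1)],
    [("doc1", 19), ("doc4", 15), ("doc3", 12), ("doc5", 5), ("doc2", 2)],
]
nra_result, nra_accesses = nra(nra_lists, k=1)
```

For k=1 the sketch stops after ten sequential accesses, right after doc5 has been read from list 1, and reports doc3 with worstscore = bestscore = 37.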

Evolution of a Candidate's Score

•  Approximate top-k
    – "What is the probability that d qualifies for the top-k?"

[Figure: score (y-axis) over scan depth (x-axis) showing bestscore_d, worstscore_d, and mink; annotation: drop d from the candidate queue]

Probabilistic Pruning

•  NRA is based on the invariant

$$\sum_{i \in E(d)} s_i(d) \;\le\; s(d) \;\le\; \sum_{i \in E(d)} s_i(d) + \sum_{i \notin E(d)} high_i$$

•  Relaxed into a probabilistic threshold test

$$p(d) := P\Big[\sum_{i \in E(d)} s_i(d) + \sum_{i \notin E(d)} s_i(d) > mink\Big] \le \varepsilon$$

   (the scores s_i(d) with i ∉ E(d) are not yet known and are treated as random variables)

•  Or equivalently, with $\delta(d) := mink - \sum_{i \in E(d)} s_i(d)$,

$$p(d) = P\Big[\sum_{i \notin E(d)} s_i(d) > \delta(d)\Big] \le \varepsilon$$

[Figure: score axis showing worstscore_d, mink, and bestscore_d, with δ(d) marked]

Theobald et al.: Top-k Query Evaluation with Probabilistic Guarantees. VLDB 2004: 648-659


Expected Result Quality

•  Missing relevant items:
•  The probability pmiss of missing a true top-k object equals the probability of erroneously dropping a candidate from consideration

•  For each candidate: pmiss ≤ ε
•  $P[recall = r/k] = P[precision = r/k] = \binom{k}{r} (1 - p_{miss})^{r} \, p_{miss}^{\,k-r}$

•  $E[precision] = E[recall] = \sum_{r=0..k} P[precision = r/k] \cdot r/k = (1 - \varepsilon)$ (for pmiss = ε)

recall = |returned relevant docs| / |all relevant docs|
precision = |returned relevant docs| / |returned docs|
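The binomial model above is easy to check numerically. The following sketch evaluates the distribution of recall and its expectation for a given miss probability; the function names are made up for illustration, and the miss events of the k result slots are assumed to be independent, as in the binomial formula.

```python
from math import comb

def recall_distribution(k, p_miss):
    """P[recall = r/k] for r = 0..k under the binomial model."""
    return [comb(k, r) * (1 - p_miss) ** r * p_miss ** (k - r) for r in range(k + 1)]

def expected_recall(k, p_miss):
    """E[recall] = sum over r of P[recall = r/k] * r/k."""
    dist = recall_distribution(k, p_miss)
    return sum(prob * r / k for r, prob in enumerate(dist))

dist = recall_distribution(k=10, p_miss=0.1)
exp_recall = expected_recall(k=10, p_miss=0.1)
print(round(exp_recall, 6))
```

With k = 10 and pmiss = 0.1 the expectation comes out as 0.9 (up to floating-point rounding), i.e., 1 − pmiss, independent of k.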

Score Estimation

•  Pre-compute a model of the score distribution in each index list

•  Use it at runtime to obtain an expectation of the score.
•  Either by fitting a distribution function or by building histograms (more robust)

Image source: http://en.wikipedia.org/wiki/Cumulative_distribution_function

Recap (?) Histograms

•  A histogram partitions a domain into cells (also called buckets). For each bucket, the number of elements that fall into it is kept.

•  Two basic kinds of histograms:
    – Equi-width: buckets have the same width (= size on the "x-axis")
        •  E.g., 100 buckets on the [0,1] interval
    – Equi-depth: buckets have the same height ("y-axis")
        •  To achieve this, the bucket width is adapted
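Both histogram kinds can be sketched in a few lines. The function names, the closed interval handling (a value equal to the upper end goes into the last bucket), and the rank-based choice of equi-depth boundaries are assumptions of this sketch.

```python
def equi_width(values, buckets, lo=0.0, hi=1.0):
    """Counts per bucket; every bucket spans (hi - lo) / buckets on the x-axis."""
    counts = [0] * buckets
    width = (hi - lo) / buckets
    for v in values:
        idx = min(int((v - lo) / width), buckets - 1)   # v == hi goes into last bucket
        counts[idx] += 1
    return counts

def equi_depth(values, buckets):
    """Bucket boundaries chosen so that each bucket holds roughly the same count."""
    ordered = sorted(values)
    n = len(ordered)
    # boundary j is the value at rank j*n/buckets; the last boundary is the maximum
    return [ordered[min(j * n // buckets, n - 1)] for j in range(buckets + 1)]

scores = [0.05, 0.1, 0.15, 0.2, 0.3, 0.5, 0.8, 0.9]
width_counts = equi_width(scores, buckets=4)
depth_bounds = equi_depth(scores, buckets=4)
```

For these eight scores the equi-width histogram has very uneven counts (most scores are small), whereas the equi-depth boundaries adapt so that every bucket covers two scores.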


Score Estimations

•  Build an equi-width histogram for each index list's score distribution.

[Figure: histogram of score distribution S1 over [0, 1], with high1 marked]

•  We can then look up the probability that a score is larger or smaller than a specific value.

•  This also takes into account that parts of the list have already been read (down to score = high_i)

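Convolution

Given two histograms, compute the histogram that represents the score distribution after summation:

$$(f * h)(l) = \sum_{l = i + j} f(i) \cdot h(j)$$

[Figure: histograms of S1 over [0, 1] (with high1 marked) and S2 over [0, 1] (with high2 marked), and their convolution Convolution(S1, S2) over [0, 2] with δ(d) marked; P[d gets in the final top-k] is read off the convolution at δ(d)]

Illustrations on this slide are based on material from Martin Theobald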
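The convolution formula can be sketched directly on normalized histograms, i.e., lists of cell probabilities. The function names are made up for this example, and two simplifications are assumed: scores are represented by cell indices, and both histograms already describe only the unread parts of their lists.

```python
def convolve(f, h):
    """Discrete convolution: out[l] = sum of f[i] * h[j] over all i + j == l."""
    out = [0.0] * (len(f) + len(h) - 1)
    for i, fi in enumerate(f):
        for j, hj in enumerate(h):
            out[i + j] += fi * hj
    return out

def prob_sum_exceeds(f, h, delta_cell):
    """Probability that the summed score falls into a cell above delta_cell."""
    return sum(convolve(f, h)[delta_cell + 1:])

s1 = [0.5, 0.5]      # P[score in cell 0], P[score in cell 1] for list 1
s2 = [0.25, 0.75]    # same for list 2
summed = convolve(s1, s2)
p_qualifies = prob_sum_exceeds(s1, s2, delta_cell=0)
```

Here the summed distribution is [0.125, 0.5, 0.375], and the probability of exceeding cell 0 is 0.875; a candidate would be dropped if this probability fell below the chosen ε.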

Sample Variations

•  Approximate version of TA: allows stopping earlier, namely once at least k items have been seen with a score larger than or equal to τ/θ (τ is the scan line score; θ > 1)*

•  Combined algorithm (CA): uses a cost model to trade off random and sequential accesses.*

•  Data sources can be random-access only or sequential-access only (or both), particularly on the Web.

N. Bruno, L. Gravano, and A. Marian. Proc. of the 18th IEEE International Conference on Data Engineering (ICDE'02), 2002.

*) Fagin et al.: Optimal aggregation algorithms for middleware. J. Comput. Syst. Sci. 66(4): 614-656 (2003)


Top-k Queries in Distributed Environments

•  In general, each index list is stored at a different node in a (possibly wide-area) network

•  Key observations:
    – Network traffic is crucial
    – Number of round trips is crucial

•  Straightforward application of TA/NRA?
    – Expensive: huge number of round trips
    – Even with batching: unpredictable performance

[Figure: network of nodes with the query initiator]

Three Phase Uniform Threshold Algorithm (TPUT)

Exactly 3 phases:
1.  Fetch the k best entries (d, s_j) from each of N_1 ... N_m and aggregate (∑_{j=1..m} s_j(d)) at the query initiator.
2.  Ask each of N_1 ... N_m for all entries with s_j ≥ mink / m and aggregate the results at the query initiator. mink is the score of the item currently at rank k.
3.  Fetch the missing scores for all candidates by random lookups at N_1 ... N_m.

Distributed top-k algorithm with a fixed number of phases!

Pei Cao, Zhe Wang: Efficient top-K query calculation in distributed networks. PODC 2004: 206-215
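The three phases can be simulated on a single machine. This is a sketch under several assumptions: the nodes are plain in-memory lists, summation is the aggregation function, every list holds at least k entries, missing scores count as 0, phase 2 uses `>=` as comparison (so that items tying with the threshold are kept), and every item received in phases 1 and 2 is treated as a candidate in phase 3 without further pruning. The function name `tput` is made up for this example.

```python
def tput(lists, k):
    """Sketch of TPUT with the nodes simulated as in-memory lists."""
    m = len(lists)
    lookup = [dict(entries) for entries in lists]     # per-node random access

    # Phase 1: each node sends its k best entries; the initiator sums them up.
    partial = {}
    for entries in lists:
        for d, s in entries[:k]:
            partial[d] = partial.get(d, 0) + s
    min_k = sorted(partial.values(), reverse=True)[k - 1]   # partial score at rank k
    threshold = min_k / m                                   # uniform threshold

    # Phase 2: each node sends every entry whose score reaches the threshold.
    candidates = set(partial)
    for entries in lists:
        candidates.update(d for d, s in entries if s >= threshold)

    # Phase 3: random lookups fetch the missing scores of all candidates.
    exact = {d: sum(lk.get(d, 0) for lk in lookup) for d in candidates}
    top = sorted(exact.items(), key=lambda p: p[1], reverse=True)[:k]
    return top, threshold

tput_lists = [
    [("doc3", 18), ("doc4", 12), ("doc2", 11), ("doc5", 4), ("doc6", 2)],
    [("doc1", 9), ("doc3", 7), ("doc2", 2), ("doc6", 1), ("doc7", 1)],
    [("doc1", 19), ("doc4", 15), ("doc3", 12), ("doc5", 5), ("doc2", 2)],
]
tput_result, tput_threshold = tput(tput_lists, k=2)
```

For k=2 the partial scores after phase 1 are doc1: 28, doc4: 27, doc3: 25, so mink = 27 and the threshold is 27/3 = 9. Phase 2 additionally brings in doc2, and phase 3 yields doc3 (37) and doc1 (28) as the top-2 result.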

[Figure: TPUT message flow between the coordinator (current top-k and candidate set) and the nodes N_i, N_j holding the index lists: the nodes send their top k, the coordinator sends the threshold mink / m, the nodes send candidates, and the coordinator retrieves missing scores]


Correctness of TPUT

•  Theorem: TPUT is an exact algorithm, i.e., it identifies the true top-k items

•  Proof (sketch): TPUT cannot miss a true top-k item. Assume it misses one, i.e., the item is below mink/m in all lists.
    → overall score < mink → not a true top-k item!
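Written out for summation as aggregation function, the key step of the proof sketch is a single inequality. Here s_j(d) denotes the score of item d in list j, with the usual convention (assumed here) that s_j(d) = 0 if d does not occur in list j:

```latex
\begin{align*}
s_j(d) &< \frac{mink}{m} \quad \text{for all } j = 1,\dots,m \\
\Rightarrow\quad \sum_{j=1}^{m} s_j(d) &< m \cdot \frac{mink}{m} = mink
\end{align*}
```

Since k items with a score of at least mink are already known at the query initiator, an item whose total score is below mink cannot belong to the true top-k.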

[Figure: state after phase 2 for list 1, list 2, list 3: an item below the threshold in all lists has overall score < mink]

Optimizations

•  Performance depends on the choice of the threshold mink/m

•  (But so does the correctness proof)

•  Tradeoffs in result quality are (partially) acceptable


Increase of mink/m Threshold

•  Get extra information about the items' scores
•  But compact, with little overhead

•  Create Bloom filter synopses for each index list?

•  Very coarse: a Bloom filter can tell whether an item is in the list or not (modulo false positives), but not how good its score is

[Figure: Bloom filter bit vector (1 1 1 0 0 0) for an index list with scores in [0, 1]]

Increase of mink/m Threshold (Cont'd)

•  Create one Bloom filter for each histogram cell.
•  In phase 1: get the top k from each list + "some" Bloom filters for the top (high-score) cells

•  Estimate the score of an item as the actually seen score plus the lower-bound score of the histogram cell whose filter contains it (conservative estimation)
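The idea of one Bloom filter per histogram cell can be sketched as follows. Everything concrete here is an assumption made for illustration: the tiny filter size, the SHA-256 based hash positions, the equi-width cells on [0, 1], and the choice to use the lowest matching cell so that a false positive in a higher cell does not inflate the estimate of an item that really is in the list.

```python
import hashlib

class BloomFilter:
    """Tiny Bloom filter over an integer bit array (sizes are illustrative)."""

    def __init__(self, bits=64, hashes=3):
        self.bits, self.hashes, self.array = bits, hashes, 0

    def _positions(self, key):
        for seed in range(self.hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.bits

    def add(self, key):
        for p in self._positions(key):
            self.array |= 1 << p

    def __contains__(self, key):
        return all((self.array >> p) & 1 for p in self._positions(key))

def build_cell_filters(entries, cells, lo=0.0, hi=1.0):
    """One Bloom filter per equi-width histogram cell of the score range."""
    width = (hi - lo) / cells
    filters = [BloomFilter() for _ in range(cells)]
    for item, score in entries:
        idx = min(int((score - lo) / width), cells - 1)
        filters[idx].add(item)
    return filters, width

def lower_bound_score(item, filters, width, lo=0.0):
    """Lower cell boundary of the lowest cell whose filter reports the item."""
    for idx, bloom in enumerate(filters):     # lowest cell first: conservative
        if item in bloom:
            return lo + idx * width
    return 0.0                                # item not reported by any filter

shape_round = [("D", 0.8), ("B", 0.75), ("A", 0.6), ("C", 0.25), ("E", 0.05)]
filters, width = build_cell_filters(shape_round, cells=4)
estimate_b = lower_bound_score("B", filters, width)
```

Because Bloom filters have no false negatives, the estimate for an item that is in the list never exceeds its true score. For an item that is not in the list at all, a false positive can still produce an estimate above 0.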

[Figure: histogram over the score range [0, 1] with one Bloom filter bit vector per cell]

Increase of mink/m Threshold

[Figure: message flow between the coordinator (current top-k and candidate set) and the nodes N_i, N_j holding the index lists; each node keeps a histogram with c cells and a Bloom filter of b bits per cell; exchanged messages: top k, mink / m, candidates]


Further Optimizations

•  Non-uniform thresholds

•  Node sampling

•  Hierarchical aggregation


Literature

•  Ronald Fagin: Combining Fuzzy Information from Multiple Systems. J. Comput. Syst. Sci. 58(1): 83-99 (1999)
•  Ronald Fagin, Amnon Lotem, Moni Naor: Optimal aggregation algorithms for middleware. J. Comput. Syst. Sci. 66(4): 614-656 (2003)
•  Martin Theobald, Gerhard Weikum, Ralf Schenkel: Top-k Query Evaluation with Probabilistic Guarantees. VLDB 2004: 648-659
•  Pei Cao, Zhe Wang: Efficient top-K query calculation in distributed networks. PODC 2004: 206-215
•  Sebastian Michel, Peter Triantafillou, Gerhard Weikum: KLEE: A Framework for Distributed Top-k Query Algorithms. VLDB 2005: 637-648

