A Specialized B-tree for Concurrent Datalog Evaluation

Herbert Jordan 1 , Pavle Subotić 3 , David Zhao 2 , and Bernhard Scholz 2 PPoPP 2019, 16-20 February 2019, Washington, DC A Specialized B-tree for Concurrent Datalog Evaluation 1) University of Innsbruck 2) University of Sydney 3) University College London


Post on 07-Jul-2020



Herbert Jordan1, Pavle Subotić3, David Zhao2, and Bernhard Scholz2

PPoPP 2019, 16-20 February 2019, Washington, DC

A Specialized B-tree for Concurrent Datalog Evaluation

1) University of Innsbruck
2) University of Sydney
3) University College London


Datalog (by Example)

[Figure: graph over the nodes a, b, c, d, e, f, g]

edge relation:

from  to
a     b
a     c
b     f
c     e
d     a
d     c
…     …

Which nodes are connected?


Datalog (by Example)


path(X,Y) :- edge(X,Y).

path(X,Z) :- path(X,Y), edge(Y,Z).

[Figure: the example graph and its edge relation (from/to: a b, a c, b f, c e, d a, d c, …), shown next to the Datalog query above]


Datalog

› Benefits:
– a concise formalism for powerful data analysis
– recent major performance improvements and tool support

› Applications:
– database queries
– program analysis
– security vulnerability analysis
– network analysis

100s of relations and rules, billions of tuples, all in-memory


Query Processing

› relations: sets of integer tuples
› rules: sequences of relational algebra operations on sets


Example

path(X,Z) :- path(X,Y), edge(Y,Z).

delta ← path
while ( delta ≠ ∅ ) {
    new ← π(delta ⋈ edge) ∖ path      (computationally expensive and dominating part)
    path ← path ∪ new
    delta ← new
}
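The fixpoint loop above can be sketched in C++ as follows. This is a minimal illustration on top of `std::set`, not the data structure the talk proposes. The names `Tuple`, `Relation`, `transitive_closure` and `fresh` are assumptions made for this sketch; `fresh` stands in for the slide's `new`, which is a reserved word in C++.

```cpp
#include <cstdint>
#include <set>
#include <utility>

using Tuple = std::pair<std::uint32_t, std::uint32_t>;
using Relation = std::set<Tuple>;  // ordered set of integer tuples

// Semi-naive evaluation of
//   path(X,Y) :- edge(X,Y).
//   path(X,Z) :- path(X,Y), edge(Y,Z).
Relation transitive_closure(const Relation& edge) {
    Relation path = edge;   // first rule: every edge is a path
    Relation delta = path;  // tuples discovered in the previous round
    while (!delta.empty()) {
        Relation fresh;     // "new" on the slide
        for (const Tuple& t1 : delta) {
            // Range query: walk all edges whose source equals t1.second.
            for (auto it = edge.lower_bound({t1.second, 0});
                 it != edge.end() && it->first == t1.second; ++it) {
                Tuple t3{t1.first, it->second};          // projection
                if (path.count(t3) == 0) fresh.insert(t3);  // set difference
            }
        }
        path.insert(fresh.begin(), fresh.end());  // path <- path U new
        delta = std::move(fresh);                 // delta <- new
    }
    return path;
}
```

Encoding the nodes a to f as 0 to 5 and using only the six rows listed in the example edge relation, the loop runs three rounds and returns 11 path tuples. Only the new tuples of each round are joined against edge, so the join cost shrinks as the fixpoint is approached.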


Example

new ← π(delta ⋈ edge) ∖ path

Relation new;
for t1 ∈ delta {
    auto l = edge.lower_bound( { t1[1], 0 } );
    auto u = edge.upper_bound( { t1[1]+1, 0 } );
    for t2 ∈ [l,u] {
        Tuple t3 = { t1[0], t2[1] };
        if ( t3 ∉ path ) {
            new.insert(t3);
        }
    }
}


Example

new ← π(delta ⋈ edge) ∖ path

Relation new;
#pragma omp parallel for
for t1 ∈ delta {
    auto l = edge.lower_bound( { t1[1], 0 } );
    auto u = edge.upper_bound( { t1[1]+1, 0 } );
    for t2 ∈ [l,u] {
        Tuple t3 = { t1[0], t2[1] };
        if ( t3 ∉ path ) {
            new.insert(t3);
        }
    }
}

– all read accesses (right hand side)
– one write access (assignment)
– But: write target is never read on right hand side!
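A compilable variant of this parallel join step might look like the sketch below. It assumes `std::set` as the relation type and serialises the single write with one OpenMP critical section, which is the simplest possible synchronization and not the one the talk proposes. The names `join_step` and `fresh` are made up for this sketch. The upper end of the range uses `lower_bound` so that the tuple `(y+1, 0)` itself stays outside the half-open range.

```cpp
#include <cstddef>
#include <cstdint>
#include <set>
#include <utility>
#include <vector>

using Tuple = std::pair<std::uint32_t, std::uint32_t>;
using Relation = std::set<Tuple>;

// new <- pi(delta |x| edge) \ path, with the outer loop parallelised.
// delta is a vector so the loop has the index form OpenMP expects.
// Without -fopenmp the pragmas are ignored and the loop runs sequentially.
Relation join_step(const std::vector<Tuple>& delta,
                   const Relation& edge, const Relation& path) {
    Relation fresh;  // the only object written inside the loop
    #pragma omp parallel for
    for (std::size_t i = 0; i < delta.size(); ++i) {
        const Tuple& t1 = delta[i];
        // Edges with source t1.second form the half-open range [l, u).
        auto l = edge.lower_bound({t1.second, 0});
        auto u = (t1.second == UINT32_MAX)
                     ? edge.end()
                     : edge.lower_bound({t1.second + 1, 0});
        for (auto it = l; it != u; ++it) {
            Tuple t3{t1.first, it->second};
            if (path.count(t3) == 0) {  // read-only access, no lock needed
                #pragma omp critical
                fresh.insert(t3);       // all writes go through one global lock
            }
        }
    }
    return fresh;
}
```

Reads of `delta`, `edge` and `path` need no protection because nothing modifies them during the loop. Every insert into `fresh` contends for the same lock, so the write side is where a cheaper synchronization scheme pays off.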


Needed

› efficient data structure for relations
– maintain set of n-dimensional tuples
– efficient support for insertion, scans, range queries, membership tests, emptiness checks
  (well supported by B-trees)
– efficient synchronization of concurrent inserts
  (not so much …)


B-tree

› Insertion:
– locate target leaf node
– split leaf node if necessary, may propagate up
– insert element in sorted leaf-node element array

[Figure: example B-tree. Root keys: (5,3) (8,2). Leaf keys: (1,1) (1,2) (3,2) (4,7) | (6,9) (7,4) | (8,7) (9,2) (9,4)]
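The last insertion step, placing an element into the sorted element array of a leaf, can be sketched as below. The `Leaf` type, its `Result` codes and the fixed `Capacity` parameter are hypothetical and chosen for this illustration only. Locating the leaf and splitting it are left out.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <utility>

using Tuple = std::pair<std::uint32_t, std::uint32_t>;

// Hypothetical leaf node: a fixed-capacity array of tuples kept in sorted order.
template <std::size_t Capacity>
struct Leaf {
    std::array<Tuple, Capacity> keys{};
    std::size_t size = 0;

    enum class Result { Inserted, Duplicate, Full };

    Result insert(const Tuple& t) {
        auto first = keys.begin();
        auto last = first + size;
        auto pos = std::lower_bound(first, last, t);  // binary search for the slot
        if (pos != last && *pos == t) return Result::Duplicate;  // set semantics
        if (size == Capacity) return Result::Full;    // caller would split the node
        std::move_backward(pos, last, last + 1);      // shift larger keys one slot right
        *pos = t;
        ++size;
        return Result::Inserted;
    }
};
```

Keeping the keys contiguous and sorted is what makes scans and range queries cheap. The cost is the shift on insert, which is bounded by the node capacity.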


B-tree Locking Strategies

[Plot: insertion throughput (million insertions / second) over number of threads (1 to 32); series: global lock btree, fine grained btree, read write btree]

random order, on 4x8 core Intel Xeon E5-4650


Seqlocks / Optimistic Locking

Shared Data, SN: 1234

sequence number:
• even: no writer active
• odd: one writer active


Seqlocks

› Reader:
– read seq number; <not even>: retry
– perform read access
– check seq number; <seq number changed>: retry

› Writer:
– set last bit in SN; <was not even>: retry
– perform write access
– inc seq number

Shared Data, SN: 1234
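The reader and writer protocols above can be sketched with `std::atomic`. This is a minimal example under stated assumptions, not the variant used in the talk: the class name `SeqLockedPair` and its members are invented here, the shared data is just two integers, and the data fields are themselves relaxed atomics so that a reader racing with a writer stays within defined behaviour.

```cpp
#include <atomic>
#include <cstdint>
#include <utility>

// Minimal seqlock guarding a pair of integers.
class SeqLockedPair {
    std::atomic<std::uint64_t> sn{0};   // even: no writer active, odd: one writer active
    std::atomic<std::uint32_t> a{0};    // shared data
    std::atomic<std::uint32_t> b{0};

public:
    std::pair<std::uint32_t, std::uint32_t> read() const {
        for (;;) {
            std::uint64_t before = sn.load(std::memory_order_acquire);  // read seq number
            if (before & 1) continue;                                   // <not even>: retry
            std::uint32_t x = a.load(std::memory_order_relaxed);        // perform read access
            std::uint32_t y = b.load(std::memory_order_relaxed);
            std::atomic_thread_fence(std::memory_order_acquire);
            if (sn.load(std::memory_order_relaxed) == before)           // check seq number
                return {x, y};
            // <seq number changed>: a writer interfered, retry
        }
    }

    void write(std::uint32_t x, std::uint32_t y) {
        // set last bit in SN; spin while it <was not even>
        while (sn.fetch_or(1, std::memory_order_acquire) & 1) {}
        std::atomic_thread_fence(std::memory_order_release);
        a.store(x, std::memory_order_relaxed);                          // perform write access
        b.store(y, std::memory_order_relaxed);
        sn.fetch_add(1, std::memory_order_release);                     // inc seq number
    }

    std::uint64_t sequence() const { return sn.load(std::memory_order_acquire); }
};
```

Readers never write to the sequence number, so concurrent readers do not invalidate each other's cache lines. A reader only repeats its work when a writer was active during the read.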


Optimistic Read/Write Lock

› Reader:
– read seq number; <not even>: retry
– perform read access
– need write access?
  › no: check seq number; <seq number changed>: retry
  › <yes>: set last bit in SN; <fail>: retry
    perform write access
    inc seq number

Shared Data, SN: 1234
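One way to realise the upgrade from read to write access is a compare-and-swap on the sequence number, which succeeds only if no writer was active since the read phase began. The sketch below assumes that design. The names `OptimisticRWLock`, `start_read`, `validate`, `try_upgrade_to_write`, `end_write` and `put` are illustrative and not taken from the slides.

```cpp
#include <atomic>
#include <cstdint>

class OptimisticRWLock {
    std::atomic<std::uint64_t> sn{0};  // even: no writer active, odd: one writer active

public:
    using Lease = std::uint64_t;

    Lease start_read() const {         // read seq number, wait while <not even>
        Lease s;
        while ((s = sn.load(std::memory_order_acquire)) & 1) {}
        return s;
    }
    bool validate(Lease s) const {     // check seq number
        std::atomic_thread_fence(std::memory_order_acquire);
        return sn.load(std::memory_order_relaxed) == s;
    }
    bool try_upgrade_to_write(Lease s) {  // set last bit in SN, <fail> if SN moved on
        bool ok = sn.compare_exchange_strong(s, s + 1, std::memory_order_acquire,
                                             std::memory_order_relaxed);
        if (ok) std::atomic_thread_fence(std::memory_order_release);
        return ok;
    }
    void end_write() { sn.fetch_add(1, std::memory_order_release); }  // inc seq number
};

// Usage sketch: store `value` in `slot` unless it is already there.
// Returns true if a write was performed.
bool put(OptimisticRWLock& lock, std::atomic<int>& slot, int value) {
    for (;;) {
        auto lease = lock.start_read();
        int seen = slot.load(std::memory_order_relaxed);  // perform read access
        if (seen == value) {                              // need write access? no
            if (lock.validate(lease)) return false;
            continue;                                     // <seq number changed>
        }
        if (!lock.try_upgrade_to_write(lease)) continue;  // <fail>: start over
        slot.store(value, std::memory_order_relaxed);     // perform write access
        lock.end_write();
        return true;
    }
}
```

Because the upgrade is conditional on the lease, whatever was read before the upgrade is still valid once the write lock is held. An operation that turns out to need no modification finishes without ever writing to the lock.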


Optimistic B-tree

› Protect nodes and root pointer with optimistic R/W lock

› Synchronize insert operation
– read access on inner nodes, upgrade to write when necessary

› Key challenge:
– pointer indirection
– concurrency memory model

[Figure: the example B-tree (root keys (5,3) (8,2); leaf keys (1,1) (1,2) (3,2) (4,7), (6,9) (7,4), (8,7) (9,2) (9,4)) annotated with sequence numbers: SN: 12 next to the root keys, then SN: 1234, SN: 321, SN: 254, and SN: 4]


B-tree Locking Strategies (cont)

[Plot: insertion throughput (million insertions / second) over number of threads (1 to 32); series: global lock btree, fine grained btree, read write btree, optimistic btree]

random order, on 4x8 core Intel Xeon E5-4650


Sequential Performance

[Two plots: insertion throughput (million insertions/s) over total elements inserted (1000², 2000², 5000², 10000²), one panel for ordered insertion and one for random order insertion; series: google btree, seq btree, btree, STL rbtset, STL hashset, TBB hashset]

(additional data structures covered in paper)


Parallel Performance

[Two plots: insertion throughput (million insertions/s) over number of threads (1 to 31), one panel for ordered insertion and one for random order insertion; series: btree, google btree, reduction btree, TBB hashset]

4x8 core Intel Xeon E5-4650

Panel annotations: up to 10x faster than TBB; up to 27x faster than TBB


Other Concurrent Tree Data Structures

[Two plots: insertion throughput (million insertions/second) by insertion order (ordered, random), one panel single-threaded and one panel with 8 threads; series: B-tree, Masstree, B-slack, PALM tree]

Panel annotations: 1.5 - 2.6x faster; 1.5 - 2.9x faster


Datalog Query Processing

[Two plots: total query time [s] over number of threads (1 to 31), one panel for a context sensitive var-points-to analysis and one for a security vulnerability analysis; series: btree, STL rbtset, STL hashset, google btree, TBB hashset]

Panel annotations: 1.4 - 6.3x faster; 1.4 - 7.7x faster


Conclusion

› Developed concurrent set for Datalog relations:
– B-tree foundation
  › good sequential performance, cache friendly
– Fine-grained synchronization
  › based on customized seqlock variant

› Results:
– up to 59x faster than state-of-the-art hash based sets
– up to 2.9x faster than state-of-the-art tree based sets
– up to 7.7x faster for real-world query processing

› Future work:
– investigate other data structures for specialized use cases


Thank you!

visit us on https://souffle-lang.github.io

sources: https://github.com/souffle-lang/souffle


B-tree Locking Strategies

[Plot: insertion throughput (million insertions / second) over number of threads (1 to 32); series: global lock btree, fine grained btree, read write btree, tbb::hash_set, reduction btree]

random order, on 4x8 core Intel Xeon E5-4650


B-tree Locking Strategies (cont)

[Plot: insertion throughput (million insertions / second) over number of threads (1 to 32); series: global lock btree, fine grained btree, read write btree, tbb::hash_set, reduction btree, optimistic btree]

random order, on 4x8 core Intel Xeon E5-4650