Transcript
Page 1: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

I/O-efficient Algorithms and Data Structures

Lars Arge

Professor and center director

Aarhus University

ALMADA summer school, August 5, 2013

Page 2: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

I/O-Model

• Parameters

N = # elements in problem instance

B = # elements that fit in a disk block

M = # elements that fit in main memory

T = # elements in the output of a searching problem

• We often assume that $M > B^2$

• I/O: Movement of a block between memory and disk

[Figure: processor P, main memory M, and disk D, with block I/O between memory and disk]

Page 3: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

Fundamental Bounds

Internal vs. external memory:

• Scanning: internal $N$; external $\frac{N}{B}$
• Sorting: internal $N \log N$; external $\frac{N}{B}\log_{M/B}\frac{N}{B}$
• Permuting: internal $N$; external $\min\{N, \frac{N}{B}\log_{M/B}\frac{N}{B}\}$
• Searching: internal $\log_2 N$; external $\log_B N$

• Note:
– Linear I/O: $O(N/B)$
– Permuting not linear
– Permuting and sorting bounds are equal in all practical cases
– B factor VERY important: $\frac{N}{B} < \frac{N}{B}\log_{M/B}\frac{N}{B} \ll N$
– Cannot sort optimally with search tree
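To get a feeling for how large the B factor is, the external bounds on this slide can be evaluated for concrete parameters. The following is a minimal sketch; the function name `io_bounds` and the sample values of N, M and B are illustrative assumptions, and constant factors are ignored.

```python
import math

def io_bounds(N, M, B):
    """Evaluate the external-memory bounds (without constants) for given N, M, B."""
    n = N / B                      # number of blocks in the input
    m = M / B                      # number of blocks that fit in memory
    sort = n * math.log(n, m)      # (N/B) log_{M/B} (N/B)
    return {
        "scan": n,                 # N/B
        "sort": sort,
        "permute": min(N, sort),   # min{N, sort bound}
        "search": math.log(N, B),  # log_B N
    }

if __name__ == "__main__":
    # Hypothetical instance: 2^30 elements, 2^20 in memory, 2^10 per block.
    for name, value in io_bounds(2**30, 2**20, 2**10).items():
        print(f"{name:8s} {value:,.0f}")
```

For these sample values the sorting bound is only about twice the scanning bound, while both are far below N, which is what the note on the B factor expresses.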

Page 4: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

External Sort

• Merge sort:
– Create N/M memory sized sorted runs
– Merge runs together M/B at a time
⇒ $O(\log_{M/B}\frac{N}{M})$ phases using $O(\frac{N}{B})$ I/Os each

• Distribution sort similar (but harder – split elements)
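The two phases of external merge sort can be simulated in memory while counting block transfers. This is a sketch under stated assumptions: the name `external_merge_sort` and the I/O accounting (one I/O per block read or written, full M/B-way merging with the output block ignored) are simplifications for illustration, not an actual disk-based implementation.

```python
import heapq

def external_merge_sort(data, M, B):
    """Simulate external merge sort; return (sorted list, number of block I/Os)."""
    def blocks(k):
        return -(-k // B)              # ceil(k / B)

    ios = 0
    # Phase 1: create N/M memory-sized sorted runs (read and write each block once).
    runs = []
    for i in range(0, len(data), M):
        run = sorted(data[i:i + M])
        ios += 2 * blocks(len(run))
        runs.append(run)

    # Phase 2: repeatedly merge M/B runs at a time until one run remains.
    fan_in = max(2, M // B)
    while len(runs) > 1:
        merged = []
        for i in range(0, len(runs), fan_in):
            out = list(heapq.merge(*runs[i:i + fan_in]))
            ios += 2 * blocks(len(out))    # every block is read and written once per phase
            merged.append(out)
        runs = merged
    return (runs[0] if runs else []), ios

if __name__ == "__main__":
    data = [(i * 37) % 1000 for i in range(1000)]   # a permutation of 0..999
    result, ios = external_merge_sort(data, M=100, B=10)
    print(result[:5], ios)
```

Each pass over the data costs O(N/B) I/Os in this accounting, and the number of merge passes shrinks as the fan-in M/B grows.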

Page 5: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

Lower bounds

• Permuting N elements according to a given permutation takes $\Omega(\min\{N, \frac{N}{B}\log_{M/B}\frac{N}{B}\})$ I/Os in “indivisibility” model

• Sorting N elements takes $\Omega(\frac{N}{B}\log_{M/B}\frac{N}{B})$ I/Os in comparison model

– Proved using counting arguments: $(B!)^{N/B} \cdot \left(N \binom{M}{B}\right)^{t} \ge N!$

Page 6: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

Cache-Oblivious Model

• N, B, and M as in I/O-model
– Assumed that $M > B^2$ (tall cache assumption)

• M and B not used in algorithm description
• Block transfers by optimal paging strategy

⇒ Analyze in two-level model
⇒ Efficient on one level, efficient on all levels (of fully associative hierarchy using LRU)

Page 7: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

Cache-Oblivious Distribution Sort

• Break into $\sqrt{N}$ subarrays of size $\sqrt{N}$
– Recursively sort each subarray
• Distribute into $\sqrt{N}$ buckets of size ~$\sqrt{N}$
– Recursively sort each bucket

• Distribution performed in $O(\frac{N}{B})$ I/Os

⇒ $T(N) = 2\sqrt{N}\,T(\sqrt{N}) + O(\frac{N}{B})$ if $N > M$, and $T(N) = O(\frac{N}{B})$ if $N \le M$

$= O(\frac{N}{B}\log_{M/B}\frac{N}{B})$

Page 8: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

Sorting summary

• External merge or distribution sort takes $O(\frac{N}{B}\log_{M/B}\frac{N}{B})$ I/Os
– Merge-sort based on M/B-way merging
– Distribution sort based on $\sqrt{M/B}$-way distribution and (complicated) split elements finding
– Also cache-oblivious using $\sqrt{N}$-way distribution

• Optimal in comparison model

• $\Omega(\min\{N, \frac{N}{B}\log_{M/B}\frac{N}{B}\})$ lower bound in stronger indivisibility model
– Holds even for permuting

Page 9: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

Distribution sweeping

• Combination of distribution and plane sweep
– Used to solve batched geometric problems

• General idea:
– Divide plane into M/B slabs with O(N/(M/B)) endpoints each
– Sweep plane top-down while finding solutions involving objects in different slabs
– Distribute data to M/B slabs and recurse in each slab

• Examples:
– Orthogonal line segment intersection
– Rectangle intersection
– Homework

Page 10: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

Outline

• Thursday:
– Sorting upper bound (merge- and distribution-sort)
– Sorting (permuting) lower bound
– Cache-oblivious sorting
– Batched geometric problems: Distribution sweeping

• Today:
– Searching (B-trees)
– Batched searching (Buffer-trees)
– I/O-efficient priority queues
– I/O-efficient list ranking

Page 11: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

External Search Trees

• Binary search tree:
– Standard method for search among N elements
– We assume elements in leaves
– Search traces at least one root-leaf path

– If nodes stored arbitrarily on disk
⇒ Search in $O(\log_2 N)$ I/Os
⇒ Rangesearch in $O(\log_2 N + T)$ I/Os

[Figure: binary search tree with its height, on the order of $\log_2 N$, marked]

Page 12: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

External Search Trees

• BFS blocking:
– Block height $O(\log_2 N) / O(\log_2 B) = O(\log_B N)$
– Output elements blocked
⇒ Rangesearch in $O(\log_B N + \frac{T}{B})$ I/Os

• Optimal: O(N/B) space and $O(\log_B N + \frac{T}{B})$ query

[Figure: binary tree cut into blocks of height about $\log_2 B$, each holding about $B$ nodes]

Page 13: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

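Cache-Oblivious Search Tree

• “Van Emde Boas” recursive layout:

[Figure: tree of height $\log_2 N$ cut at half height into a top tree A of height $\frac{1}{2}\log_2 N$ and about $\sqrt{N}$ bottom trees $B_1, B_2, \ldots, B_{\sqrt{N}}$, each of height $\frac{1}{2}\log_2 N$; label $\log_B N$]

Recursive memory layout: A, $B_1$, $B_2$, ..., $B_{\sqrt{N}}$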
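The recursive layout can be made concrete by computing the memory order of the nodes of a complete binary tree. This is a minimal sketch: the function name `veb_layout`, the use of BFS (heap) numbering with root 1, and the choice to round the top height down are assumptions made for illustration.

```python
def veb_layout(height, root=1):
    """Return the nodes of a complete binary tree (BFS-numbered, root = 1)
    in van Emde Boas order: top tree first, then each bottom tree."""
    if height == 1:
        return [root]
    top_h = height // 2              # height of top tree A
    bottom_h = height - top_h        # height of each bottom tree B_i
    order = veb_layout(top_h, root)  # lay out A recursively
    first = root << top_h            # leftmost descendant of root at depth top_h
    for r in range(first, first + (1 << top_h)):
        order.extend(veb_layout(bottom_h, r))   # lay out B_1, B_2, ... recursively
    return order

if __name__ == "__main__":
    print(veb_layout(4))
```

Nodes that are close together on a root-leaf path end up close together in memory at every level of the recursion, which is what makes the layout independent of B.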

Page 14: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

Cache-Oblivious Search Tree

• Search analysis:
– Conceptually stop recursion when recursive subtrees have size between $\sqrt{B}$ and $B$, so that each recursive subtree fits in a block
– Search path “hits” $O(\log N / \log \sqrt{B}) = O(\log_B N)$ blocks

[Figure: tree of height $\log_2 N$ partitioned into recursive subtrees of size between $\sqrt{B}$ and $B$, each of height on the order of $\log B$]

Page 15: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

Dynamic External Search Trees?

• Maintaining BFS blocking during updates?
– Balance normally maintained in search trees using rotations

• Seems very difficult to maintain BFS blocking during rotation
– Also need to make sure output (leaves) is blocked!

[Figure: rotation of nodes x and y]

Page 16: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

B-trees

• BFS-blocking naturally corresponds to tree with fan-out $\Theta(B)$

• B-trees balanced by allowing node degree to vary
– Rebalancing performed by splitting and merging nodes

Page 17: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

(a,b)-tree

• T is an (a,b)-tree (a≥2 and b≥2a-1)
– All leaves on the same level and contain between a and b elements
– Except for the root, all nodes have degree between a and b
– Root has degree between 2 and b

• (a,b)-tree uses linear space and has height $O(\log_a N)$

⇒ Choosing a,b = $\Theta(B)$, each node/leaf stored in one disk block

⇒ O(N/B) space and $O(\log_B N + \frac{T}{B})$ query

[Figure: example (2,4)-tree]

Page 18: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

(a,b)-Tree Insert

• Insert:

Search and insert element in leaf v

WHILE v has b+1 elements/children DO

Split v:

make nodes v’ and v’’ with $\lceil\frac{b+1}{2}\rceil \le b$ and $\lfloor\frac{b+1}{2}\rfloor \ge a$ elements

insert element (ref) in parent(v)

(make new root if necessary)

v=parent(v)

• Insert touches $O(\log_a N)$ nodes

[Figure: node v with b+1 elements split into v’ with $\lfloor\frac{b+1}{2}\rfloor$ and v’’ with $\lceil\frac{b+1}{2}\rceil$ elements]
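The insert procedure above can be sketched as an in-memory (a,b)-tree with elements in the leaves. The class names `Node` and `ABTree`, the router-key convention (a router is the smallest key of the subtree to its right), and the assumption of distinct keys are illustrative choices, not taken from the slides; deletion is not covered.

```python
import bisect

class Node:
    def __init__(self, leaf, keys, children=None):
        self.leaf = leaf
        self.keys = keys                # leaf: stored elements; internal: router keys
        self.children = children or []  # internal nodes only

    def size(self):
        return len(self.keys) if self.leaf else len(self.children)

class ABTree:
    def __init__(self, a=2, b=4):
        assert a >= 2 and b >= 2 * a - 1
        self.a, self.b = a, b
        self.root = Node(True, [])

    def insert(self, x):
        # Search: remember the root-to-leaf path so splits can walk back up.
        path, v = [], self.root
        while not v.leaf:
            i = bisect.bisect_right(v.keys, x)
            path.append((v, i))
            v = v.children[i]
        bisect.insort(v.keys, x)
        # Split while v has b+1 elements/children.
        while v.size() > self.b:
            mid = v.size() // 2
            if v.leaf:
                left, right = Node(True, v.keys[:mid]), Node(True, v.keys[mid:])
                sep = right.keys[0]
            else:
                left = Node(False, v.keys[:mid - 1], v.children[:mid])
                right = Node(False, v.keys[mid:], v.children[mid:])
                sep = v.keys[mid - 1]
            if not path:                # v was the root: make a new root
                self.root = Node(False, [sep], [left, right])
                return
            v, i = path.pop()           # insert reference in parent(v)
            v.children[i:i + 1] = [left, right]
            v.keys.insert(i, sep)

    def nodes(self):
        """Yield (depth, node) in left-to-right preorder."""
        stack = [(0, self.root)]
        while stack:
            depth, v = stack.pop()
            yield depth, v
            stack.extend((depth + 1, c) for c in reversed(v.children))

if __name__ == "__main__":
    t = ABTree(2, 4)
    for k in [(i * 7) % 101 for i in range(101)]:
        t.insert(k)
    print(max(d for d, _ in t.nodes()))   # depth of the leaf level
```

Only nodes on the search path are touched, and each split produces two nodes with at least a elements because b ≥ 2a-1.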

Page 19: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

(2,4)-Tree Insert

Page 20: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

(a,b)-Tree Delete

• Delete:

Search and delete element from leaf v

WHILE v has a-1 elements/children DO

Fuse v with sibling v’:

move children of v’ to v

delete element (ref) from parent(v)

(delete root if necessary)

If v has >b (and ≤ a+b-1 < 2b) children split v

v=parent(v)

• Delete touches $O(\log_a N)$ nodes

[Figure: node v with a-1 children fused with a sibling into a node v with at least 2a-1 children]

Page 21: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

(2,4)-Tree Delete

Page 22: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

(a,b)-Tree

• (a,b)-tree properties:
– If b=2a-1 every update can cause many rebalancing operations
– If b≥2a update only causes O(1) rebalancing operations amortized
– If b>2a only $O(\frac{1}{b/2-a}) = O(\frac{1}{a})$ rebalancing operations amortized
* Both somewhat hard to show
– If b=4a easy to show that update causes $O(\frac{1}{a}\log_a N)$ rebalance operations amortized
* After split during insert a leaf contains 4a/2=2a elements
* After fuse during delete a leaf contains between 2a and 5a elements (split if more than 3a ⇒ between 3/2a and 5/2a)

[Figure: (2,3)-tree example with an insert followed by a delete]

Page 23: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

B-tree Summary

• B-trees: (a,b)-trees with a,b = $\Theta(B)$
– O(N/B) space
– $O(\log_B N + T/B)$ query
– $O(\log_B N)$ update

• B-trees with elements in the leaves sometimes called B+-tree

• Construction in $O(\frac{N}{B}\log_{M/B}\frac{N}{B})$ I/Os
– Sort elements and construct leaves
– Build tree level-by-level bottom-up

Page 24: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

Generalized B-tree

• B-tree with branching parameter b and leaf parameter k (b,k≥8)
– All leaves on same level and contain between 1/4k and k elements
– Except for the root, all nodes have degree between 1/4b and b
– Root has degree between 2 and b

• B-tree with leaf parameter $k = \Omega(B)$
– O(N/B) space
– Height $O(\log_b \frac{N}{B})$
– $O(\frac{1}{k})$ amortized leaf rebalance operations
– $O(\frac{1}{bk}\log_b \frac{N}{B})$ amortized internal node rebalance operations

Page 25: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

B-tree Sorting

• Normally we can sort N elements in O(N log N) time with search tree
– Insert all elements one-by-one (construct tree)
– Output in sorted order using in-order traversal

• Same algorithm using B-tree uses $O(N \log_B N)$ I/Os
– A factor of $O(B \frac{\log (M/B)}{\log B})$ non-optimal

• We would like to have dynamic data structure to use in $O(\frac{N}{B}\log_{M/B}\frac{N}{B})$ algorithms ⇒ $O(\frac{1}{B}\log_{M/B}\frac{N}{B})$ I/O operations
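The size of the non-optimality factor follows from dividing the B-tree sorting cost by the sorting bound. A short derivation, assuming $N \ge B^2$ so that $\log N \le 2\log\frac{N}{B}$ (this assumption is added here for the last step and is not stated on the slide):

```latex
\frac{N \log_B N}{\frac{N}{B}\log_{M/B}\frac{N}{B}}
  = B \cdot \frac{\log N}{\log B} \cdot \frac{\log (M/B)}{\log (N/B)}
  = \Theta\!\left(B \, \frac{\log (M/B)}{\log B}\right)
```

The factor B comes from paying one I/O per element instead of one per block; the remaining ratio of logarithms comes from the smaller fan-out B compared with M/B.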

Page 26: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

Buffer-tree Technique

• Main idea: Logically group nodes together and add buffers
– Insertions done in a “lazy” way – elements inserted in buffers
– When a buffer runs full elements are pushed one level down
– Buffer-emptying in O(M/B) I/Os
⇒ every block touched constant number of times on each level
⇒ inserting N elements (N/B blocks) costs $O(\frac{N}{B}\log_{M/B}\frac{N}{B})$ I/Os

[Figure: tree with fan-out M/B and height $O(\log_{M/B}\frac{N}{B})$; each node has a buffer of M elements; leaves hold B elements]

Page 27: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

Basic Buffer-tree

• Definition:
– B-tree with branching parameter $\frac{M}{B}$ and leaf parameter B
– Size M buffer in each internal node

• Updates:
– Add time-stamp to insert/delete element
– Collect B elements in memory before inserting in root buffer
– Perform buffer-emptying when buffer runs full

[Figure: internal node with a buffer of $m$ blocks (M elements) and fan-out between $\frac{1}{4}\frac{M}{B}$ and $\frac{M}{B}$; leaves of B elements]

Page 28: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

Basic Buffer-tree

• Note:
– Buffer can be larger than M during recursive buffer-emptying
* Elements distributed in sorted order ⇒ at most M elements in buffer unsorted
– Rebalancing needed when “leaf-node” buffer emptied
* Leaf-node buffer-emptying only performed after all full internal node buffers are emptied

[Figure: internal node with a buffer of $m$ blocks (M elements) and fan-out between $\frac{1}{4}\frac{M}{B}$ and $\frac{M}{B}$; leaves of B elements]

Page 29: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

Basic Buffer-tree

• Internal node buffer-empty:
– Load first M (unsorted) elements into memory and sort them
– Merge elements in memory with rest of (already sorted) elements
– Scan through sorted list while
* Removing “matching” insert/deletes
* Distributing elements to child buffers
– Recursively empty full child buffers

• Emptying buffer of size X takes O(X/B+M/B)=O(X/B) I/Os

[Figure: internal node with a buffer of $m$ blocks (M elements) and fan-out between $\frac{1}{4}\frac{M}{B}$ and $\frac{M}{B}$]
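The non-recursive part of an internal buffer-emptying can be sketched as follows. The record format `(key, timestamp, kind)`, the function name `empty_buffer`, the `routers` list of child separators, and the simplified cancellation rule (an insert directly followed by a later delete of the same key is dropped) are all assumptions for illustration; recursion into full child buffers and I/O accounting are left out.

```python
import bisect
import heapq

def empty_buffer(buffer, M, routers):
    """Distribute the operations in `buffer` to child buffers.

    buffer  : list of (key, timestamp, kind) with kind in {"insert", "delete"};
              the first M entries may be unsorted, the rest are already sorted.
    routers : sorted separator keys; child i receives keys below routers[i].
    """
    head = sorted(buffer[:M])        # load first M elements and sort them in memory
    tail = buffer[M:]                # already sorted by (key, timestamp)
    kept = []
    for op in heapq.merge(head, tail):          # merge with the sorted rest
        key, _, kind = op
        # Remove "matching" insert/delete pairs on the same key.
        if kind == "delete" and kept and kept[-1][0] == key and kept[-1][2] == "insert":
            kept.pop()
        else:
            kept.append(op)
    children = [[] for _ in range(len(routers) + 1)]
    for op in kept:                             # one scan distributes in sorted order
        children[bisect.bisect_right(routers, op[0])].append(op)
    return children

if __name__ == "__main__":
    buf = [(5, 3, "insert"), (1, 1, "insert"), (5, 4, "delete"), (9, 2, "insert")]
    print(empty_buffer(buf, M=4, routers=[4]))
```

Because the output is produced by a single merge and scan, each child buffer receives its elements already sorted, which is why at most the first M elements of any buffer are unsorted.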

Page 30: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

Basic Buffer-tree

• Buffer-empty of leaf node with K elements in leaves
– Sort buffer as previously
– Merge buffer elements with elements in leaves
– Remove “matching” insert/deletes obtaining K’ elements
– If K’<K then
* Add K-K’ “dummy” elements and insert in “dummy” leaves
Otherwise
* Place K elements in leaves
* Repeatedly insert block of elements in leaves and rebalance

• Delete dummy leaves and rebalance when all full buffers emptied

[Figure: leaf node with K elements in its leaves]

Page 31: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

Basic Buffer-tree

• Invariant:
Buffers of nodes on path from root to emptied leaf-node are empty

⇒
• Insert rebalancing (splits) performed as in normal B-tree

• Delete rebalancing: v’ buffer emptied before fuse of v
– Necessary buffer emptyings performed before next dummy-block delete
– Invariant maintained

[Figure: split of node v into v’ and v’’, and fuse of v with sibling v’]

Page 32: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

Basic Buffer-tree

• Analysis:
– Not counting rebalancing, a buffer-emptying of node with X ≥ M elements (full) takes O(X/B) I/Os
⇒ total full node emptying cost $O(\frac{N}{B}\log_{M/B}\frac{N}{B})$ I/Os
– Delete rebalancing buffer-emptying (non-full) takes O(M/B) I/Os
⇒ cost of one split/fuse O(M/B) I/Os
– During N updates
* O(N/B) leaf split/fuse
* $O(\frac{N/B}{M/B}\log_{M/B}\frac{N}{B})$ internal node split/fuse

⇒ Total cost of N operations: $O(\frac{N}{B}\log_{M/B}\frac{N}{B})$ I/Os

Page 33: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

Basic Buffer-tree

• Emptying all buffers after N insertions:
Perform buffer-emptying on all nodes in BFS-order
⇒ resulting full-buffer emptyings cost $O(\frac{N}{B}\log_{M/B}\frac{N}{B})$ I/Os
⇒ empty $O(\frac{N/B}{M/B})$ non-full buffers using O(M/B) I/Os each ⇒ O(N/B) I/Os

• N elements can be sorted using buffer tree in $O(\frac{N}{B}\log_{M/B}\frac{N}{B})$ I/Os

[Figure: internal node with a buffer of $m$ blocks (M elements) and fan-out between $\frac{1}{4}\frac{M}{B}$ and $\frac{M}{B}$; leaves of B elements]

Page 34: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

Buffer-tree Summary

• Batching of operations on B-tree using M-sized buffers
– $O(\frac{1}{B}\log_{M/B}\frac{N}{B})$ I/O updates amortized
– All buffers emptied in $O(\frac{N}{B}\log_{M/B}\frac{N}{B})$ I/Os

• One-dim. rangesearch operations can also be supported in $O(\frac{1}{B}\log_{M/B}\frac{N}{B} + \frac{T}{B})$ I/Os amortized
– Search elements handled lazily like updates
– All elements in relevant sub-trees reported during buffer-emptying
– Buffer-emptying in O(X/B+T’/B), where T’ is the number of reported elements

[Figure: node with a buffer of $m$ blocks]

Page 35: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

Orthogonal Line Segment Intersection

• Sweep plane top-down while maintaining search tree T on vertical segments crossing sweep line (by x-coordinates)
– Top endpoint of vertical segment: Insert in T
– Bottom endpoint of vertical segment: Delete from T
– Horizontal segment: Perform range query with x-interval on T

⇒ $O(\frac{N}{B}\log_{M/B}\frac{N}{B} + \frac{T}{B})$ I/Os using buffer-tree
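The sweep itself is easy to state in code. The sketch below runs in internal memory and replaces the buffer-tree by a sorted Python list with `bisect`; the tuple formats for segments, the function name `orthogonal_intersections`, and the tie-breaking rule (touching segments count as intersecting) are assumptions made for illustration.

```python
import bisect

def orthogonal_intersections(verticals, horizontals):
    """verticals: list of (x, y_top, y_bottom); horizontals: list of (x1, x2, y).
    Returns sorted list of (vertical index, horizontal index) pairs that intersect."""
    INSERT, QUERY, DELETE = 0, 1, 2
    events = []
    for i, (x, y_top, y_bot) in enumerate(verticals):
        events.append((-y_top, INSERT, i))
        events.append((-y_bot, DELETE, i))
    for j, (_, _, y) in enumerate(horizontals):
        events.append((-y, QUERY, j))
    events.sort()                       # top-down; at equal y: insert, query, delete

    active = []                         # (x, index) of vertical segments crossing the sweep line
    result = []
    for _, kind, idx in events:
        if kind == INSERT:
            bisect.insort(active, (verticals[idx][0], idx))
        elif kind == DELETE:
            active.pop(bisect.bisect_left(active, (verticals[idx][0], idx)))
        else:
            x1, x2, _ = horizontals[idx]
            lo = bisect.bisect_left(active, (x1, -1))
            hi = bisect.bisect_right(active, (x2, float("inf")))
            result.extend((v, idx) for _, v in active[lo:hi])   # range query on x-interval
    return sorted(result)

if __name__ == "__main__":
    vs = [(1, 5, 0), (3, 2, 1), (6, 4, 3)]
    hs = [(0, 4, 1.5), (0, 7, 3.5), (2, 5, 10)]
    print(orthogonal_intersections(vs, hs))
```

In the external version the sequence of inserts, deletes and range queries is known in advance (it is batched), which is why the lazy buffer-tree can replace the search tree.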

Page 36: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

Buffered Priority Queue

• Basic buffer tree can be used in external priority queue
• To delete minimal element:
– Empty all buffers on leftmost path
– Delete elements in leftmost leaf and keep in memory
– Deletion of next M minimal elements free
– Inserted elements checked against minimal elements in memory

• $O(\frac{M}{B}\log_{M/B}\frac{N}{B})$ I/Os every O(M) deletes ⇒ $O(\frac{1}{B}\log_{M/B}\frac{N}{B})$ amortized

[Figure: buffer tree with the leftmost path emptied; labels $\frac{1}{4}M$, $\frac{M}{B}$, and B]

Page 37: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

List Ranking

• Problem:
– Given N-vertex linked list stored in array
– Compute rank (number in list) of each vertex

• One of the simplest graph problems one can think of

• Straightforward O(N) internal algorithm
– Also uses O(N) I/Os in external memory

• Much harder to get $O(\frac{N}{B}\log_{M/B}\frac{N}{B})$ external algorithm

[Figure: example 10-vertex linked list stored in an array, with the rank of each vertex]

Page 38: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

List Ranking

• We will solve more general problem:
– Given N-vertex linked list with edge-weights stored in array
– Compute sum of weights (rank) from start for each vertex

• List ranking: All edge weights one

• Note: Weight stored in array entry together with edge (next vertex)

[Figure: example 10-vertex linked list with all edge weights equal to 1]

Page 39: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

List Ranking

• Algorithm:

1. Find and mark independent set of vertices

2. “Bridge-out” independent set: Add new edges

3. Recursively rank resulting list

4. “Bridge-in” independent set: Compute rank of independent set

• Step 1, 2 and 4 in $O(\frac{N}{B}\log_{M/B}\frac{N}{B})$ I/Os
• Independent set of size αN for 0 < α ≤ 1

⇒ $T(N) = T((1-\alpha)N) + O(\frac{N}{B}\log_{M/B}\frac{N}{B}) = O(\frac{N}{B}\log_{M/B}\frac{N}{B})$ I/Os

[Figure: example list after bridging out an independent set; bridged edges have weight 2]
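The four steps can be followed in a small in-memory sketch. It uses dictionaries instead of sorted arrays and picks the independent set by coin flips, so it illustrates the recursion rather than the I/O-efficient implementation; the function name `list_rank`, the dictionary representation, and the convention that the head has rank 0 are assumptions for illustration.

```python
import random

def list_rank(succ, weight, head, rng=None):
    """succ[v]: successor of v (None for the tail); weight[v]: weight of edge v -> succ[v].
    Returns rank[v] = sum of edge weights from the head to v."""
    rng = rng or random.Random(0)
    if len(succ) <= 2:                       # base case: traverse directly
        ranks, v, r = {}, head, 0
        while v is not None:
            ranks[v] = r
            r += weight[v]
            v = succ[v]
        return ranks
    pred = {s: v for v, s in succ.items() if s is not None}
    # 1. Find an independent set (never containing the head) by coin flips.
    indep = set()
    while not indep:
        coin = {v: rng.random() < 0.5 for v in succ}
        indep = {v for v in succ if v != head and coin[v]
                 and (succ[v] is None or not coin[succ[v]])}
    # 2. Bridge out: the predecessor of each removed vertex jumps over it.
    new_succ = {v: s for v, s in succ.items() if v not in indep}
    new_weight = {v: weight[v] for v in new_succ}
    for v in indep:
        p = pred[v]
        new_succ[p] = succ[v]
        new_weight[p] = weight[p] + weight[v]
    # 3. Recursively rank the shorter list.
    ranks = list_rank(new_succ, new_weight, head, rng)
    # 4. Bridge in: rank of a removed vertex follows from its predecessor.
    for v in indep:
        ranks[v] = ranks[pred[v]] + weight[pred[v]]
    return ranks

if __name__ == "__main__":
    order = [3, 1, 4, 2, 5, 9, 6, 8, 7, 10]
    succ = {v: (order[i + 1] if i + 1 < len(order) else None) for i, v in enumerate(order)}
    print(list_rank(succ, {v: 1 for v in order}, order[0]))
```

Since no two adjacent vertices are removed in the same round, every bridged edge replaces exactly two original edges, and the recursion works on a list with the weights summed accordingly.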

Page 40: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

List Ranking: Bridge-out/in

• Obtain information (edge or rank) of successor
– Make copy of original list
– Sort original list by successor id
– Scan original and copy together to obtain successor information
– Sort modified original list by id

⇒ $O(\frac{N}{B}\log_{M/B}\frac{N}{B})$ I/Os

[Figure: example linked list stored in an array, together with its copy]
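The sort-and-scan pattern for fetching successor information can be sketched as follows. The record format `(id, succ_id, info)` and the function name `attach_successor_info` are assumptions for illustration; in external memory each `sorted` call would be an external sort and the loop a simultaneous scan of two files.

```python
def attach_successor_info(records):
    """records: list of (id, succ_id, info), succ_id is None for the tail.
    Returns list of (id, succ_id, info, succ_info) sorted by id."""
    copy = sorted(records, key=lambda r: r[0])               # copy of the list, sorted by id
    original = sorted((r for r in records if r[1] is not None),
                      key=lambda r: r[1])                    # original sorted by successor id
    out = [(vid, s, info, None) for vid, s, info in records if s is None]
    j = 0
    for vid, s, info in original:        # scan original and copy together
        while copy[j][0] != s:           # copy pointer only moves forward
            j += 1
        out.append((vid, s, info, copy[j][2]))
    out.sort(key=lambda r: r[0])         # sort modified list back by id
    return out

if __name__ == "__main__":
    print(attach_successor_info([(1, 3, "a"), (2, None, "b"), (3, 2, "c")]))
```

The same pattern works in the other direction (sorting by predecessor id), which is what bridge-in needs to obtain the rank of a predecessor.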

Page 41: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

List Ranking: Independent Set

• Easy to design randomized $O(\frac{N}{B}\log_{M/B}\frac{N}{B})$ algorithm:
– Scan list and flip a coin for each vertex
– Independent set is vertices with heads and successor with tails
⇒ Independent set of expected size N/4

• Deterministic $O(\frac{N}{B}\log_{M/B}\frac{N}{B})$ algorithm:
– 3-color vertices (no vertex same color as predecessor/successor)
– Independent set is vertices with most popular color
⇒ Independent set of size at least N/3

• 3-coloring ⇒ $O(\frac{N}{B}\log_{M/B}\frac{N}{B})$ I/O algorithm

[Figure: example linked list stored in an array]

Page 42: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

List Ranking: 3-coloring

• Algorithm:
– Consider forward and backward lists (heads/tails in two lists)
– Color forward lists (except tail) alternately red and blue
– Color backward lists (except tail) alternately green and blue

⇒ 3-coloring

[Figure: example linked list stored in an array, split into forward and backward lists]

Page 43: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

List Ranking: Forward List Coloring

• Identify heads and tails
• For each head, insert red element in priority-queue (priority=position)
• Repeatedly:
– Extract minimal element from queue
– Access and color corresponding element in list
– Insert opposite color element corresponding to successor in queue

• Scan of list
• O(N) priority-queue operations

⇒ $O(\frac{N}{B}\log_{M/B}\frac{N}{B})$ I/Os

[Figure: example linked list stored in an array]
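The coloring of forward and backward lists can be sketched with an in-memory heap standing in for the external priority queue. The array representation (`succ[i]` is the array position of the successor of the vertex at position i), the function name `three_color`, and the rule used for the very last vertex of the whole list are assumptions for illustration.

```python
import heapq

def three_color(succ):
    """succ[i]: array position of the successor of position i (None for the last vertex).
    Returns a list of colors with no vertex colored like its predecessor/successor."""
    n = len(succ)
    pred = [None] * n
    for v, s in enumerate(succ):
        if s is not None:
            pred[s] = v
    color = [None] * n

    def sweep(forward, first, second):
        def leaves(v):       # does v have an outgoing edge of the kind handled here?
            return succ[v] is not None and ((succ[v] > v) == forward)
        sign = 1 if forward else -1      # backward lists are processed right to left
        # Heads: outgoing edge of this kind, incoming edge (if any) of the other kind.
        pq = [(sign * v, v, first) for v in range(n)
              if leaves(v) and (pred[v] is None or not leaves(pred[v]))]
        heapq.heapify(pq)
        while pq:
            _, v, c = heapq.heappop(pq)          # extract minimal element
            color[v] = c                         # color corresponding list element
            s = succ[v]
            if leaves(s):                        # successor is not the tail: send color forward
                heapq.heappush(pq, (sign * s, s, second if c == first else first))

    sweep(True, "red", "blue")       # forward lists (except tails): red/blue
    sweep(False, "green", "blue")    # backward lists (except tails): green/blue
    last = succ.index(None)          # last vertex of the whole list is still uncolored
    p = pred[last]
    color[last] = "green" if (p is not None and last > p) else "red"
    return color

if __name__ == "__main__":
    order = [2, 0, 4, 7, 5, 1, 9, 3, 8, 6]       # positions in list order
    succ = [None] * 10
    for a, b in zip(order, order[1:]):
        succ[a] = b
    print(three_color(succ))
```

Elements leave the queue in increasing position order in the forward sweep, so the accesses to the list form a single scan; this is the "send information forward in time" idea.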

Page 44: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

List Ranking Summary

• Simplest graph problem: Traverse linked list

• Very easy O(N) algorithm in internal memory
• Much more difficult $O(\frac{N}{B}\log_{M/B}\frac{N}{B})$ external memory
– Finding independent set via 3-coloring
– Bridging vertices in/out

• Permuting bound $O(\min\{N, \frac{N}{B}\log_{M/B}\frac{N}{B}\})$ best possible
– Also true for other graph problems

[Figure: example 10-vertex linked list stored in an array, with the rank of each vertex]

Page 45: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

List Ranking Summary

• External list ranking algorithm similar to PRAM algorithm
– Sometimes external algorithms by “PRAM algorithm simulation”

• Forward list coloring algorithm example of “time forward processing”
– Use external priority-queue to send information “forward in time” to vertices to be processed later

• PRAM algorithm ideas and time forward processing used in many graph algorithms

Page 46: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

References

• External Memory Geometric Data Structures (section 1-5). Lecture notes by L. Arge.

• I/O-Efficient Graph Algorithms (section 1-2). Lecture notes by N. Zeh.

• Cache-Oblivious Algorithms. M. Frigo, C. Leiserson, H. Prokop, and S. Ramachandran. Proc. FOCS '99.

Page 47: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

Summary

• We discussed
– Sorting
– Hardness of permuting (sorting)
– Batched geometric problems
– Searching: B-trees, buffer-trees, priority-queue
– Graph algorithms: List-ranking
– Cache-oblivious algorithms

• Important techniques
– Multiway merging/distribution
– Distribution sweeping
– Multiway-trees, buffered data structures
– Time-forward processing (and PRAM simulation)

Page 48: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

Thanks


www.madalgo.au.dk

