i/o efficient algorithms and data structures 3-4 (lecture by lars arge)

I/O-efficient Algorithms and Data Structures Lars Arge Professor and center director Aarhus University ALMADA summer school August 5, 2013


DESCRIPTION

In many modern applications that deal with massive data sets, communication between internal and external memory, and not actual computation time, is the bottleneck in the computation. This is due to the huge difference in access time of fast internal memory and slower external memory such as disks. In order to amortize this time over a large amount of data, disks typically read or write large blocks of contiguous data at once. This means that it is important to design algorithms with a high degree of locality in their disk access pattern, that is, algorithms where data accessed close in time is also stored close on disk. Such algorithms take advantage of block transfers by amortizing the large access time over a large number of accesses. In the area of I/O-efficient algorithms the main goal is to develop algorithms that minimize the number of block transfers (I/Os) used to solve a given problem. In these four lectures, we will cover fundamental I/O-efficient algorithms and data structures, such as sorting algorithms, search trees and priority queues. We will also discuss geometric problems, as well as geometric data structures for various range searching problems.

TRANSCRIPT

Page 1: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

I/O-efficient Algorithms and Data Structures

Lars Arge

Professor and center director

Aarhus University

ALMADA summer school, August 5, 2013

Page 2: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

I/O-Model

• Parameters

N = # elements in problem instance

B = # elements that fit in a disk block

M = # elements that fit in main memory

T = # output elements in a searching problem

• We often assume that $M > B^2$

• I/O: Movement of a block between memory and disk

[Figure: processor P and main memory M connected to disk D by block I/O]

Page 3: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

Fundamental Bounds

Problem       Internal       External
• Scanning:   $N$            $\frac{N}{B}$
• Sorting:    $N \log N$     $\frac{N}{B}\log_{M/B}\frac{N}{B}$
• Permuting:  $N$            $\min\{N, \frac{N}{B}\log_{M/B}\frac{N}{B}\}$
• Searching:  $\log_2 N$     $\log_B N$

• Note:
– Linear I/O: O(N/B)
– Permuting not linear
– Permuting and sorting bounds are equal in all practical cases
– B factor VERY important: $\frac{N}{B} < \frac{N}{B}\log_{M/B}\frac{N}{B} \ll N$
– Cannot sort optimally with search tree
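To get a feel for how far apart these bounds are, the following is a small sketch that simply evaluates the formulas of this slide, ignoring constant factors. The function name `io_bounds` and the sample values of N, M and B are assumptions chosen for illustration; they do not come from the lecture.

```python
import math


def io_bounds(N, M, B):
    """Evaluate the external-memory bounds (constants ignored) for given N, M, B."""
    n_blocks = N / B                                  # N/B: linear I/O
    sort = n_blocks * math.log(n_blocks, M / B)       # (N/B) log_{M/B} (N/B)
    return {
        "scan": n_blocks,
        "sort": sort,
        "permute": min(N, sort),                      # min{N, sort bound}
        "search": math.log(N, B),                     # log_B N
        "btree_sort": N * math.log(N, B),             # N insertions into a B-tree
    }


if __name__ == "__main__":
    # Hypothetical machine: 2^30 elements, memory for 2^20, blocks of 2^10.
    for name, value in io_bounds(2**30, 2**20, 2**10).items():
        print(f"{name:>10}: {value:,.0f}")
```

For these sample parameters the sorting bound is only about twice the scanning bound, while sorting by repeated B-tree insertion is more than a factor B worse.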

Page 4: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

External Sort

• Merge sort:
– Create N/M memory-sized sorted runs
– Merge runs together M/B at a time
⇒ $O(\log_{M/B}\frac{N}{M})$ phases using $O(\frac{N}{B})$ I/Os each

• Distribution sort similar (but harder: must find split elements)
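The two merge sort steps can be simulated in memory. The sketch below is only an illustration under simplifying assumptions: Python lists stand in for disk-resident runs, `heapq.merge` stands in for the M/B-way merge, and the function name `external_merge_sort` is hypothetical. It counts merge phases rather than real I/Os.

```python
import heapq


def external_merge_sort(data, M, B):
    """Simulate external merge sort; return (sorted list, number of merge phases)."""
    # Run formation: sort memory-sized chunks of M elements.
    runs = [sorted(data[i:i + M]) for i in range(0, len(data), M)]
    fan_in = max(2, M // B)          # merge M/B runs at a time
    phases = 0
    while len(runs) > 1:
        # One phase reads and writes every element once: O(N/B) I/Os.
        runs = [list(heapq.merge(*runs[i:i + fan_in]))
                for i in range(0, len(runs), fan_in)]
        phases += 1
    return (runs[0] if runs else []), phases


if __name__ == "__main__":
    data = [(i * 7919) % 1000 for i in range(1000)]
    out, phases = external_merge_sort(data, M=16, B=4)
    print(out == sorted(data), phases)   # 63 runs, merged 4 at a time
```

With M = 16 and B = 4 the 1000 elements form 63 runs, which shrink to 16, 4 and 1 runs over three phases, in line with the $\log_{M/B}\frac{N}{M}$ phase count.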

Page 5: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

Lower bounds

• Permuting N elements according to a given permutation takes $\Omega(\min\{N, \frac{N}{B}\log_{M/B}\frac{N}{B}\})$ I/Os in “indivisibility” model

• Sorting N elements takes $\Omega(\frac{N}{B}\log_{M/B}\frac{N}{B})$ I/Os in comparison model

– Proved using counting arguments: after $t$ I/Os, $\binom{M}{B}^{t}\,(B!)^{N/B} \ge N!$

Page 6: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

Cache-Oblivious Model

• N, B, and M as in I/O-model
– Assumed that $M > B^2$ (tall cache assumption)

• M and B not used in algorithm description
• Block transfers by optimal paging strategy

⇒ Analyze in two-level model
⇒ Efficient on one level, efficient on all levels (of fully associative hierarchy using LRU)

Page 7: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

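Cache-Oblivious Distribution Sort

• Break into subarrays of size $\sqrt{N}$
– Recursively sort each subarray
• Distribute into buckets of size ~ $\sqrt{N}$
– Recursively sort each bucket

• Distribution performed in $O(\frac{N}{B})$ I/Os

⇒ $T(N) = 2\sqrt{N}\,T(\sqrt{N}) + O(\frac{N}{B})$ if $N > M$, and $T(N) = O(\frac{N}{B})$ if $N \le M$
⇒ $T(N) = O(\frac{N}{B}\log_{M/B}\frac{N}{B})$

[Figure: array of N elements split into $\sqrt{N}$ subarrays and distributed into $\sqrt{N}$ buckets]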

Page 8: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

Sorting summary

• External merge or distribution sort takes $O(\frac{N}{B}\log_{M/B}\frac{N}{B})$ I/Os
– Merge sort based on M/B-way merging
– Distribution sort based on $\sqrt{M/B}$-way distribution and (complicated) split elements finding
– Also cache-oblivious using $\sqrt{N}$-way distribution

• Optimal in comparison model

• $\Omega(\min\{N, \frac{N}{B}\log_{M/B}\frac{N}{B}\})$ lower bound in stronger indivisibility model
– Holds even for permuting

Page 9: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

Distribution sweeping

• Combination of distribution and plane sweep
– Used to solve batched geometric problems

• General idea:
– Divide plane into M/B slabs with O(N/(M/B)) endpoints each
– Sweep plane top-down while finding solutions involving objects in different slabs
– Distribute data to M/B slabs and recurse in each slab

• Examples:
– Orthogonal line segment intersection
– Rectangle intersection
– Homework

Page 10: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

Outline

• Thursday:
– Sorting upper bound (merge- and distribution-sort)
– Sorting (permuting) lower bound
– Cache-oblivious sorting
– Batched geometric problems: Distribution sweeping

• Today:
– Searching (B-trees)
– Batched searching (Buffer-trees)
– I/O-efficient priority queues
– I/O-efficient list ranking

Page 11: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

External Search Trees

• Binary search tree:
– Standard method for search among N elements
– We assume elements in leaves
– Search traces at least one root-leaf path

– If nodes stored arbitrarily on disk
⇒ Search in $O(\log_2 N)$ I/Os
⇒ Rangesearch in $O(\log_2 N + T)$ I/Os

[Figure: binary search tree of height $\Theta(\log_2 N)$]

Page 12: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

External Search Trees

• BFS blocking:
– Block height $O(\log_2 N) / O(\log_2 B) = O(\log_B N)$
– Output elements blocked
⇒ Rangesearch in $\Theta(\log_B N + \frac{T}{B})$ I/Os

• Optimal: O(N/B) space and $\Theta(\log_B N + \frac{T}{B})$ query

[Figure: binary tree grouped into blocks of height $\Theta(\log_2 B)$, giving fan-out $\Theta(B)$]

Page 13: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

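Cache-Oblivious Search Tree

• “Van Emde Boas” recursive layout:

[Figure: tree of height $\log N$ split at the middle level into a top subtree $A$ and bottom subtrees $B_1, B_2, \dots, B_{\sqrt{N}}$, each of height $\frac{1}{2}\log N$]

Recursive memory layout: $A\ B_1\ B_2\ \dots\ B_{\sqrt{N}}$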

Page 14: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

Cache-Oblivious Search Tree

• Search analysis:
– Conceptually stop recursion when recursive subtrees have size between $\sqrt{B}$ and $B$ and each recursive subtree fits in a block
– Such subtrees have height $\Theta(\log B)$
– Search path “hits” $O(\log N / \log B) = O(\log_B N)$ blocks
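The recursive layout can be written down in a few lines. The sketch below is an illustration only: it assumes a complete binary tree whose nodes are numbered in BFS order (root 1, children of v are 2v and 2v+1), and it gives the top subtree the rounded-down half of the height when the height is odd. The function name `veb_layout` is hypothetical.

```python
def veb_layout(root, height):
    """Return the BFS indices of a complete binary subtree in van Emde Boas order."""
    if height == 1:
        return [root]
    top = height // 2              # height of the top subtree A
    bottom = height - top          # height of each bottom subtree B_i
    order = veb_layout(root, top)  # lay out A first
    first = root << top            # leftmost node 'top' levels below root
    for r in range(first, first + (1 << top)):
        order += veb_layout(r, bottom)   # then B_1, B_2, ... in order
    return order


if __name__ == "__main__":
    print(veb_layout(1, 4))
    # [1, 2, 3, 4, 8, 9, 5, 10, 11, 6, 12, 13, 7, 14, 15]
```

In the height-4 example every height-2 subtree occupies three consecutive positions, so with blocks of three or more elements a root-leaf path touches few blocks, and neither B nor M appears in the code.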

Page 15: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

Dynamic External Search Trees?

• Maintaining BFS blocking during updates?
– Balance normally maintained in search trees using rotations

• Seems very difficult to maintain BFS blocking during rotation
– Also need to make sure output (leaves) is blocked!

[Figure: rotation exchanging nodes x and y]

Page 16: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

B-trees

• BFS-blocking naturally corresponds to tree with fan-out $\Theta(B)$

• B-trees balanced by allowing node degree to vary
– Rebalancing performed by splitting and merging nodes

Page 17: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

(a,b)-tree

• T is an (a,b)-tree (a≥2 and b≥2a-1)
– All leaves on the same level and contain between a and b elements
– Except for the root, all nodes have degree between a and b
– Root has degree between 2 and b

• (a,b)-tree uses linear space and has height $O(\log_a N)$
⇒ Choosing a,b = $\Theta(B)$, each node/leaf stored in one disk block
⇒ O(N/B) space and $\Theta(\log_B N + \frac{T}{B})$ query

[Figure: example of a (2,4)-tree]

Page 18: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

(a,b)-Tree Insert

• Insert:
Search and insert element in leaf v
WHILE v has b+1 elements/children DO
Split v:
make nodes v’ and v’’ with $\lceil\frac{b+1}{2}\rceil \le b$ and $\lfloor\frac{b+1}{2}\rfloor \ge a$ elements
insert element (ref) in parent(v)
(make new root if necessary)
v=parent(v)

• Insert touches $O(\log_a N)$ nodes

[Figure: node v with b+1 elements split into v’ with $\lceil\frac{b+1}{2}\rceil$ and v’’ with $\lfloor\frac{b+1}{2}\rfloor$ elements]
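The insert procedure can be sketched in memory as follows. This is a minimal illustration, not the lecture's implementation: the class names `Node` and `ABTree`, the routing convention (child i holds keys no larger than separator i) and the use of Python lists in place of disk blocks are all assumptions.

```python
import bisect


class Node:
    def __init__(self, keys, children=None):
        self.keys = keys            # leaf: elements; internal node: separators
        self.children = children    # None for a leaf


class ABTree:
    def __init__(self, a=2, b=4):
        assert a >= 2 and b >= 2 * a - 1
        self.a, self.b = a, b
        self.root = Node([])

    def _size(self, v):
        return len(v.keys) if v.children is None else len(v.children)

    def insert(self, x):
        path, v = [], self.root
        while v.children is not None:            # search down to a leaf
            i = bisect.bisect_left(v.keys, x)
            path.append((v, i))
            v = v.children[i]
        bisect.insort(v.keys, x)
        while self._size(v) > self.b:            # v has b+1 elements/children
            h = (self.b + 2) // 2                # ceil((b+1)/2) stay in v
            sep = v.keys[h - 1]
            if v.children is None:               # split a leaf
                right = Node(v.keys[h:])
                v.keys = v.keys[:h]
            else:                                # split an internal node
                right = Node(v.keys[h:], v.children[h:])
                v.keys, v.children = v.keys[:h - 1], v.children[:h]
            if not path:                         # make new root if necessary
                self.root = Node([sep], [v, right])
                break
            parent, i = path.pop()               # insert reference in parent(v)
            parent.keys.insert(i, sep)
            parent.children.insert(i + 1, right)
            v = parent

    def items(self, v=None):
        v = v or self.root
        if v.children is None:
            return list(v.keys)
        return [x for c in v.children for x in self.items(c)]


if __name__ == "__main__":
    t = ABTree(2, 4)
    for x in [5, 1, 9, 3, 7, 2, 8, 6, 4]:
        t.insert(x)
    print(t.items())
```

Only the nodes on one root-leaf path are touched, which is where the $O(\log_a N)$ bound comes from.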

Page 19: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

(2,4)-Tree Insert

Page 20: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

(a,b)-Tree Delete

• Delete:
Search and delete element from leaf v
WHILE v has a-1 elements/children DO
Fuse v with sibling v’:
move children of v’ to v
delete element (ref) from parent(v)
(delete root if necessary)
If v has >b (and ≤ a+b-1<2b) children split v
v=parent(v)

• Delete touches $O(\log_a N)$ nodes

[Figure: node v with a-1 children fused with a sibling into a node with at least 2a-1 children]

Page 21: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

(2,4)-Tree Delete

Page 22: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

(a,b)-Tree

• (a,b)-tree properties:
– If b=2a-1 every update can cause many rebalancing operations
– If b≥2a update only causes O(1) rebalancing operations amortized
– If b>2a only $O(\frac{1}{b/2 - a}) = O(\frac{1}{a})$ rebalancing operations amortized
* Both somewhat hard to show
– If b=4a easy to show that update causes $O(\frac{1}{a}\log_a N)$ rebalance operations amortized
* After split during insert a leaf contains 4a/2=2a elements
* After fuse during delete a leaf contains between 2a and 5a elements (split if more than 3a ⇒ between 3/2a and 5/2a)

[Figure: (2,3)-tree in which an insert followed by a delete triggers rebalancing]

Page 23: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

B-tree Summary

• B-trees: (a,b)-trees with a,b = $\Theta(B)$
– O(N/B) space
– $O(\log_B N + \frac{T}{B})$ query
– $O(\log_B N)$ update

• B-trees with elements in the leaves sometimes called B+-tree

• Construction in $O(\frac{N}{B}\log_{M/B}\frac{N}{B})$ I/Os
– Sort elements and construct leaves
– Build tree level-by-level bottom-up

Page 24: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

Generalized B-tree

• B-tree with branching parameter b and leaf parameter k (b,k≥8)
– All leaves on same level and contain between 1/4k and k elements
– Except for the root, all nodes have degree between 1/4b and b
– Root has degree between 2 and b

• B-tree with leaf parameter $k = \Omega(B)$
– O(N/B) space
– Height $O(\log_b \frac{N}{B})$
– $O(\frac{1}{k})$ amortized leaf rebalance operations
– $O(\frac{1}{bk}\log_b \frac{N}{B})$ amortized internal node rebalance operations

Page 25: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

B-tree Sorting

• Normally we can sort N elements in O(N log N) time with search tree
– Insert all elements one-by-one (construct tree)
– Output in sorted order using in-order traversal

• Same algorithm using B-tree uses $O(N\log_B N)$ I/Os
– A factor of $O(B\,\frac{\log (M/B)}{\log B})$ non-optimal

• We would like to have dynamic data structure to use in $O(\frac{N}{B}\log_{M/B}\frac{N}{B})$ algorithms
⇒ $O(\frac{1}{B}\log_{M/B}\frac{N}{B})$ I/O operations

Page 26: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

Buffer-tree Technique

• Main idea: Logically group nodes together and add buffers
– Insertions done in a “lazy” way: elements inserted in buffers
– When a buffer runs full elements are pushed one level down
– Buffer-emptying in O(M/B) I/Os
⇒ every block touched constant number of times on each level
⇒ inserting N elements (N/B blocks) costs $O(\frac{N}{B}\log_{M/B}\frac{N}{B})$ I/Os

[Figure: tree with fan-out M/B and height $O(\log_{M/B}\frac{N}{B})$; each node has a buffer of M elements, each leaf holds B elements]

Page 27: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

Basic Buffer-tree

• Definition:
– B-tree with branching parameter $\frac{M}{B}$ and leaf parameter B
– Size M buffer in each internal node

• Updates:
– Add time-stamp to insert/delete element
– Collect B elements in memory before inserting in root buffer
– Perform buffer-emptying when buffer runs full

[Figure: node with a buffer of $m$ blocks ($M$ elements) and between $\frac{1}{4}\frac{M}{B}$ and $\frac{M}{B}$ children; leaves hold $B$ elements]

Page 28: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

Basic Buffer-tree

• Note:
– Buffer can be larger than M during recursive buffer-emptying
* Elements distributed in sorted order ⇒ at most M elements in buffer unsorted
– Rebalancing needed when “leaf-node” buffer emptied
* Leaf-node buffer-emptying only performed after all full internal node buffers are emptied

Page 29: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

Basic Buffer-tree

• Internal node buffer-empty:
– Load first M (unsorted) elements into memory and sort them
– Merge elements in memory with rest of (already sorted) elements
– Scan through sorted list while
* Removing “matching” insert/deletes
* Distributing elements to child buffers
– Recursively empty full child buffers

• Emptying buffer of size X takes O(X/B+M/B)=O(X/B) I/Os
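The scan step of an internal-node buffer-emptying can be sketched as below. Several things here are assumptions made for illustration: buffer entries are `(key, timestamp, op)` tuples, an insert immediately followed (in key and time order) by a delete of the same key counts as a “matching” pair, child i receives keys no larger than routing key i, and a single in-memory sort stands in for the load-sort-merge steps. The function name `empty_buffer` is hypothetical.

```python
import bisect


def empty_buffer(buffer, routing_keys):
    """Distribute buffered updates to child buffers, cancelling matching pairs.

    buffer:       list of (key, timestamp, op) with op in {"ins", "del"}
    routing_keys: sorted separators; len(routing_keys) + 1 children
    """
    # Stand-in for "sort the first M elements and merge with the sorted rest".
    ordered = sorted(buffer, key=lambda e: (e[0], e[1]))
    children = [[] for _ in range(len(routing_keys) + 1)]
    i = 0
    while i < len(ordered):
        key, _, op = ordered[i]
        nxt = ordered[i + 1] if i + 1 < len(ordered) else None
        if op == "ins" and nxt is not None and nxt[0] == key and nxt[2] == "del":
            i += 2                      # insert cancelled by a later delete
            continue
        child = bisect.bisect_left(routing_keys, key)
        children[child].append(ordered[i])
        i += 1
    return children


if __name__ == "__main__":
    buf = [(5, 1, "ins"), (12, 2, "ins"), (5, 3, "del"), (20, 4, "del"), (7, 5, "ins")]
    print(empty_buffer(buf, [10, 15]))
    # [[(7, 5, 'ins')], [(12, 2, 'ins')], [(20, 4, 'del')]]
```

Because the elements leave the scan in sorted order, each child buffer receives a sorted batch, which is why at most the first M elements of a buffer are ever unsorted.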

Page 30: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

Basic Buffer-tree

• Buffer-empty of leaf node with K elements in leaves
– Sort buffer as previously
– Merge buffer elements with elements in leaves
– Remove “matching” insert/deletes obtaining K’ elements
– If K’<K then
* Add K-K’ “dummy” elements and insert in “dummy” leaves
Otherwise
* Place K elements in leaves
* Repeatedly insert block of elements in leaves and rebalance

• Delete dummy leaves and rebalance when all full buffers emptied

Page 31: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

Basic Buffer-tree

• Invariant:
Buffers of nodes on path from root to emptied leaf-node are empty

• Insert rebalancing (splits) performed as in normal B-tree

• Delete rebalancing: v’ buffer emptied before fuse of v
– Necessary buffer emptyings performed before next dummy-block delete
– Invariant maintained

[Figure: split of v into v’ and v’’, and fuse of v with sibling v’]

Page 32: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

Basic Buffer-tree

• Analysis:
– Not counting rebalancing, a buffer-emptying of node with X ≥ M elements (full) takes O(X/B) I/Os
⇒ total full node emptying cost $O(\frac{N}{B}\log_{M/B}\frac{N}{B})$ I/Os
– Delete rebalancing buffer-emptying (non-full) takes O(M/B) I/Os
⇒ cost of one split/fuse O(M/B) I/Os
– During N updates
* O(N/B) leaf split/fuse
* $O(\frac{N/B}{M/B}\log_{M/B}\frac{N}{B})$ internal node split/fuse
⇒ Total cost of N operations: $O(\frac{N}{B}\log_{M/B}\frac{N}{B})$ I/Os

Page 33: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

Basic Buffer-tree

• Emptying all buffers after N insertions:
Perform buffer-emptying on all nodes in BFS-order
⇒ resulting full-buffer emptyings cost $O(\frac{N}{B}\log_{M/B}\frac{N}{B})$ I/Os
⇒ empty $O(\frac{N/B}{M/B})$ non-full buffers using O(M/B) I/Os each ⇒ O(N/B) I/Os

• N elements can be sorted using buffer tree in $O(\frac{N}{B}\log_{M/B}\frac{N}{B})$ I/Os

Page 34: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

Buffer-tree Summary

• Batching of operations on B-tree using M-sized buffers
– $O(\frac{1}{B}\log_{M/B}\frac{N}{B})$ I/O updates amortized
– All buffers emptied in $O(\frac{N}{B}\log_{M/B}\frac{N}{B})$ I/Os

• One-dim. rangesearch operations can also be supported in $O(\frac{1}{B}\log_{M/B}\frac{N}{B} + \frac{T}{B})$ I/Os amortized
– Search elements handled lazily like updates
– All elements in relevant sub-trees reported during buffer-emptying
– Buffer-emptying in O(X/B+T’/B), where T’ is the number of reported elements

Page 35: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

Orthogonal Line Segment Intersection

• Sweep plane top-down while maintaining search tree T on vertical segments crossing sweep line (by x-coordinates)
– Top endpoint of vertical segment: Insert in T
– Bottom endpoint of vertical segment: Delete from T
– Horizontal segment: Perform range query with x-interval on T

⇒ $O(\frac{N}{B}\log_{M/B}\frac{N}{B} + \frac{T}{B})$ I/Os using buffer-tree
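The sweep itself is easy to state in code. The sketch below is an internal-memory illustration only: a sorted Python list with `bisect` stands in for the search tree T (a buffer-tree in the external setting), segments are given as plain tuples, and touching endpoints are counted as intersections. The function name and the tuple formats are assumptions.

```python
import bisect


def orthogonal_intersections(vertical, horizontal):
    """Report (horizontal index, vertical index) pairs of intersecting segments.

    vertical:   list of (x, y_top, y_bottom)
    horizontal: list of (y, x_left, x_right)
    """
    INSERT, QUERY, DELETE = 0, 1, 2      # tie order at equal y
    events = []
    for i, (_, y_top, y_bottom) in enumerate(vertical):
        events.append((-y_top, INSERT, i))
        events.append((-y_bottom, DELETE, i))
    for j, (y, _, _) in enumerate(horizontal):
        events.append((-y, QUERY, j))
    events.sort()                        # top-down: decreasing y

    active = []                          # sorted (x, index) of crossing verticals
    result = []
    for _, kind, idx in events:
        if kind == INSERT:
            bisect.insort(active, (vertical[idx][0], idx))
        elif kind == DELETE:
            active.pop(bisect.bisect_left(active, (vertical[idx][0], idx)))
        else:
            _, x_left, x_right = horizontal[idx]
            lo = bisect.bisect_left(active, (x_left, -1))
            hi = bisect.bisect_right(active, (x_right, len(vertical)))
            result.extend((idx, i) for _, i in active[lo:hi])
    return result


if __name__ == "__main__":
    vertical = [(1, 5, 0), (3, 2, 1), (6, 4, 3)]
    horizontal = [(2, 0, 4), (4, 5, 7)]
    print(sorted(orthogonal_intersections(vertical, horizontal)))
    # [(0, 0), (0, 1), (1, 2)]
```

In the external version the inserts, deletes and range queries are simply issued to the buffer-tree in sweep order, since the answers are only needed in batch.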

Page 36: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

Buffered Priority Queue

• Basic buffer tree can be used in external priority queue
• To delete minimal element:
– Empty all buffers on leftmost path
– Delete elements in leftmost leaf and keep in memory
– Deletion of next M minimal elements free
– Inserted elements checked against minimal elements in memory

• $O(\frac{M}{B}\log_{M/B}\frac{N}{B})$ I/Os every O(M) delete ⇒ $O(\frac{1}{B}\log_{M/B}\frac{N}{B})$ amortized

Page 37: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

List Ranking

• Problem:
– Given N-vertex linked list stored in array
– Compute rank (number in list) of each vertex

• One of the simplest graph problems one can think of

• Straightforward O(N) internal algorithm
– Also uses O(N) I/Os in external memory

• Much harder to get $O(\frac{N}{B}\log_{M/B}\frac{N}{B})$ external algorithm

[Figure: 10-vertex linked list stored in an array, with the rank of each vertex]

Page 38: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

List Ranking

• We will solve more general problem:
– Given N-vertex linked list with edge-weights stored in array
– Compute sum of weights (rank) from start for each vertex

• List ranking: All edge weights one

• Note: Weight stored in array entry together with edge (next vertex)

[Figure: 10-vertex linked list with all edge weights equal to 1]

Page 39: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

List Ranking

• Algorithm:
1. Find and mark independent set of vertices
2. “Bridge-out” independent set: Add new edges
3. Recursively rank resulting list
4. “Bridge-in” independent set: Compute rank of independent set

• Step 1, 2 and 4 in $O(\frac{N}{B}\log_{M/B}\frac{N}{B})$ I/Os
• Independent set of size αN for 0 < α ≤ 1
⇒ $T(N) = T((1-\alpha)N) + O(\frac{N}{B}\log_{M/B}\frac{N}{B}) = O(\frac{N}{B}\log_{M/B}\frac{N}{B})$ I/Os

[Figure: list with independent-set vertices bridged out; bridging edges have weight 2]
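The four steps can be traced with an in-memory sketch. It is an illustration only: dictionaries replace the sort-and-scan passes of the external algorithm, the independent set is found by coin flips rather than 3-coloring, the head and the last vertex are never bridged out (a simplification), and short lists are ranked directly. The function name `list_rank` and the dictionary representation are assumptions.

```python
import random


def list_rank(succ, weight, head, rng=None):
    """Return {vertex: sum of edge weights from head to vertex}.

    succ[v]   : successor of v, or None for the last vertex
    weight[v] : weight of the edge (v, succ[v])
    """
    rng = rng or random.Random(0)
    if len(succ) <= 3:                       # base case: walk the list
        rank, v, r = {}, head, 0
        while v is not None:
            rank[v] = r
            r += weight[v]
            v = succ[v]
        return rank
    pred = {s: v for v, s in succ.items() if s is not None}
    # Step 1: independent set = heads whose successor flipped tails.
    indep = set()
    while not indep:
        coin = {v: rng.random() < 0.5 for v in succ}
        indep = {v for v in succ
                 if v != head and succ[v] is not None
                 and coin[v] and not coin[succ[v]]}
    # Step 2: bridge out, adding an edge pred(v) -> succ(v) of combined weight.
    succ2 = {v: s for v, s in succ.items() if v not in indep}
    weight2 = {v: w for v, w in weight.items() if v not in indep}
    for v in indep:
        succ2[pred[v]] = succ[v]
        weight2[pred[v]] = weight[pred[v]] + weight[v]
    # Step 3: recursively rank the shorter list.
    rank = list_rank(succ2, weight2, head, rng)
    # Step 4: bridge in, using the original edge weight into v.
    for v in indep:
        rank[v] = rank[pred[v]] + weight[pred[v]]
    return rank


if __name__ == "__main__":
    order = [2, 0, 3, 1, 4, 7, 5, 6]
    succ = {a: b for a, b in zip(order, order[1:] + [None])}
    print(list_rank(succ, {v: 1 for v in order}, head=order[0]))
```

With all edge weights equal to one the result is the position of each vertex in the list, which is the plain list ranking problem.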

Page 40: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

List Ranking: Bridge-out/in

• Obtain information (edge or rank) of successor
– Make copy of original list
– Sort original list by successor id
– Scan original and copy together to obtain successor information
– Sort modified original list by id

⇒ $O(\frac{N}{B}\log_{M/B}\frac{N}{B})$ I/Os

Page 41: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

List Ranking: Independent Set

• Easy to design randomized algorithm:
– Scan list and flip a coin for each vertex
– Independent set is vertices with heads and successor with tails
⇒ Independent set of expected size N/4

• Deterministic algorithm:
– 3-color vertices (no vertex same color as predecessor/successor)
– Independent set is vertices with most popular color
⇒ Independent set of size at least N/3

• 3-coloring ⇒ $O(\frac{N}{B}\log_{M/B}\frac{N}{B})$ I/O algorithm

Page 42: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

List Ranking: 3-coloring

• Algorithm:
– Consider forward and backward lists (heads/tails in two lists)
– Color forward lists (except tail) alternately red and blue
– Color backward lists (except tail) alternately green and blue
⇒ 3-coloring

Page 43: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

List Ranking: Forward List Coloring

• Identify heads and tails
• For each head, insert red element in priority-queue (priority=position)
• Repeatedly:
– Extract minimal element from queue
– Access and color corresponding element in list
– Insert opposite color element corresponding to successor in queue

• Scan of list
• O(N) priority-queue operations
⇒ $O(\frac{N}{B}\log_{M/B}\frac{N}{B})$ I/Os
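The priority-queue pattern can be sketched in memory as follows. This is an illustration under stated assumptions: the list is an array `succ` of successor positions, `heapq` stands in for the external priority queue, backward lists are handled by the same routine with negated priorities, and the very last vertex of the whole list (a tail that starts no further list) is given any color different from its predecessor. The function name `three_color` is hypothetical.

```python
import heapq


def three_color(succ):
    """3-color a linked list stored in an array; succ[v] is a position or None."""
    n = len(succ)
    pred = [None] * n
    for v, s in enumerate(succ):
        if s is not None:
            pred[s] = v
    color = [None] * n

    def forward(v):                      # edge leaving v points to a later position
        return succ[v] is not None and succ[v] > v

    def backward(v):                     # edge leaving v points to an earlier position
        return succ[v] is not None and succ[v] < v

    def color_runs(in_run, first, second, sign):
        # Heads: vertices whose outgoing edge starts a run in this direction.
        heap = [(sign * v, first) for v in range(n)
                if in_run(v) and (pred[v] is None or not in_run(pred[v]))]
        heapq.heapify(heap)
        while heap:
            key, c = heapq.heappop(heap)         # extract minimal element
            v = sign * key
            color[v] = c                         # color corresponding vertex
            s = succ[v]
            if in_run(s):                        # successor is not the tail
                heapq.heappush(heap, (sign * s, second if c == first else first))

    color_runs(forward, "red", "blue", +1)       # forward lists
    color_runs(backward, "green", "blue", -1)    # backward lists, mirrored
    last = succ.index(None)
    p = pred[last]
    color[last] = "red" if p is None or color[p] != "red" else "green"
    return color


if __name__ == "__main__":
    print(three_color([1, 2, 3, None]))
    # ['red', 'blue', 'red', 'green']
```

Each queue element is a message sent to a vertex that will be processed later in the scan, so the vertices are visited in position order and the list itself is only scanned.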

Page 44: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

List Ranking Summary

• Simplest graph problem: Traverse linked list

• Very easy O(N) algorithm in internal memory
• Much more difficult $O(\frac{N}{B}\log_{M/B}\frac{N}{B})$ external memory
– Finding independent set via 3-coloring
– Bridging vertices in/out

• Permuting bound $O(\min\{N, \frac{N}{B}\log_{M/B}\frac{N}{B}\})$ best possible
– Also true for other graph problems

Page 45: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

List Ranking Summary

• External list ranking algorithm similar to PRAM algorithm
– Sometimes external algorithms by “PRAM algorithm simulation”

• Forward list coloring algorithm example of “time forward processing”
– Use external priority-queue to send information “forward in time” to vertices to be processed later

• PRAM algorithm ideas and time forward processing used in many graph algorithms

Page 46: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

References

• External Memory Geometric Data Structures (sections 1-5). Lecture notes by L. Arge.
• I/O-Efficient Graph Algorithms (sections 1-2). Lecture notes by N. Zeh.
• Cache-Oblivious Algorithms. M. Frigo, C. Leiserson, H. Prokop, and S. Ramachandran. Proc. FOCS '99.

Page 47: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

Summary

• We discussed
– Sorting
– Hardness of permuting (sorting)
– Batched geometric problems
– Searching: B-trees, buffer-trees, priority-queue
– Graph algorithms: List-ranking
– Cache-oblivious algorithms

• Important techniques
– Multiway merging/distribution
– Distribution sweeping
– Multiway-trees, buffered data structures
– Time-forward processing (and PRAM simulation)

Page 48: I/O Efficient Algorithms and Data Structures 3-4 (Lecture by Lars Arge)

Thanks

[email protected]

www.madalgo.au.dk