
Design and Analysis of Algorithms
Lecture 4
March 13, 2013
洪國寶
Source: ailab.cs.nchu.edu.tw/course/Algorithms/101-2/AL04.pdf (slides dated 2013/3/6)

Homework #2

1. Given a set S of n integers and another integer x, determine whether or not there exist two elements in S whose sum is exactly x.
a. By trying all pairs of integers in S, describe an $O(n^2)$-time algorithm to solve this problem. You need to outline your idea, give an example, prove correctness, and analyze time complexity.
b. Assume S is sorted. Show how to solve this problem in O(n) time.
2. 4.5-4 (p. 97) Can the master method be applied to the recurrence $T(n) = 4T(n/2) + n^2 \lg n$? Why or why not? Give an asymptotic upper bound for this recurrence.
3. 4.1-5 (p. 75) Use the following ideas to develop a nonrecursive, linear-time algorithm for the maximum-subarray problem. Start at the left end of the array, and progress toward the right, keeping track of the maximum subarray seen so far. Knowing a maximum subarray of A[1..j], extend the answer to find a maximum subarray ending at index j+1 by using the following observation: a maximum subarray of A[1..j+1] is either a maximum subarray of A[1..j] or a subarray A[i..j+1], for some 1 ≤ i ≤ j+1. Determine a maximum subarray of the form A[i..j+1] in constant time based on knowing a maximum subarray ending at index j.
4. 10-1 (p. 217) / 10-1 (p. 249)

Due March 27, 2013


Outline

• Review
• Hash table
• Binary search trees
• Red-black trees
• Augmenting data structures

Review: Recurrence Relations

• Describe functions in terms of their values on smaller inputs
• Arise from Divide and Conquer

T(n) = Θ(1) if n ≤ c
T(n) = a T(n/b) + D(n) + C(n) otherwise

• Solution Methods (Chapter 4)
– Substitution Method
– Iteration Method
– Master Method


Review: Substitution Method

• Guess the form of the solution, then use mathematical induction to show that it works
• Works well when the solution is easy to guess
• No general way to guess the correct solution

Review: Iteration Method

• Expand (iterate) the recurrence and express it as a summation of terms dependent only on n and the initial conditions
• The key is to focus on 2 parameters
– the number of times the recurrence needs to be iterated to reach the boundary condition
– the sum of terms arising from each level of the iteration process


Review: Master Method

• Provides a “cookbook” method for solving recurrences of the form
T(n) = a T(n/b) + f(n)
• Assumptions:
– a ≥ 1 and b > 1 are constants
– f(n) is an asymptotically positive function
– T(n) is defined for nonnegative integers
– We interpret n/b to mean either ⌊n/b⌋ or ⌈n/b⌉

Review: The Master Theorem

• With the recurrence T(n) = a T(n/b) + f(n) as in the previous slide, T(n) can be bounded asymptotically as follows:
1. If $f(n) = O(n^{\log_b a - \epsilon})$ for some constant $\epsilon > 0$, then $T(n) = \Theta(n^{\log_b a})$.
2. If $f(n) = \Theta(n^{\log_b a})$, then $T(n) = \Theta(n^{\log_b a} \lg n)$.
3. If $f(n) = \Omega(n^{\log_b a + \epsilon})$ for some constant $\epsilon > 0$, and if $a f(n/b) \le c f(n)$ for some constant $c < 1$ and all sufficiently large n, then $T(n) = \Theta(f(n))$.


Review: Simplified Master Theorem

Let $a \ge 1$ and $b > 1$ be constants and let T(n) be the recurrence
$T(n) = a T(n/b) + c n^k$
defined for $n \ge 0$.
1. If $a > b^k$, then $T(n) = \Theta(n^{\log_b a})$.
2. If $a = b^k$, then $T(n) = \Theta(n^k \lg n)$.
3. If $a < b^k$, then $T(n) = \Theta(n^k)$.
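The three cases of the simplified theorem can be applied mechanically. The sketch below is only an illustration: the function name `simplified_master` and its string output format are made up for this note, and it assumes the recurrence really has the form T(n) = a T(n/b) + c n^k.

```python
import math

def simplified_master(a, b, k):
    """Classify T(n) = a*T(n/b) + c*n^k by comparing a with b^k."""
    if a < 1 or b <= 1:
        raise ValueError("need a >= 1 and b > 1")
    bk = b ** k
    if math.isclose(a, bk):          # case 2: a = b^k
        return f"Theta(n^{k:g} lg n)"
    if a > bk:                       # case 1: a > b^k
        return f"Theta(n^{math.log(a, b):g})"
    return f"Theta(n^{k:g})"         # case 3: a < b^k

print(simplified_master(2, 2, 1))  # merge-sort style recurrence
print(simplified_master(4, 2, 1))  # four subproblems of half size, linear extra work
```

The first call falls under case 2 and the second under case 1.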

Review: Dynamic Sets

• Sets that grow, shrink, or otherwise change with time are called dynamic sets.
• Each element in the set is represented by a data object.
• Usually one of the fields of an object is called a key, and plays a central role in the manipulation of the set data.
• Operations on a dynamic set generally fall into two categories:
1. queries that return information about the set,
2. modifying operations that change the set.


Review: Dynamic Sets

• Dynamic sets support queries such as:
– Search(S, k), Minimum(S), Maximum(S), Successor(S, x), Predecessor(S, x)
• Dynamic sets support modifying operations like:
– Insert(S, x), Delete(S, x)

Review: Elementary Data Structures

• Stacks: push (insert), pop (delete)
• Queues: enqueue (insert), dequeue (delete)
• Linked lists: insert, delete, search
• Rooted trees


Outline

• Review
• Hash table
• Binary search trees
• Red-black trees
• Augmenting data structures

Hash Tables

• Motivation: symbol tables
– A compiler uses a symbol table to relate symbols to associated data
• Symbols: variable names, procedure names, etc.
• Associated data: memory location, call graph, etc.
– For a symbol table (also called a dictionary), we care about search, insertion, and deletion
– We typically don’t care about sorted order


Hash Tables

• More formally:
– Given a table T and a record x, with key (= symbol) and satellite data, we need to support:
• Insert(T, x)
• Delete(T, x)
• Search(T, x)
– We want these to be fast, but don’t care about sorting the records
• The structure we will use is a hash table
– Supports all the above in O(1) expected time!

Hashing: Keys

• In the following discussions we will consider all keys to be (possibly large) natural numbers


Direct Addressing

• Suppose:
– The range of keys is 0..m-1
– Keys are distinct
• The idea:
– Set up an array T[0..m-1] in which
• T[i] = x if x ∈ T and key[x] = i
• T[i] = NULL otherwise
– This is called a direct-address table
• Operations take O(1) time!
• So what’s the problem?
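A direct-address table is little more than an array indexed by key. The following is a minimal sketch under the slide's assumptions (distinct integer keys in 0..m-1); the class and method names are hypothetical.

```python
class DirectAddressTable:
    """Array T[0..m-1] where slot i holds the record whose key is i."""

    def __init__(self, m):
        # Initializing every slot costs Theta(m) time and space.
        self.slots = [None] * m

    def insert(self, key, value):
        self.slots[key] = value      # O(1)

    def search(self, key):
        return self.slots[key]       # O(1); None plays the role of NULL

    def delete(self, key):
        self.slots[key] = None       # O(1)

t = DirectAddressTable(8)
t.insert(3, "x")
print(t.search(3), t.search(4))
```

Every operation is a single array access, but the constructor already hints at the cost: the table is as large as the key range, not the number of stored keys.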

Problems With Direct Addressing

• Direct addressing works well when the range m of keys is relatively small
• But what if the keys are 32-bit integers?
– Problem 1: direct-address table will have $2^{32}$ entries, more than 4 billion
– Problem 2: even if memory is not an issue, the time to initialize the elements to NULL may be
• Solution: map keys to smaller range 0..m-1
• This mapping is called a hash function


Hash Functions

• Next problem: collision

[Figure: a hash function h maps the actual keys K = {k1, ..., k5}, a subset of the universe of keys U, into slots 0..m-1 of table T. Keys k2 and k5 collide: h(k2) = h(k5).]

Resolving Collisions

• How can we solve the problem of collisions?
• Solution 1: chaining
• Solution 2: open addressing


Open Addressing

• Basic idea (details in Section 11.4):
– To insert: if slot is full, try another slot, …, until an open slot is found (probing)
– To search, follow same sequence of probes as would be used when inserting the element
• If reach element with correct key, return it
• If reach a NULL pointer, element is not in table
• Good for fixed sets (adding but no deletion, why?)
– Example: spell checking
• Table needn’t be much bigger than n
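The probing idea can be sketched as follows. The slide does not fix a probe sequence, so this sketch assumes linear probing (try the next slot, wrapping around) with a division-method start slot; the class name and methods are hypothetical.

```python
class OpenAddressTable:
    """Open addressing with linear probing (an assumed probe sequence)."""

    def __init__(self, m):
        self.m = m
        self.slots = [None] * m

    def _probe(self, key):
        start = key % self.m
        for i in range(self.m):
            yield (start + i) % self.m   # slots start, start+1, ... (mod m)

    def insert(self, key):
        for j in self._probe(key):
            if self.slots[j] is None or self.slots[j] == key:
                self.slots[j] = key
                return j
        raise OverflowError("table is full")

    def search(self, key):
        for j in self._probe(key):
            if self.slots[j] is None:    # reached an empty slot: not present
                return None
            if self.slots[j] == key:
                return j
        return None

t = OpenAddressTable(7)
print(t.insert(10), t.insert(17), t.insert(24))  # all three start at slot 3
print(t.search(17), t.search(3))
```

The sketch also suggests an answer to the "why?" above: search stops at the first empty slot, so simply emptying a slot on deletion could cut a probe sequence short and hide keys stored beyond it.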

Chaining

• Chaining puts elements that hash to the same slot in a linked list:

[Figure: hash table T with chaining. Each slot of T points to a linked list of the keys from K (actual keys, a subset of the universe of keys U) that hash to it, for example k1 and k4 in one list; empty slots hold NULL.]


Chaining

• How do we insert an element?

[Figure: the chained hash table T from the previous slide.]

Chaining

• How do we delete an element?
– Do we need a doubly-linked list for efficient delete?

[Figure: the chained hash table T from the previous slide.]


Chaining

• How do we search for an element with a given key?

[Figure: the chained hash table T from the previous slide.]

Collision resolution by chaining

• The dictionary operations on a hash table T are easy to implement when collisions are resolved by chaining.
– CHAINED-HASH-INSERT(T, x)
insert x at the head of list T[h(key[x])]
– CHAINED-HASH-SEARCH(T, k)
search for an element with key k in list T[h(k)]
– CHAINED-HASH-DELETE(T, x)
delete x from the list T[h(key[x])]
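The three chained operations might look like this in Python. This is a sketch, not the textbook procedures: it assumes singly linked chains, the hash h(k) = k mod m, and a delete that takes a key rather than an element pointer; all names are hypothetical.

```python
class Node:
    def __init__(self, key, value, next=None):
        self.key, self.value, self.next = key, value, next

class ChainedHashTable:
    def __init__(self, m=8):
        self.m = m
        self.table = [None] * m          # each slot is the head of a chain

    def _h(self, key):
        return key % self.m              # assumption: division-method hash

    def insert(self, key, value):
        j = self._h(key)
        self.table[j] = Node(key, value, self.table[j])   # O(1) insert at head

    def search(self, key):
        node = self.table[self._h(key)]
        while node is not None and node.key != key:
            node = node.next
        return node                      # None if the key is absent

    def delete(self, key):
        j = self._h(key)
        prev, node = None, self.table[j]
        while node is not None and node.key != key:
            prev, node = node, node.next
        if node is None:
            return False
        if prev is None:
            self.table[j] = node.next    # removed the head of the chain
        else:
            prev.next = node.next        # splice node out
        return True

t = ChainedHashTable(4)
t.insert(1, "a")
t.insert(5, "b")                         # 1 and 5 collide in slot 1
print(t.search(5).value, t.delete(1), t.search(1))
```

Because the chains are singly linked, delete has to walk the chain to find the predecessor, so it costs as much as a search.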


Collision resolution by chaining

• The worst-case running time for insertion is O(1).
• For searching, the worst-case running time is proportional to the length of the list; we shall analyze this more closely below.
• Deletion of an element x can be accomplished in O(1) time if the lists are doubly linked.
– If the lists are singly linked, we must first find x in the list T[h(key[x])], so that the next link of x's predecessor can be properly set to splice x out; in this case, deletion and searching have essentially the same running time.

Analysis of Chaining

• Assume simple uniform hashing: each key in table is equally likely to be hashed to any slot
• Given n keys and m slots in the table: the load factor α = n/m = average # keys per slot


Analysis of Chaining

• What will be the average cost of an unsuccessful search for a key? A: O(1 + α) (intuitively, the expected length of the linked list)
• What will be the average cost of a successful search? A: O(1 + α/2) = O(1 + α) (intuitively, half the expected length of the linked list)
• Q: What is the difference between an unsuccessful search and a successful search?

Analysis of Chaining

• Theorem 11.1
In a hash table in which collisions are resolved by chaining, an unsuccessful search takes time O(1 + α), on the average, under the assumption of simple uniform hashing.
• Proof: (Use blackboard)


Analysis of Chaining

• Theorem 11.2
In a hash table in which collisions are resolved by chaining, a successful search takes time O(1 + α/2), on the average, under the assumption of simple uniform hashing.
• Proof: (Use blackboard)

Analysis of Chaining Continued

• So the cost of searching = O(1 + α)
• If the number of hash-table slots is at least proportional to the number of elements in the table, we have n = O(m).
– α = n/m = O(m)/m = O(1)
– In other words, we can make the expected cost of searching constant if we make α constant


Choosing A Hash Function

• Clearly choosing the hash function well is crucial
– What will a worst-case hash function do?
– What will be the time to search in this case?
• What are desirable features of the hash function?
– Should distribute keys uniformly into slots
– Should not depend on patterns in the data

Hash Functions: The Division Method

• h(k) = k mod m
– In words: hash k into a table with m slots using the slot given by the remainder of k divided by m
• What happens to elements with adjacent values of k?
• What happens if m is a power of 2 (say $2^p$)?
• What if m is a power of 10?
• It is better to make the hash function depend on all the bits (or digits) of the key
– Pick table size m = prime number not too close to a power of 2 (or 10)
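A small experiment illustrates the power-of-2 question. The keys below are made-up values whose low four bits are all zero; with m = 16 the hash looks only at those four bits, while a prime m mixes in the higher bits.

```python
def h_div(k, m):
    """Division-method hash: the remainder of k divided by m."""
    return k % m

# Hypothetical keys that differ only in their high bits (16, 80, 240).
keys = [0b0001_0000, 0b0101_0000, 0b1111_0000]

print([h_div(k, 16) for k in keys])  # m = 2^4: only the low 4 bits matter
print([h_div(k, 13) for k in keys])  # m = 13, a prime
```

With m = 16 all three keys land in slot 0; with m = 13 they land in three different slots.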


Hash Functions: The Multiplication Method

• For a constant A, 0 < A < 1:
• $h(k) = \lfloor m (kA - \lfloor kA \rfloor) \rfloor$
– What does the term $kA - \lfloor kA \rfloor$ represent? The fractional part of kA
• Choose $m = 2^p$
• Choose A not too close to 0 or 1
• Knuth: Good choice for $A = (\sqrt{5} - 1)/2$
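The multiplication method translates almost directly into code. This sketch uses floating-point arithmetic for readability, which is an assumption of the example (production implementations typically use fixed-width integer arithmetic); the name `h_mul` is hypothetical.

```python
import math

A = (math.sqrt(5) - 1) / 2      # Knuth's suggested constant, about 0.618

def h_mul(k, m, a=A):
    """Multiplication-method hash: floor(m * fractional part of k*a)."""
    frac = (k * a) % 1.0        # fractional part of k*a
    return math.floor(m * frac)

print(h_mul(123456, 2 ** 14))   # table with m = 2^14 slots
```

Unlike the division method, the value of m is not critical here, which is why a power of 2 is a convenient choice.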


Hash Functions: Worst Case Scenario

• It is always possible to analyze a hash function and pick a sequence of “worst-case” keys that all hash to the same slot, yielding an average retrieval time of Θ(n).
• What can we do?

Hash Functions: Universal Hashing

• To foil such malicious adversaries: randomize the algorithm
• Universal hashing: pick a hash function randomly in a way that is independent of the keys that are actually going to be stored
– Guarantees good performance on average, no matter what keys adversary chooses
– Details in 11.3.3
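One common way to realize this idea is the family h(k) = ((a·k + b) mod p) mod m with a and b drawn at random. The slide itself only points to Section 11.3.3, so the concrete family, the prime `P`, and the function name below are assumptions of this sketch; it also assumes all keys are smaller than `P`.

```python
import random

P = 2_147_483_647   # the prime 2^31 - 1; keys are assumed to be below P

def make_universal_hash(m, rng=random):
    """Pick one hash function at random from the family ((a*k + b) mod P) mod m."""
    a = rng.randrange(1, P)     # a in 1..P-1
    b = rng.randrange(0, P)     # b in 0..P-1
    def h(k):
        return ((a * k + b) % P) % m
    return h

h = make_universal_hash(10)
print([h(k) for k in (1, 2, 3)])   # values depend on the random draw
```

The random choice happens once, when the table is created; after that the same function must be used for every insert and search.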


Bloom filter

• A Bloom filter is a bit string.
• n hash functions map each data item to n bits in the Bloom filter.
• Whenever you have a set or list, and space is an issue, a Bloom filter may be a useful alternative.
• http://en.wikipedia.org/wiki/Bloom_filter
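A toy Bloom filter fits in a few lines. This is a sketch under several assumptions: the class and parameter names are hypothetical, the independent hash functions are simulated by salting SHA-256 from Python's `hashlib`, and one byte is stored per bit for clarity rather than compactness.

```python
import hashlib

class BloomFilter:
    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)      # one byte per bit, for clarity

    def _positions(self, item):
        # Simulate several hash functions by prefixing a different salt.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # False means definitely absent; True means possibly present.
        return all(self.bits[pos] for pos in self._positions(item))

bf = BloomFilter()
bf.add("apple")
print(bf.might_contain("apple"))
```

The filter never reports a stored item as absent, but it can report an item that was never added as present (a false positive), which is the price paid for the space savings.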

Outline

• Review
• Hash table
• Binary search trees
• Red-black trees
• Augmenting data structures


Binary Search Trees

• Binary Search Trees (BSTs) are binary trees with the binary search tree property:
key[leftSubtree(x)] ≤ key[x] ≤ key[rightSubtree(x)]
• In addition to satellite data, elements in a BST have:
– key: an identifying field inducing a total ordering
– left: pointer to a left child (may be NULL)
– right: pointer to a right child (may be NULL)
– p: pointer to a parent node (NULL for root)

Binary Search Trees

• Examples:

[Figure: example binary search trees; images not preserved in this transcript.]


Inorder Tree Walk

• What does the following code do?
TreeWalk(x)
    if x ≠ Nil
        then TreeWalk(left[x]);
             print(x);
             TreeWalk(right[x]);
• A: prints elements in sorted (increasing) order
• This is called an inorder tree walk
– Preorder tree walk: print root, then left, then right
– Postorder tree walk: print left, then right, then root
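The TreeWalk pseudocode carries over to Python nearly line for line. In this sketch the `Node` class and the choice to collect keys in a list (instead of printing them) are assumptions made so the result is easy to inspect; the example tree uses the letters F, B, H, A, D, K.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def inorder(x, out=None):
    """Visit the left subtree, then x, then the right subtree."""
    if out is None:
        out = []
    if x is not None:
        inorder(x.left, out)
        out.append(x.key)
        inorder(x.right, out)
    return out

root = Node("F",
            Node("B", Node("A"), Node("D")),
            Node("H", None, Node("K")))
print(inorder(root))
```

Each node is visited exactly once, so the walk takes time linear in the number of nodes.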

Inorder Tree Walk

• Example:

[Figure: example BST with root F, second level B and H, third level A, D, and K.]

• How long will a tree walk take? What is the recurrence T(n)?
• Prove that inorder walk prints in monotonically increasing order.


Operations on BSTs: Search

• Given a key k and a pointer to a node x, returns an element with that key or NULL:
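The search pseudocode from the slide did not survive extraction. A minimal sketch of the standard recursive and iterative versions is given here instead; the `Node` class and the function names are assumptions of this note.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def tree_search(x, k):
    """Recursive search: follow one root-to-leaf path."""
    if x is None or k == x.key:
        return x
    if k < x.key:
        return tree_search(x.left, k)
    return tree_search(x.right, k)

def iterative_tree_search(x, k):
    """Same path, written as a loop."""
    while x is not None and k != x.key:
        x = x.left if k < x.key else x.right
    return x

root = Node("F",
            Node("B", Node("A"), Node("D")),
            Node("H", None, Node("K")))
print(tree_search(root, "D").key, iterative_tree_search(root, "C"))
```

Both versions return the node holding k, or None when the path falls off the tree.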

Operations on BSTs: Search

• Example:

[Figure: search example; image not preserved in this transcript.]


Operations on BSTs: Search

• How long will a search operation take?
• T(n) = ?

Operations on BSTs: Insert

• Adds an element z to the tree so that the binary search tree property continues to hold
• The basic algorithm
– Like the search procedure above
– Insert z in place of NULL
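Insertion can be sketched as a search that remembers the last node visited and then hangs z where the search fell off the tree. The `Node` class with a parent pointer `p` and the convention of returning the root are assumptions of this example.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.p = None

def tree_insert(root, z):
    """Insert node z below the leaf position found by searching for z.key.
    Returns the root (which is z itself if the tree was empty)."""
    y, x = None, root
    while x is not None:            # walk down as in search
        y = x
        x = x.left if z.key < x.key else x.right
    z.p = y
    if y is None:
        return z                    # tree was empty
    if z.key < y.key:
        y.left = z                  # z replaces a NULL left pointer
    else:
        y.right = z                 # z replaces a NULL right pointer
    return root

root = None
for key in "FBHADK":
    root = tree_insert(root, Node(key))
root = tree_insert(root, Node("C"))
print(root.left.right.left.key)     # C ends up as the left child of D
```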


[Textbook figure not reproduced.] Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.

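BST Insert: Example

• Example: Insert C

[Figure: the example BST (root F; B and H on the second level; A, D, and K on the third level) after inserting C, which becomes a new leaf below D.]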


BST Search/Insert: Running Time

• What is the running time of TreeSearch() or TreeInsert()?
• A: O(h), where h = height of tree
• What is the height of a binary search tree?
• A: worst case: h = O(n) when tree is just a linear string of left or right children
– We’ll keep all analysis in terms of h for now
– Later we’ll see how to maintain h = O(lg n)

Sorting again

• Sorting With Binary Search Trees (see the next slide for an example)
• Informal code for sorting array A of length n:
BSTSort(A)
    for i=1 to n
        TreeInsert(A[i]);
    InorderTreeWalk(root);
• Argue that this is Ω(n log n)
• What will be the running time in the
– Worst case?
– Average case?
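BSTSort can be made concrete as below. This is a sketch with hypothetical helper names; it uses an unbalanced BST, so an already sorted input produces a degenerate, list-like tree (and, in this recursive form, deep recursion for large inputs).

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def tree_insert(root, key):
    """Insert key into the BST rooted at root; return the root."""
    z = Node(key)
    if root is None:
        return z
    x = root
    while True:
        if key < x.key:
            if x.left is None:
                x.left = z
                return root
            x = x.left
        else:
            if x.right is None:
                x.right = z
                return root
            x = x.right

def inorder(x, out):
    if x is not None:
        inorder(x.left, out)
        out.append(x.key)
        inorder(x.right, out)
    return out

def bst_sort(a):
    root = None
    for key in a:                 # n insertions, each O(h)
        root = tree_insert(root, key)
    return inorder(root, [])      # O(n) walk

print(bst_sort([3, 1, 8, 2, 6, 7, 5]))
```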


Sorting With BSTs

for i=1 to n
    TreeInsert(A[i]);
InorderTreeWalk(root);

[Figure: BST built by inserting 3 1 8 2 6 7 5 in order: root 3 with children 1 and 8; 2 below 1; 6 below 8; 5 and 7 below 6.]

More BST Operations: Minimum, Maximum

• How can we implement Minimum and Maximum queries?
• What are their running times?
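One way to answer the first question: by the BST property the minimum is the leftmost node and the maximum is the rightmost node. The sketch below assumes a non-empty tree and a hypothetical `Node` class.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def tree_minimum(x):
    """Follow left pointers until there is none."""
    while x.left is not None:
        x = x.left
    return x

def tree_maximum(x):
    """Follow right pointers until there is none."""
    while x.right is not None:
        x = x.right
    return x

root = Node(3,
            Node(1, None, Node(2)),
            Node(8, Node(6, Node(5), Node(7))))
print(tree_minimum(root).key, tree_maximum(root).key)
```

Each function follows a single downward path, so both run in O(h) time.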


More BST Operations: Successor and Predecessor

• For deletion, we will need a Successor() operation.
• What are the general rules for finding the successor of node x?
• Two cases:
– x has a right subtree: successor is minimum node in right subtree
– x has no right subtree: successor is first ancestor of x whose left child is also ancestor of x
• Intuition: As long as you move to the left up the tree, you’re visiting smaller nodes.
• Predecessor: similar algorithm

TreeWalk(x)
    TreeWalk(left[x]);
    print(x);
    TreeWalk(right[x]);
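The two successor cases can be sketched with parent pointers. The helper names and the example tree (keys 15, 6, 18, 3, 7, 17, 20, 2, 4, 13, 9 inserted in that order) are assumptions chosen for illustration.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.p = None

def tree_insert(root, z):
    y, x = None, root
    while x is not None:
        y, x = x, (x.left if z.key < x.key else x.right)
    z.p = y
    if y is None:
        return z
    if z.key < y.key:
        y.left = z
    else:
        y.right = z
    return root

def tree_minimum(x):
    while x.left is not None:
        x = x.left
    return x

def tree_successor(x):
    if x.right is not None:                   # case 1: minimum of right subtree
        return tree_minimum(x.right)
    y = x.p                                   # case 2: climb while x is a right child
    while y is not None and x is y.right:
        x, y = y, y.p
    return y                                  # None if x holds the largest key

nodes, root = {}, None
for k in [15, 6, 18, 3, 7, 17, 20, 2, 4, 13, 9]:
    nodes[k] = Node(k)
    root = tree_insert(root, nodes[k])

print(tree_successor(nodes[3]).key,
      tree_successor(nodes[15]).key,
      tree_successor(nodes[13]).key)
```

In this tree, nodes 3 and 15 fall under the first case, while node 13 has no right subtree and must climb to the first ancestor reached from a left child.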


BST Operations: Successor

• Example: What is the successor of node 3? Node 15? Node 13?

BST Operations: Delete

• Deletion is a bit tricky
• 3 cases:
– (0). x has no children:
• Remove x
– (1). x has one child:
• Splice out x
– (2). x has two children:
• Swap x with its successor y
• Perform case (0) or (1) to delete y
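The three cases can be folded into one short routine. This sketch makes two assumptions worth flagging: "swap" is implemented by copying the successor's key into x (so outside references to the successor node become stale), and all helper names and the example tree are hypothetical.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.p = None

def tree_insert(root, z):
    y, x = None, root
    while x is not None:
        y, x = x, (x.left if z.key < x.key else x.right)
    z.p = y
    if y is None:
        return z
    if z.key < y.key:
        y.left = z
    else:
        y.right = z
    return root

def tree_minimum(x):
    while x.left is not None:
        x = x.left
    return x

def tree_delete(root, x):
    """Remove node x and return the (possibly new) root."""
    if x.left is not None and x.right is not None:
        y = tree_minimum(x.right)    # case 2: successor of x
        x.key = y.key                # move the successor's key into x
        x = y                        # now remove y, which has at most one child
    child = x.left if x.left is not None else x.right
    if child is not None:
        child.p = x.p                # case 1: splice out x
    if x.p is None:
        return child                 # x was the root
    if x is x.p.left:
        x.p.left = child
    else:
        x.p.right = child
    return root

def inorder(x):
    return [] if x is None else inorder(x.left) + [x.key] + inorder(x.right)

nodes, root = {}, None
for k in [15, 6, 18, 3, 7, 17, 20, 2, 4, 13, 9]:
    nodes[k] = Node(k)
    root = tree_insert(root, nodes[k])

root = tree_delete(root, nodes[2])    # case 0: leaf
root = tree_delete(root, nodes[13])   # case 1: one child (9)
root = tree_delete(root, nodes[6])    # case 2: two children
print(inorder(root))
```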


BST Operations: Delete

• Why will case 2 always go to case 0 or case 1?
• A: because when x has 2 children, its successor is the minimum in its right subtree, and that minimum has no left child, so it has at most one child
• Could we swap x with predecessor instead of successor?
• A: yes.


Binary search trees: conclusion

• Search trees are data structures that support many dynamic-set operations.
• Basic operations on a binary search tree take time proportional to the height of the tree.
– The height of a node in a tree is the number of edges on the longest simple downward path from the node to a leaf.
– The height of a tree is the height of its root.
• Up next: guaranteeing an O(log n) height tree


Outline

• Review
• Hash table
• Binary search trees
• Red-black trees
• Augmenting data structures

Red-Black Trees

• Red-black trees:
– Binary search trees augmented with node color
– Operations designed to guarantee that the height h = O(lg n)
• First: describe the properties of red-black trees
• Then: prove that these guarantee h = O(lg n)
• Finally: describe operations on red-black trees


Red-Black Properties

• The red-black properties:
1. Every node is either red or black
2. Every leaf (NULL pointer) is black
– Note: this means every “real” node has 2 children
3. If a node is red, both children are black
– Note: can’t have 2 consecutive reds on a path
4. Every path from node to descendent leaf contains the same number of black nodes
5. The root is always black
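The properties can be checked mechanically on a small tree. The sketch below is an illustration only: the `Node` class, the color constants, and the function names are hypothetical; NULL leaves are represented by `None` and treated as black; and it does not verify the BST ordering of the keys.

```python
RED, BLACK = "red", "black"

class Node:
    def __init__(self, key, color, left=None, right=None):
        self.key, self.color, self.left, self.right = key, color, left, right

def black_count(x):
    """Number of black real nodes on every path from x down to a NULL leaf,
    or None if property 3 or property 4 fails somewhere below x."""
    if x is None:
        return 0
    lc, rc = black_count(x.left), black_count(x.right)
    if lc is None or rc is None or lc != rc:
        return None                      # property 4 violated
    if x.color == RED:
        for child in (x.left, x.right):
            if child is not None and child.color == RED:
                return None              # property 3 violated
        return lc
    return lc + 1

def is_red_black(root):
    if root is None:
        return True
    return root.color == BLACK and black_count(root) is not None  # property 5

# Tree with root 7, children 5 and 9, and 12 as the right child of 9.
good = Node(7, BLACK, Node(5, BLACK), Node(9, BLACK, None, Node(12, RED)))
bad = Node(7, BLACK, Node(5, BLACK), Node(9, BLACK, None, Node(12, BLACK)))
print(is_red_black(good), is_red_black(bad))
```

Coloring every node black fails here because the path through 12 would then contain more black nodes than the path through 5.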


Height of Red-Black Trees

• We call the number of black nodes on any path from, but not including, a node x to a leaf the black-height of the node, denoted bh(x).
• What is the minimum black-height of a node with height h?
• A: a height-h node has black-height ≥ h/2
• Theorem: A red-black tree with n internal nodes has height $h \le 2 \lg(n + 1)$

RB Trees: Proving Height Bound

• Prove: n-node RB tree has height $h \le 2 \lg(n+1)$
• Claim: A subtree rooted at a node x contains at least $2^{bh(x)} - 1$ internal nodes
– Proof by induction on height h
– Base step: x has height 0 (i.e., NULL leaf node)
• What is bh(x)?


RB Trees: Proving Height Bound

• Prove: n-node RB tree has height $h \le 2 \lg(n+1)$
• Claim: A subtree rooted at a node x contains at least $2^{bh(x)} - 1$ internal nodes
– Proof by induction on height h
– Base step: x has height 0 (i.e., NULL leaf node)
• What is bh(x)?
• A: 0
• So the subtree contains $2^{bh(x)} - 1 = 2^0 - 1 = 0$ internal nodes (TRUE)

RB Trees: Proving Height Bound

• Inductive proof that subtree at node x contains at least $2^{bh(x)} - 1$ internal nodes
– Inductive step: x has positive height and 2 children
– Each child has black-height of bh(x) or bh(x) − 1 (Why?)
– The height of a child = (height of x) − 1
– So the subtrees rooted at each child contain at least $2^{bh(x)-1} - 1$ internal nodes
– Thus subtree at x contains at least
$(2^{bh(x)-1} - 1) + (2^{bh(x)-1} - 1) + 1 = 2 \cdot 2^{bh(x)-1} - 1 = 2^{bh(x)} - 1$ nodes


RB Trees: Proving Height Bound

• Thus at the root of the red-black tree:
$n \ge 2^{bh(root)} - 1$
$n \ge 2^{h/2} - 1$
$\lg(n+1) \ge h/2$
$h \le 2 \lg(n + 1)$
Thus h = O(log n)

RB Trees: Worst-Case Time

• So we’ve proved that a red-black tree has O(log n) height
• Corollary: These operations take O(log n) time:
– Minimum(), Maximum()
– Successor(), Predecessor()
– Search()
• Insert() and Delete():
– Will also take O(log n) time
– But will need special care since they modify tree


Red-Black Trees: An Example

• Color this tree:

[Figure: tree with root 7; children 5 and 9; 12 below 9.]

Red-black properties:
1. Every node is either red or black
2. Every leaf (NULL pointer) is black
3. If a node is red, both children are black
4. Every path from node to descendent leaf contains the same number of black nodes
5. The root is always black

Red-Black Trees: The Problem With Insertion

• Insert 8
– Where does it go?

[Figure: the tree with root 7; children 5 and 9; 12 below 9, shown next to red-black properties 1 to 5 for reference.]


Red-Black Trees: The Problem With Insertion

• Insert 8 (Cont.)
– Where does it go?
– What color should it be?

[Figure: tree with root 7; children 5 and 9; 8 and 12 below 9, shown next to red-black properties 1 to 5 for reference.]



Red-Black Trees: The Problem With Insertion

• Insert 11
– Where does it go?

[Figure: tree with root 7; children 5 and 9; 9 has children 8 and 12]

Red-Black Trees: The Problem With Insertion

• Insert 11 (Cont.)
– Where does it go?
– What color?

[Figure: 11 is inserted as the left child of 12]


Red-Black Trees: The Problem With Insertion

• Insert 11 (Cont.)
– Where does it go?
– What color?
  • Can’t be red! (#3)

[Figure: 11 is inserted as the left child of 12]

Red-Black Trees: The Problem With Insertion

• Insert 11 (Cont.)
– Where does it go?
– What color?
  • Can’t be red! (#3)
  • Can’t be black! (#4)

[Figure: 11 is inserted as the left child of 12]


Red-Black Trees: The Problem With Insertion

• Insert 11 (Cont.)
– Where does it go?
– What color?
  • Solution: recolor the tree

[Figure: the tree with 11 as the left child of 12, recolored]



Red-Black Trees: The Problem With Insertion

• Insert 10
– Where does it go?

[Figure: tree with root 7; children 5 and 9; 9 has children 8 and 12; 11 is the left child of 12]

Red-Black Trees: The Problem With Insertion

• Insert 10 (Cont.)
– Where does it go?
– What color?

[Figure: 10 is inserted as the left child of 11]


Red-Black Trees: The Problem With Insertion

• Insert 10 (Cont.)
– Where does it go?
– What color?
  • A: no color! Tree is too imbalanced
  • Must change tree structure to allow re-coloring
– Goal: restructure tree in O(log n) time

[Figure: 10 is inserted as the left child of 11]

RB Trees: Rotation

• Our basic operation for changing tree structure is called rotation:

        y                           x
       / \     rightRotate(y)      / \
      x   C    ------------->     A   y
     / \       <-------------        / \
    A   B      leftRotate(x)        B   C

• Does rotation preserve inorder key ordering?
• What would the code for rightRotate() actually do?


RB Trees: Rotation

• Answer: A lot of pointer manipulation
– x keeps its left child
– y keeps its right child
– x’s right child becomes y’s left child
– x’s and y’s parents change

• What is the running time?
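The pointer manipulation listed above can be sketched as follows. The `Node` and `Tree` classes, the `parent` field, and the function names are assumptions for illustration; the code also assumes y has a left child. It rewires a fixed number of pointers regardless of tree size, so a rotation takes O(1) time.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.parent = None


class Tree:
    def __init__(self):
        self.root = None


def right_rotate(tree, y):
    x = y.left                 # x moves up (assumed not None)
    y.left = x.right           # x's right child (B) becomes y's left child
    if x.right is not None:
        x.right.parent = y
    x.parent = y.parent        # x takes over y's old parent
    if y.parent is None:
        tree.root = x
    elif y is y.parent.left:
        y.parent.left = x
    else:
        y.parent.right = x
    x.right = y                # y becomes x's right child
    y.parent = x


def inorder(node):
    """Keys in sorted order; used to confirm rotation keeps the ordering."""
    if node is None:
        return []
    return inorder(node.left) + [node.key] + inorder(node.right)
```

Comparing `inorder(tree.root)` before and after the call gives the same list, which answers the earlier question: rotation preserves inorder key ordering. leftRotate(x) is the mirror image, with left and right swapped.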

Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.


Red-Black Trees: Insertion

• Insertion: the basic idea
– Insert x into tree, color x red
– Only r-b property 3 might be violated (if p[x] is red), or property 5 if x is the root
  • If so, move violation up tree until a place is found where it can be fixed (by recoloring nodes and performing rotations)
– Total time will be O(log n)

Red-Black Trees: Insertion

• There are actually six cases to consider, but three of them are symmetric to the other three, depending on whether x's parent p[x] is a left child or a right child of x's grandparent p[p[x]].

• We consider the situation in which p[x] is a left child.
– Case 1: the color of x's parent's sibling, or "uncle" y, is red.
– Case 2: the color of x's uncle y is black and x is a right child of p[x].
– Case 3: the color of x's uncle y is black and x is a left child of p[x].

• Figure 13.4
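A compact sketch of insertion with the three cases and their mirror images. This is one possible arrangement under stated assumptions, not the textbook's pseudocode: the `Node`, `Tree`, `rotate` and `insert` names are hypothetical, `None` plays the role of the black NULL leaf, and a single `rotate` helper covers both rightRotate and leftRotate.

```python
RED, BLACK = "red", "black"


class Node:
    def __init__(self, key):
        self.key = key
        self.color = RED          # a new node starts out red
        self.left = self.right = self.parent = None


class Tree:
    def __init__(self):
        self.root = None


def rotate(tree, top, to_right):
    """to_right=True performs rightRotate(top); False performs leftRotate(top)."""
    child = top.left if to_right else top.right
    moved = child.right if to_right else child.left   # the middle subtree B
    if to_right:
        top.left, child.right = moved, top
    else:
        top.right, child.left = moved, top
    if moved is not None:
        moved.parent = top
    child.parent = top.parent
    if top.parent is None:
        tree.root = child
    elif top is top.parent.left:
        top.parent.left = child
    else:
        top.parent.right = child
    top.parent = child


def insert(tree, key):
    # Ordinary binary-search-tree insertion of a red node x.
    parent, node = None, tree.root
    while node is not None:
        parent, node = node, (node.left if key < node.key else node.right)
    x = Node(key)
    x.parent = parent
    if parent is None:
        tree.root = x
    elif key < parent.key:
        parent.left = x
    else:
        parent.right = x
    # Fix-up: loop while x is red and has a red parent (property 3 violated).
    while x.parent is not None and x.parent.color == RED:
        p = x.parent
        g = p.parent              # exists, because a red p is never the root
        p_is_left = p is g.left
        y = g.right if p_is_left else g.left          # the uncle
        if y is not None and y.color == RED:          # case 1: recolor, move up
            p.color = y.color = BLACK
            g.color = RED
            x = g
            continue
        if x is (p.right if p_is_left else p.left):   # case 2: rotate into case 3
            rotate(tree, p, to_right=not p_is_left)
            x, p = p, x
        p.color, g.color = BLACK, RED                 # case 3: recolor, rotate g
        rotate(tree, g, to_right=p_is_left)
    tree.root.color = BLACK       # restore property 5
```

Inserting 7, 5, 9, 12, 8, 11, 10 in that order reproduces the earlier example: 11 is handled by recoloring (case 1), and 10 triggers a rotation (case 3, mirrored), after which 11 is the right child of 9 with children 10 and 12. Case 1 may repeat up the tree, while cases 2 and 3 end the loop after at most two rotations, so the total time is O(log n).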


Red-Black Trees: Insertion

• We will not cover the details in class.
– You should read section 13.3 on your own ■

• http://www.youtube.com/watch?gl=TW&hl=zh-TW&v=vDHFF4wjWYU


Red-Black Trees: Deletion

• We will not cover RB delete in class either.
– You should read section 13.4 on your own
– Read for the overall picture, not the details ■

Balancing a binary search tree

• 1962 Adel’son-Vel’skii and Landis AVL tree (problem 13-3)
• 1970 Hopcroft 2-3 trees (B-trees, Chapter 18)
• 1972 Bayer Red-black tree
• 1983 Sleator and Tarjan Splay trees (amortized cost)
• 1990 Pugh SkipLists (probabilistic) ■