divide and conquer: intro principle behind d&cdjmoon/algs-ug/algs-notes/algs-c5.pdf · divide...


Upload: buianh

Post on 07-May-2019


Divide and Conquer: Intro

• Principle behind D&C:

1. Divide problem into several smaller instances of same problem

2. Solve smaller instances

3. Combine solutions

• Most common number of subproblems is 2

• D&C not guaranteed to be more efficient than Brute Force

• In general:

– Problem of size n divided into a subproblems

– Each subproblem of size n/b (a < n/b)

– If $n = b^k$, then $T(n) = aT(n/b) + f(n)$

– f(n) represents work required for subdivision and recombination


Divide and Conquer: Master Theorem

• The Master Theorem:

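Let $t(n)$ be eventually non-decreasing and satisfy
$t(n) = a\,t(n/b) + f(n)$ for $n = b^k$, $k = 1, 2, \ldots$, and $t(1) = c$, where $a \ge 1$, $b \ge 2$, $c > 0$.

If $f(n) \in \Theta(n^d)$ where $d \ge 0$, then

$$t(n) \in \begin{cases}
\Theta(n^d) & \text{if } a < b^d \\
\Theta(n^d \log n) & \text{if } a = b^d \\
\Theta(n^{\log_b a}) & \text{if } a > b^d
\end{cases}$$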

• Proof (see pp 428-429)
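The three cases of the theorem can be sketched as a small helper. This is only an illustration: the function name `master_theorem` and the string format of its result are assumptions made here, not part of the notes.

```python
import math


def master_theorem(a: float, b: float, d: float) -> str:
    """Classify t(n) = a*t(n/b) + f(n) with f(n) in Theta(n^d).

    Hypothetical helper: returns the growth class as a readable string.
    """
    if a < 1 or b < 2 or d < 0:
        raise ValueError("the theorem requires a >= 1, b >= 2, d >= 0")
    critical = b ** d
    if math.isclose(a, critical):          # case a = b^d
        return f"Theta(n^{d:g} log n)"
    if a < critical:                       # case a < b^d
        return f"Theta(n^{d:g})"
    return f"Theta(n^{math.log(a, b):.3f})"  # case a > b^d: exponent log_b a


print(master_theorem(2, 2, 1))  # merge-sort style recurrence
print(master_theorem(7, 2, 2))  # Strassen style recurrence
```

The comparison with `math.isclose` is a design choice to tolerate floating-point error when `d` is not an integer.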


Divide and Conquer: Intro Ex

• Problem: Add n numbers $a_0, a_1, \ldots, a_{n-1}$

• Let $\sum_{i=0}^{n-1} a_i = \sum_{i=0}^{\lfloor n/2 \rfloor - 1} a_i + \sum_{i=\lfloor n/2 \rfloor}^{n-1} a_i$

• Analysis:

– Basic Op = addition

– A(n) = 2A(n/2) + 1

– Using the Master Theorem,

∗ a = 2, b = 2, f(n) = 1, d = 0

∗ $a > b^d$

∗ $A(n) \in \Theta(n^{\log_b a}) = \Theta(n)$
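A minimal sketch of the summation split, assuming a Python list as input. The name `dc_sum` and the index parameters `lo` and `hi` are illustrative choices, not taken from the notes.

```python
def dc_sum(a, lo=0, hi=None):
    """Sum a[lo..hi-1] by splitting the range at its midpoint."""
    if hi is None:
        hi = len(a)
    n = hi - lo
    if n == 0:
        return 0          # empty range
    if n == 1:
        return a[lo]      # a single element needs no addition
    mid = lo + n // 2     # left half gets floor(n/2) elements
    return dc_sum(a, lo, mid) + dc_sum(a, mid, hi)


print(dc_sum([3, 1, 4, 1, 5, 9]))
```

Each call on two or more elements performs exactly one addition, which is the "+ 1" term in the recurrence for A(n).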


Divide and Conquer: Merge Sort

• Problem: Given array A[0..n− 1], sort

• General approach:

1. Divide into $A[0..\lfloor n/2 \rfloor - 1]$ and $A[\lfloor n/2 \rfloor..n-1]$

2. Sort each subarray

3. Merge results

• Algorithm:

    Alg MS(A[0 .. n - 1]) {
        //Input: Array A[0..n - 1]
        //Output: A sorted in nondecreasing order
        if (n > 1) {
            copy A[0 .. floor(n/2) - 1] to B[0 .. floor(n/2) - 1]
            copy A[floor(n/2) .. n - 1] to C[0 .. ceiling(n/2) - 1]
            MS(B[0 .. floor(n/2) - 1])
            MS(C[0 .. ceiling(n/2) - 1])
            Merge(B, C, A)
        }
    }

• Merging algorithm:

    Alg Merge(B[0 .. p - 1], C[0 .. q - 1], A[0 .. p + q - 1]) {
        //Input: Sorted arrays B and C
        //Output: A holds the elements of B and C in sorted order
        i <- 0
        j <- 0
        k <- 0
        while (i < p and j < q) {
            if (B[i] <= C[j]) {
                A[k] <- B[i]
                i++
            }
            else {
                A[k] <- C[j]
                j++
            }
            k++
        }
        if (i = p)
            copy C[j .. q - 1] to A[k .. p + q - 1]
        else
            copy B[i .. p - 1] to A[k .. p + q - 1]
    }
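The two routines above can be sketched in Python as follows. This version is an assumption-laden variant: it returns a new sorted list rather than overwriting A in place, and the names `merge` and `merge_sort` are illustrative.

```python
def merge(b, c):
    """Merge two sorted lists into one sorted list."""
    result = []
    i = j = 0
    while i < len(b) and j < len(c):
        if b[i] <= c[j]:          # taking from b on ties keeps the sort stable
            result.append(b[i])
            i += 1
        else:
            result.append(c[j])
            j += 1
    result.extend(b[i:])          # at most one of these two is non-empty
    result.extend(c[j:])
    return result


def merge_sort(a):
    """Return a sorted copy of a."""
    if len(a) <= 1:
        return list(a)
    mid = len(a) // 2             # floor(n/2) elements go to the left half
    return merge(merge_sort(a[:mid]), merge_sort(a[mid:]))


print(merge_sort([8, 3, 2, 9, 7, 1, 5, 4]))
```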


Divide and Conquer: Merge Sort (2)

• Merge analysis:

– Basic operation: comparison

– 1 comparison per iteration

– Each iteration reduces number of elements to be processed by 1

– Worst case: neither array is exhausted until the other has a single element left: $C_{merge,w}(n) = n - 1$

• Merge sort analysis:

– $C(n) = 2C(n/2) + C_{merge}(n)$, n > 1

– C(1) = 0

– $C_w(n) = 2C_w(n/2) + n - 1$, n > 1

– $C_w(1) = 0$

– Using Master Theorem, with a = 2, b = 2, c = 0, d = 1:

∗ $C_w(n) \in \Theta(n \log n)$

∗ For $n = 2^k$, exact solution is $C_w(n) = n \log_2 n - n + 1$
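As a sanity check, the worst-case recurrence can be evaluated directly and compared with the closed form for powers of 2. The function names below are hypothetical and exist only for this check.

```python
def worst_case_recurrence(n: int) -> int:
    """Evaluate C_w(n) = 2*C_w(n/2) + n - 1 with C_w(1) = 0, for n a power of 2."""
    if n == 1:
        return 0
    return 2 * worst_case_recurrence(n // 2) + n - 1


def worst_case_closed_form(k: int) -> int:
    """n*log2(n) - n + 1 for n = 2^k, written with k = log2(n)."""
    n = 2 ** k
    return n * k - n + 1


for k in range(6):
    n = 2 ** k
    print(n, worst_case_recurrence(n), worst_case_closed_form(k))
```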


Divide and Conquer: Quick Sort

• The general approach:

1. Given array A[0..n− 1], sort by partitioning into 2 sets:

(a) Those elements greater than arbitrary element A[s]

(b) Those less than A[s]

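2. Putting A[s] in its proper position yields A[0] ... A[s − 1], A[s], A[s + 1] ... A[n − 1], where no element before A[s] is greater than it and no element after A[s] is less than it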

3. Then recursively sort each of the 2 partitions

• Algorithm:

    Alg QS(A[l .. r]) {
        //Input: Array A[l..r] where l and r are indices of lower and upper bounds of a partition
        if (l < r) {
            s <- partition(A[l..r])    //s is the final position of the pivot
            QS(A[l..s - 1])
            QS(A[s + 1..r])
        }
    }

• Partitioning

– Need to select element about which to create the partition

∗ Called the pivot

∗ Select element A[l]

– To create the partition:

1. Scan left-to-right and right-to-left from each end

2. Stop the left-to-right scan at an element greater than or equal to the pivot, and the right-to-left scan at an element less than or equal to the pivot

3. 3 situations can arise:

(a) Pointers not crossed: swap elements and continue

(b) Pointers have crossed: swap pivot with A[j] and stop

(c) Pointers at same point (value equal to pivot): no element swap is needed; swap pivot with A[j] as in (b) and stop


Divide and Conquer: Quick Sort (2)

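– Algorithm:

    Partition(A[l..r]) {
        //Input: Array A[l..r] where l and r are indices of lower and upper bounds of a partition
        //Output: Final position of the pivot, with A[l..r] partitioned around it
        //Note: the scan on i can run past r when the pivot is the largest element;
        //      a sentinel or a bounds check is needed in practice
        p <- A[l]
        i <- l
        j <- r + 1
        repeat
            repeat
                i++
            until (A[i] >= p)
            repeat
                j--
            until (A[j] <= p)
            swap (A[i], A[j])
        until (i >= j)
        swap (A[i], A[j]) //undoes last swap
        swap (A[l], A[j])
        return j
    }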
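A Python sketch of the same partitioning scheme, under two stated deviations from the pseudocode: the scan on `i` carries an explicit bounds check instead of relying on a sentinel, and the loop exits before the final swap so nothing has to be undone. The names `partition` and `quick_sort` are illustrative.

```python
def partition(a, l, r):
    """Partition a[l..r] around the pivot a[l]; return the pivot's final index."""
    p = a[l]
    i, j = l, r + 1
    while True:
        i += 1
        while i <= r and a[i] < p:   # bounds check stands in for a sentinel
            i += 1
        j -= 1
        while a[j] > p:              # stops at index l at the latest, since a[l] == p
            j -= 1
        if i >= j:                   # pointers met or crossed
            break
        a[i], a[j] = a[j], a[i]
    a[l], a[j] = a[j], a[l]          # put the pivot in its final position
    return j


def quick_sort(a, l=0, r=None):
    """Sort a[l..r] in place."""
    if r is None:
        r = len(a) - 1
    if l < r:
        s = partition(a, l, r)
        quick_sort(a, l, s - 1)
        quick_sort(a, s + 1, r)


data = [5, 3, 1, 9, 8, 2, 4, 7]
quick_sort(data)
print(data)
```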

• Analysis:

1. Best case:

– Partitions are equal with n/2 elements

– Number of comparisons (in partitioning) = n + 1 if pointers cross, n otherwise

– $C_b(n) = 2C_b(n/2) + n$, n > 1

– $C_b(1) = 0$

– By the Master Theorem, $C_b(n) \in \Theta(n \log_2 n)$

2. Worst case:

– One partition always empty:

∗ Perform n+ 1 comparisons

∗ Swap A[0] with itself

∗ Recursively call QS with partitions of 0 and n− 1 elements

– $C_w(n) = (n + 1) + n + \cdots + 3 = \frac{(n+1)(n+2)}{2} - 3 \in \Theta(n^2)$

3. Average case:

– Assume split can occur at any position s, $0 \le s \le n - 1$, with same probability 1/n

– $C_a(n) = \frac{1}{n} \sum_{s=0}^{n-1} [(n + 1) + C_a(s) + C_a(n - 1 - s)]$, n > 1

– $C_a(0) = C_a(1) = 0$

– $C_a(n) \approx 2n \ln n \approx 1.38\, n \log_2 n$


Divide and Conquer: Binary Search

• Algorithm:

    Alg BS(A[0..n - 1], k) {
        //Input: Array A sorted in ascending order and key k
        //Output: Index of k in A, or -1 if k is not in A
        l <- 0
        r <- n - 1
        while (l <= r) {
            m <- floor((l + r) / 2)
            if (k = A[m])
                return m
            else if (k < A[m])
                r <- m - 1
            else
                l <- m + 1
        }
        return -1
    }

• Analysis:

– Basic operation is comparison

– Assume relation between k and A[m] determined by 1 comparison

– Worst case:

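∗ $k \notin A$ or k is last element checked

∗ $C_w(n) = C_w(\lfloor n/2 \rfloor) + 1$, n > 1

∗ $C_w(1) = 1$

∗ Assuming $n = 2^k$, $C_w(2^k) = k + 1 = \log_2 n + 1$

∗ For arbitrary n, $C_w(n) = \lfloor \log_2 n \rfloor + 1 = \lceil \log_2(n + 1) \rceil$

· Proof (that the closed form satisfies the recurrence, shown for even n):

1. Let n = 2i, i > 0:
$C_w(n) = \lfloor \log_2 n \rfloor + 1 = \lfloor \log_2 2i \rfloor + 1 = \lfloor \log_2 2 + \log_2 i \rfloor + 1 = 1 + \lfloor \log_2 i \rfloor + 1 = \lfloor \log_2 i \rfloor + 2$

2. From earlier:
$C_w(\lfloor n/2 \rfloor) + 1 = C_w(\lfloor 2i/2 \rfloor) + 1 = C_w(i) + 1 = (\lfloor \log_2 i \rfloor + 1) + 1 = \lfloor \log_2 i \rfloor + 2$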
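The worst-case count can be observed experimentally. The sketch below is a Python rendering of the binary search pseudocode that also returns the number of loop iterations, treating each iteration as one three-way comparison as the notes assume; the tuple return value is an addition made for this illustration.

```python
def binary_search(a, key):
    """Search sorted list a for key.

    Returns (index, iterations); index is -1 when key is absent.
    """
    l, r = 0, len(a) - 1
    iterations = 0
    while l <= r:
        iterations += 1
        m = (l + r) // 2
        if key == a[m]:
            return m, iterations
        if key < a[m]:
            r = m - 1
        else:
            l = m + 1
    return -1, iterations


# Searching for a key larger than every element always takes the right half,
# whose size is floor(n/2), so it follows the worst-case recurrence.
for n in (1, 2, 7, 8, 1000):
    print(n, binary_search(list(range(n)), n)[1])
```

For n ≥ 1, Python's `n.bit_length()` equals ⌊log₂ n⌋ + 1, which gives a convenient integer-only way to compare against the closed form.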


Divide and Conquer: Binary Search (2)

– Avg case:

∗ $C_a(n) \approx \log_2 n$

– Binary search is optimal for searches based on key comparisons


Divide and Conquer: Binary Tree Height

• Height of binary tree is length of longest path from root to leaf

• Height of empty tree = -1

• Algorithm:

    Alg BTH(T) {
        //Input: Binary tree T
        //Output: Height of T
        if (T == NULL)
            return -1
        else
            return max(BTH(T_left), BTH(T_right)) + 1
    }

• Analysis:

– Size: n(T) = number of nodes in tree

– Basic op: addition or comparison

– $A(n(T)) = A(n(T_L)) + A(n(T_R)) + 1$, n(T) > 0; A(0) = 0

– Analysis based on extended tree

∗ Represent empty subtree with special node called external node

∗ Extended tree consists of internal (regular) nodes and external nodes

– Algorithm executes one addition for each internal node

∗ Since n internal nodes, A(n) = n

– Most frequently executed operation is comparison (if (T == NULL))

∗ Execute 1 comparison for each internal node and for each external node


Divide and Conquer: Binary Tree Height (2)

– How many external nodes for tree with n internal nodes?

∗ By examination, it appears that x = n + 1, where x is the number of external nodes

∗ Proof:

1. Base case: When n = 0, x = 1; trivial

2. Hypothesis: For any tree with at most k internal nodes, the number of external nodes is one more than the number of internal nodes

3. Inductive part:

(a) Consider tree T with k + 1 internal nodes

(b) T consists of a root r, and left and right subtrees $T_L$ and $T_R$

(c) Let $k_L$ and $k_R$ be the numbers of internal nodes of $T_L$ and $T_R$

(d) $k_L + k_R = k$

(e) $k_L \le k$; $k_R \le k$

(f) By hypothesis, $x_L = k_L + 1$ and $x_R = k_R + 1$

(g) Number of external nodes in tree must be $x = x_L + x_R = k_L + 1 + k_R + 1 = (k + 1) + 1$

4. Given n + 1 external nodes in tree with n internal nodes, comparisons = n + (n + 1) = 2n + 1
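The two counts (n additions, 2n + 1 comparisons) can be checked with an instrumented version of the height algorithm. The `Node` class and the `counts` dictionary are assumptions introduced for this sketch; the notes do not specify a tree representation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical binary tree node; None stands for the empty tree."""
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def height(t, counts):
    """Return the height of t, tallying basic operations in counts."""
    counts["comparisons"] += 1       # the test against the empty tree
    if t is None:
        return -1                    # external node: no addition performed
    counts["additions"] += 1         # the "+ 1" executed at each internal node
    return max(height(t.left, counts), height(t.right, counts)) + 1


counts = {"additions": 0, "comparisons": 0}
tree = Node(left=Node(right=Node()))   # 3 internal nodes
print(height(tree, counts), counts)
```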


Divide and Conquer: Integer Multiplication

• Consider multiplication of 2 2-digit decimal numbers, c = a ∗ b:

                 a1      a0
        ×        b1      b0
        ------------------------
               a1b0    a0b0
        a1b1   a0b1
        ------------------------
        a1b1  (a1b0 + a0b1)  a0b0

• Expressing as sums of powers of 10:
$a * b = c = (a_1 \cdot 10^1 + a_0 \cdot 10^0) * (b_1 \cdot 10^1 + b_0 \cdot 10^0) = a_1 b_1 \cdot 10^2 + (a_1 b_0 + a_0 b_1) \cdot 10^1 + a_0 b_0 \cdot 10^0$

• Requires 4 multiplications and 3 additions (excluding multiplication by $10^i$)

• If n = number of digits, multiplication $\in \Theta(n^2)$

• Let $c_0 = a_0 b_0$, $c_2 = a_1 b_1$

• Note that $a_1 b_0 + a_0 b_1 = (a_1 + a_0)(b_1 + b_0) - (a_0 b_0 + a_1 b_1) = c_1$

• Then $c = c_2 \cdot 10^2 + c_1 \cdot 10^1 + c_0 \cdot 10^0$

– Since $a_0 b_0$ and $a_1 b_1$ are already calculated as $c_0$ and $c_2$, this requires 1 fewer multiplication, at the expense of more addition/subtraction

– Requires 3 multiplications and 6 additions/subtractions

• Recursive approach using this algorithm:

– Let $n = 2^k$

– $a = a_1 \cdot 10^{n/2} + a_0$, $b = b_1 \cdot 10^{n/2} + b_0$, where $a_1$ represents n/2 leftmost digits of a, $a_0$ rightmost n/2 digits

– Then $c = a * b = c_2 \cdot 10^n + c_1 \cdot 10^{n/2} + c_0$ where $c_i$ as defined above
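The recursive scheme can be sketched in Python as below. Two assumptions go beyond the notes: the function name `karatsuba` is illustrative, and operands whose digit count is not a power of 2 are handled by splitting off the rightmost ⌊n/2⌋ digits, so the two halves may differ in length.

```python
def karatsuba(a: int, b: int) -> int:
    """Multiply non-negative integers using three recursive products per level."""
    if a < 10 or b < 10:
        return a * b                       # single-digit operand: multiply directly
    n = max(len(str(a)), len(str(b)))      # number of decimal digits
    p = 10 ** (n // 2)                     # split point 10^(n/2)
    a1, a0 = divmod(a, p)                  # a = a1 * p + a0
    b1, b0 = divmod(b, p)                  # b = b1 * p + b0
    c2 = karatsuba(a1, b1)
    c0 = karatsuba(a0, b0)
    c1 = karatsuba(a1 + a0, b1 + b0) - (c2 + c0)
    return c2 * p * p + c1 * p + c0


print(karatsuba(1234, 3215))
```

Python's built-in integers already multiply arbitrarily large values, so this is purely a demonstration of the recurrence, not a practical replacement for `*`.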


Divide and Conquer: Integer Multiplication (2)

• Example

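$1234 \times 3215 = (12 \times 10^2 + 34) \times (32 \times 10^2 + 15)$
$= (12 \times 32) \cdot 10^4 + [(12 + 34) \times (32 + 15) - (12 \times 32 + 34 \times 15)] \cdot 10^2 + 34 \times 15$
$= (12 \times 32) \cdot 10^4 + [(46 \times 47) - (12 \times 32 + 34 \times 15)] \cdot 10^2 + 34 \times 15$
$= 384 \times 10^4 + (2162 - (384 + 510)) \cdot 10^2 + 510$
$= 3840000 + 126800 + 510$
$= 3967310$

$12 \times 32 = (1 \times 10 + 2) \times (3 \times 10 + 2)$
$= (1 \times 3) \cdot 10^2 + [(1 + 2) \times (3 + 2) - (1 \times 3 + 2 \times 2)] \cdot 10 + (2 \times 2)$
$= 300 + (3 \times 5 - 7) \cdot 10 + 4$
$= 300 + 80 + 4$
$= 384$

$34 \times 15 = (3 \times 10 + 4) \times (1 \times 10 + 5)$
$= (3 \times 1) \cdot 10^2 + [(3 + 4) \times (1 + 5) - (3 \times 1 + 4 \times 5)] \cdot 10 + 4 \times 5$
$= 300 + (7 \times 6 - 23) \cdot 10 + 20$
$= 300 + 190 + 20$
$= 510$

$46 \times 47 = (4 \times 10 + 6) \times (4 \times 10 + 7)$
$= (4 \times 4) \cdot 10^2 + [(4 + 6) \times (4 + 7) - (4 \times 4 + 6 \times 7)] \cdot 10 + 6 \times 7$
$= 1600 + (10 \times 11 - 58) \cdot 10 + 42$
$= 1600 + 520 + 42$
$= 2162$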


Divide and Conquer: Integer Multiplication (3)

• Analysis

– M(n) = 3M(n/2), n > 1; M(1) = 1

– Using backwards substitution, $M(2^k) = 3^i M(2^{k-i}) = 3^k M(2^{k-k}) = 3^k$

– $k = \log_2 n$, so $M(n) = 3^{\log_2 n} = n^{\log_2 3}$

– Since $\log_2 3 < 2$, this is more efficient than brute force approach


Divide and Conquer: Strassen’s Algorithm

• Consider 2 n× n matrices A and B

• Let C = A ∗B

• Then $c_{ij} = \sum_{k=0}^{n-1} a_{ik} * b_{kj}$

• For a 2× 2 matrix, this requires 8 multiplications and 4 additions

• Strassen’s algorithm improves on this

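• $\begin{bmatrix} c_{00} & c_{01} \\ c_{10} & c_{11} \end{bmatrix} = \begin{bmatrix} a_{00} & a_{01} \\ a_{10} & a_{11} \end{bmatrix} * \begin{bmatrix} b_{00} & b_{01} \\ b_{10} & b_{11} \end{bmatrix} = \begin{bmatrix} m_1 + m_4 - m_5 + m_7 & m_3 + m_5 \\ m_2 + m_4 & m_1 + m_3 - m_2 + m_6 \end{bmatrix}$

• where

1. $m_1 = (a_{00} + a_{11}) * (b_{00} + b_{11})$

2. $m_2 = (a_{10} + a_{11}) * b_{00}$

3. $m_3 = a_{00} * (b_{01} - b_{11})$

4. $m_4 = a_{11} * (b_{10} - b_{00})$

5. $m_5 = (a_{00} + a_{01}) * b_{11}$

6. $m_6 = (a_{10} - a_{00}) * (b_{00} + b_{01})$

7. $m_7 = (a_{01} - a_{11}) * (b_{10} + b_{11})$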

• This requires only 7 multiplications, but 18 additions/subtractions

• For large arrays, can apply this approach recursively:

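– Let $n = 2^k$

– Divide each matrix into 4 submatrices, each of size $\frac{n}{2} \times \frac{n}{2}$

– Then $\begin{bmatrix} C_{00} & C_{01} \\ C_{10} & C_{11} \end{bmatrix} = \begin{bmatrix} A_{00} & A_{01} \\ A_{10} & A_{11} \end{bmatrix} * \begin{bmatrix} B_{00} & B_{01} \\ B_{10} & B_{11} \end{bmatrix}$

– Each $C_{ij}$ can be computed using $M_i$, where $M_i$ correspond to $m_i$ in Strassen’s algorithm, as applied to the submatrices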


Divide and Conquer: Strassen’s Algorithm (2)

• Analysis:

– M(n) = 7M(n/2), n > 1; M(1) = 1

– Let $n = 2^k$

– Using backwards substitution,
$M(n) = \ldots = 7^i M(2^{k-i}) = \ldots = 7^k M(2^{k-k}) = 7^k = 7^{\log_2 n} = n^{\log_2 7} < n^3$

– Must also consider additions

– $A(n) = 7A(n/2) + 18(n/2)^2$, n > 1; A(1) = 0

– By the Master Theorem (a = 7, b = 2, d = 2), $A(n) \in \Theta(n^{\log_2 7})$

– This is same growth as M(n)


Divide and Conquer: Strassen’s Algorithm (3)

• Example:

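$C = A \times B = \begin{bmatrix} 3 & 1 & 0 & 4 \\ 2 & 1 & 1 & 2 \\ 1 & 3 & 1 & 2 \\ 1 & 1 & 1 & 1 \end{bmatrix} \times \begin{bmatrix} 2 & 2 & 2 & 2 \\ 0 & 1 & 2 & 1 \\ 2 & 1 & 0 & 3 \\ 1 & 3 & 3 & 1 \end{bmatrix}$

$A_{00} = \begin{bmatrix} 3 & 1 \\ 2 & 1 \end{bmatrix}$, $A_{01} = \begin{bmatrix} 0 & 4 \\ 1 & 2 \end{bmatrix}$, $A_{10} = \begin{bmatrix} 1 & 3 \\ 1 & 1 \end{bmatrix}$, $A_{11} = \begin{bmatrix} 1 & 2 \\ 1 & 1 \end{bmatrix}$

$B_{00} = \begin{bmatrix} 2 & 2 \\ 0 & 1 \end{bmatrix}$, $B_{01} = \begin{bmatrix} 2 & 2 \\ 2 & 1 \end{bmatrix}$, $B_{10} = \begin{bmatrix} 2 & 1 \\ 1 & 3 \end{bmatrix}$, $B_{11} = \begin{bmatrix} 0 & 3 \\ 3 & 1 \end{bmatrix}$

$M_1 = \begin{bmatrix} 4 & 3 \\ 3 & 2 \end{bmatrix} \times \begin{bmatrix} 2 & 5 \\ 3 & 2 \end{bmatrix} = \begin{bmatrix} 17 & 26 \\ 12 & 19 \end{bmatrix}$

$M_2 = \begin{bmatrix} 2 & 5 \\ 2 & 2 \end{bmatrix} \times \begin{bmatrix} 2 & 2 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 4 & 9 \\ 4 & 6 \end{bmatrix}$

$M_3 = \begin{bmatrix} 3 & 1 \\ 2 & 1 \end{bmatrix} \times \begin{bmatrix} 2 & -1 \\ -1 & 0 \end{bmatrix} = \begin{bmatrix} 5 & -3 \\ 3 & -2 \end{bmatrix}$

$M_4 = \begin{bmatrix} 1 & 2 \\ 1 & 1 \end{bmatrix} \times \begin{bmatrix} 0 & -1 \\ 1 & 2 \end{bmatrix} = \begin{bmatrix} 2 & 3 \\ 1 & 1 \end{bmatrix}$

$M_5 = \begin{bmatrix} 3 & 5 \\ 3 & 3 \end{bmatrix} \times \begin{bmatrix} 0 & 3 \\ 3 & 1 \end{bmatrix} = \begin{bmatrix} 15 & 14 \\ 9 & 12 \end{bmatrix}$

$M_6 = \begin{bmatrix} -2 & 2 \\ -1 & 0 \end{bmatrix} \times \begin{bmatrix} 4 & 4 \\ 2 & 2 \end{bmatrix} = \begin{bmatrix} -4 & -4 \\ -4 & -4 \end{bmatrix}$

$M_7 = \begin{bmatrix} -1 & 2 \\ 0 & 1 \end{bmatrix} \times \begin{bmatrix} 2 & 4 \\ 4 & 4 \end{bmatrix} = \begin{bmatrix} 6 & 4 \\ 4 & 4 \end{bmatrix}$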


Divide and Conquer: Strassen’s Algorithm (4)

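$C_{00} = M_1 + M_4 - M_5 + M_7 = \begin{bmatrix} 10 & 19 \\ 8 & 12 \end{bmatrix}$

$C_{01} = M_3 + M_5 = \begin{bmatrix} 20 & 11 \\ 12 & 10 \end{bmatrix}$

$C_{10} = M_2 + M_4 = \begin{bmatrix} 6 & 12 \\ 5 & 7 \end{bmatrix}$

$C_{11} = M_1 + M_3 - M_2 + M_6 = \begin{bmatrix} 14 & 10 \\ 7 & 7 \end{bmatrix}$

$C = \begin{bmatrix} C_{00} & C_{01} \\ C_{10} & C_{11} \end{bmatrix} = \begin{bmatrix} 10 & 19 & 20 & 11 \\ 8 & 12 & 12 & 10 \\ 6 & 12 & 14 & 10 \\ 5 & 7 & 7 & 7 \end{bmatrix}$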
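The recursive scheme can be sketched in Python and run on the 4 × 4 matrices of this example. The sketch assumes square matrices stored as lists of lists whose size is a power of 2; the helper names `add`, `sub`, `split`, and `strassen` are illustrative.

```python
def add(x, y):
    """Element-wise sum of two equally sized matrices."""
    return [[p + q for p, q in zip(r, s)] for r, s in zip(x, y)]


def sub(x, y):
    """Element-wise difference of two equally sized matrices."""
    return [[p - q for p, q in zip(r, s)] for r, s in zip(x, y)]


def split(m):
    """Return the four quadrants (m00, m01, m10, m11) of m."""
    h = len(m) // 2
    return ([row[:h] for row in m[:h]], [row[h:] for row in m[:h]],
            [row[:h] for row in m[h:]], [row[h:] for row in m[h:]])


def strassen(a, b):
    """Multiply n x n matrices (n a power of 2) with 7 recursive products."""
    if len(a) == 1:
        return [[a[0][0] * b[0][0]]]
    a00, a01, a10, a11 = split(a)
    b00, b01, b10, b11 = split(b)
    m1 = strassen(add(a00, a11), add(b00, b11))
    m2 = strassen(add(a10, a11), b00)
    m3 = strassen(a00, sub(b01, b11))
    m4 = strassen(a11, sub(b10, b00))
    m5 = strassen(add(a00, a01), b11)
    m6 = strassen(sub(a10, a00), add(b00, b01))
    m7 = strassen(sub(a01, a11), add(b10, b11))
    c00 = add(sub(add(m1, m4), m5), m7)
    c01 = add(m3, m5)
    c10 = add(m2, m4)
    c11 = add(sub(add(m1, m3), m2), m6)
    top = [r0 + r1 for r0, r1 in zip(c00, c01)]      # glue quadrants row by row
    bottom = [r0 + r1 for r0, r1 in zip(c10, c11)]
    return top + bottom


A = [[3, 1, 0, 4], [2, 1, 1, 2], [1, 3, 1, 2], [1, 1, 1, 1]]
B = [[2, 2, 2, 2], [0, 1, 2, 1], [2, 1, 0, 3], [1, 3, 3, 1]]
for row in strassen(A, B):
    print(row)
```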


Divide and Conquer: Closest Pair Problem

• Basic approach:

1. Divide points into 2 equal sets

2. Find closest pair in each set

3. Return smaller of 2 values

• Assume points are ordered by x coordinate

• Let c be median x value

• All points to left of x = c lie in subset S1, those to right in S2

• Let d1 be smallest distance in S1, d2 in S2

• Then d = min(d1, d2) is distance between closest pair within S1 or within S2, but not necessarily the closest distance between 2 points in S

• 2 points, one on each side of x = c, could be closer than d

• To determine if there is such a pair, we need only consider points whose x coordinate is within c ± d


Divide and Conquer: Closest Pair Problem (2)

• For such a point P(x, y) (assume on side S1), there can be at most 6 points in S2 within y ± d of P

• Therefore, to combine solutions to recursive call:

1. Find d = min(d1, d2)

2. Find points in S1 and S2 within c ± d

3. Based on y coordinates, walk through one set (say those from S1), finding distance between each point and those within y ± d from S2

4. Compare each distance to d

• Analysis:

– T(n) = 2T(n/2) + C(n)

– C(n) is cost to combine results

– $C(n) \in \Theta(n)$

– By the Master Theorem (a = 2, b = 2, d = 1), $T(n) \in O(n \log n)$


Divide and Conquer: Convex Hull (Quick Hull)

• Given n points, ordered by x coordinate (and secondarily by y)

• Leftmost and rightmost points ($P_1$ and $P_n$) lie on convex hull

• Let line $P_1 P_n$ divide points into sets S1 (to left/above) and S2 (to right/below)

• To recursively generate convex hull (applied to both S1 and S2):

– Find point $P_{max}$ farthest from $P_1 P_n$

– If a tie, choose point that maximizes $\angle P_1 P_{max} P_n$

– Partition S1 into 3 sets:

1. $S_{11}$: points to left of $P_1 P_{max}$

2. $S_{12}$: points to left of $P_{max} P_n$

3. Those points in $\triangle P_1 P_{max} P_n$

– Pmax lies on convex hull

– Recurse on S11 and S12

• To determine whether a point lies to left or right of a line:

– Calculate the determinant
$\begin{vmatrix} x_1 & y_1 & 1 \\ x_2 & y_2 & 1 \\ x_3 & y_3 & 1 \end{vmatrix}$

– $P_3$ is to left of $P_1 P_2$ if determinant > 0

• Worst case $\in \Theta(n^2)$

• Best case $\in \Theta(n \log n)$
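A short Python sketch of quickhull built on the determinant test. Several choices here are assumptions rather than part of the notes: points are (x, y) tuples, duplicates are removed first, ties for the farthest point are broken by taking the first candidate in sorted order instead of by the angle rule, and the names `cross`, `_hull_side`, and `quickhull` are illustrative.

```python
def cross(p1, p2, p3):
    """Value of the 3x3 determinant above, expanded.

    Positive when p3 lies to the left of the directed line p1 -> p2;
    its magnitude is twice the area of triangle p1 p2 p3.
    """
    return ((p2[0] - p1[0]) * (p3[1] - p1[1])
            - (p2[1] - p1[1]) * (p3[0] - p1[0]))


def _hull_side(pts, p1, pn):
    """Hull vertices strictly left of p1 -> pn, ordered from p1 towards pn."""
    left = [p for p in pts if cross(p1, pn, p) > 0]
    if not left:
        return []
    # Largest triangle area means farthest from the line p1 pn.
    pmax = max(left, key=lambda p: cross(p1, pn, p))
    return _hull_side(left, p1, pmax) + [pmax] + _hull_side(left, pmax, pn)


def quickhull(points):
    """Return hull vertices in clockwise order, starting at the leftmost point."""
    pts = sorted(set(points))              # order by x, then y
    if len(pts) < 3:
        return pts
    p1, pn = pts[0], pts[-1]
    upper = _hull_side(pts, p1, pn)        # points above/left of p1 -> pn
    lower = _hull_side(pts, pn, p1)        # points below, seen from pn -> p1
    return [p1] + upper + [pn] + lower


print(quickhull([(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)]))
```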
