Divide and Conquer: Intro
• Principle behind D&C:
1. Divide problem into several smaller instances of same problem
2. Solve smaller instances
3. Combine solutions
• Most common number of subproblems is 2
• D&C not guaranteed to be more efficient than Brute Force
• In general:
– Problem of size n divided into a subproblems
– Each subproblem of size n/b (a < n/b)
– If $n = b^k$, then $T(n) = aT(n/b) + f(n)$
– f(n) represents work required for subdivision and recombination
Divide and Conquer: Master Theorem
• The Master Theorem:
Let $t(n)$ be eventually non-decreasing and satisfy
$t(n) = a\,t(n/b) + f(n)$ for $n = b^k$, $k = 1, 2, \ldots$
$t(1) = c$, where $a \ge 1$, $b \ge 2$, $c > 0$
If $f(n) \in \Theta(n^d)$ where $d \ge 0$, then
$t(n) \in \Theta(n^d)$ if $a < b^d$
$t(n) \in \Theta(n^d \log n)$ if $a = b^d$
$t(n) \in \Theta(n^{\log_b a})$ if $a > b^d$
• Proof (see pp 428-429)
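The three cases of the theorem can be sketched as a small classifier. This is only an illustration: the function name `master_theorem` and the string output format are assumptions made here, not part of the notes.

```python
import math

def master_theorem(a: int, b: int, d: float) -> str:
    """Classify t(n) = a*t(n/b) + f(n), where f(n) is in Theta(n^d)."""
    if a < 1 or b < 2 or d < 0:
        raise ValueError("requires a >= 1, b >= 2, d >= 0")
    critical = b ** d  # compare a against b^d
    if math.isclose(a, critical):
        return f"Theta(n^{d} log n)"
    if a < critical:
        return f"Theta(n^{d})"
    # a > b^d: the exponent is log_b(a)
    return f"Theta(n^{math.log(a, b):.3f})"

print(master_theorem(2, 2, 1))  # a = b^d case
print(master_theorem(7, 2, 2))  # a > b^d case
```

The first call falls into the a = b^d case and the second into the a > b^d case.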
Divide and Conquer: Intro Ex
• Problem: Add n numbers $a_0, a_1, \ldots, a_{n-1}$
• Let $\sum_{i=0}^{n-1} a_i = \sum_{i=0}^{\lfloor n/2 \rfloor - 1} a_i + \sum_{i=\lfloor n/2 \rfloor}^{n-1} a_i$
• Analysis:
– Basic Op = addition
– $A(n) = 2A(n/2) + 1$
– Using the Master Theorem,
∗ $a = 2$, $b = 2$, $f(n) = 1$, $d = 0$
∗ $a > b^d$
∗ $A(n) \in \Theta(n^{\log_b a}) = \Theta(n)$
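A minimal sketch of this split, assuming a Python list as input; `dc_sum` is a hypothetical name, and the second return value counts additions so the recurrence can be checked by hand.

```python
def dc_sum(a, lo=0, hi=None):
    """Return (sum of a[lo:hi], number of additions performed)."""
    if hi is None:
        hi = len(a)
    n = hi - lo
    if n == 0:
        return 0, 0
    if n == 1:
        return a[lo], 0           # a single element needs no addition
    mid = lo + n // 2             # first half holds floor(n/2) elements
    left, adds_left = dc_sum(a, lo, mid)
    right, adds_right = dc_sum(a, mid, hi)
    return left + right, adds_left + adds_right + 1

total, additions = dc_sum(list(range(8)))
print(total, additions)  # 28 7
```

With A(1) = 0 the recurrence gives n − 1 additions, the same count as a simple loop, which illustrates that D&C need not beat brute force.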
Divide and Conquer: Merge Sort
• Problem: Given array A[0..n − 1], sort
• General approach:
1. Divide into A[0..⌊n/2⌋ − 1], A[⌊n/2⌋..n − 1]
2. Sort each subarray
3. Merge results
• Algorithm:
Alg MS(A[0 .. n - 1]) {
    if (n > 1) {
        copy A[0 .. floor(n/2) - 1] to B[0 .. floor(n/2) - 1]
        copy A[floor(n/2) .. n - 1] to C[0 .. ceiling(n/2) - 1]
        MS(B[0 .. floor(n/2) - 1])
        MS(C[0 .. ceiling(n/2) - 1])
        Merge(B, C, A)
    }
}
• Merging algorithm:
Alg Merge(B[0 .. p - 1], C[0 .. q - 1], A[0 .. p + q - 1]) {
    i <- 0
    j <- 0
    k <- 0
    while (i < p and j < q) {
        if (B[i] <= C[j]) {
            A[k] <- B[i]
            i++
        }
        else {
            A[k] <- C[j]
            j++
        }
        k++
    }
    if (i = p)
        copy C[j .. q - 1] to A[k .. p + q - 1]
    else
        copy B[i .. p - 1] to A[k .. p + q - 1]
}
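The two routines above can be sketched in Python as follows. Unlike the pseudocode, this version returns a new list instead of overwriting A, and it also returns the number of key comparisons; both choices, and the names `merge` and `merge_sort`, are assumptions made for illustration.

```python
def merge(b, c):
    """Merge two sorted lists; return (merged list, key comparisons used)."""
    merged, i, j, comparisons = [], 0, 0, 0
    while i < len(b) and j < len(c):
        comparisons += 1
        if b[i] <= c[j]:          # <= keeps the sort stable
            merged.append(b[i])
            i += 1
        else:
            merged.append(c[j])
            j += 1
    merged.extend(b[i:])          # at most one of these two is non-empty
    merged.extend(c[j:])
    return merged, comparisons

def merge_sort(a):
    """Return (sorted copy of a, total key comparisons)."""
    if len(a) <= 1:
        return list(a), 0
    mid = len(a) // 2
    b, comps_b = merge_sort(a[:mid])
    c, comps_c = merge_sort(a[mid:])
    merged, comps_m = merge(b, c)
    return merged, comps_b + comps_c + comps_m

print(merge_sort([5, 2, 4, 7, 1, 3, 2, 6])[0])  # [1, 2, 2, 3, 4, 5, 6, 7]
```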
Divide and Conquer: Merge Sort (2)
• Merge analysis:
– Basic operation: comparison
– 1 comparison per iteration
– Each iteration reduces number of elements to be processed by 1
– Worst case: neither array is exhausted until the other has a single element left: $C_{merge,w}(n) = n - 1$
• Merge sort analysis:
– $C(n) = 2C(n/2) + C_{merge}(n)$, $n > 1$
– $C(1) = 0$
– $C_w(n) = 2C_w(n/2) + n - 1$, $n > 1$
– $C_w(1) = 0$
– Using Master Theorem, with a = 2, b = 2, c = 0, d = 1:
∗ $C_w(n) \in \Theta(n \log n)$
∗ For $n = 2^k$, exact solution is $C_w(n) = n \log_2 n - n + 1$
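As a quick check of the exact solution, the worst-case recurrence can be evaluated directly for powers of two and compared with the closed form. The helper name `cw` is hypothetical.

```python
def cw(n):
    """Worst-case comparisons from C_w(n) = 2*C_w(n/2) + n - 1, C_w(1) = 0."""
    return 0 if n == 1 else 2 * cw(n // 2) + n - 1

for k in range(11):
    n = 2 ** k
    # closed form n*log2(n) - n + 1, with log2(n) = k
    assert cw(n) == n * k - n + 1

print(cw(8))  # 17
```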
Divide and Conquer: Quick Sort
• The general approach:
1. Given array A[0..n− 1], sort by partitioning into 2 sets:
(a) Those elements greater than arbitrary element A[s]
(b) Those less than A[s]
2. Putting A[s] in its proper position yields A[0] ... A[s − 1], A[s], A[s + 1] ... A[n − 1], with no element left of A[s] greater than it and no element right of A[s] less than it
3. Then recursively sort each of the 2 partitions
• Algorithm:
Alg QS(A[l .. r]) {
    //Input: Array A[l..r] where l and r are indices of lower and upper bounds of a partition
    if (l < r) {
        s <- partition(A[l..r])
        QS(A[l..s - 1])
        QS(A[s + 1..r])
    }
}
• Partitioning
– Need to select element about which to create the partition
∗ Called the pivot
∗ Select element A[l]
– To create the partition:
1. Scan left-to-right and right-to-left from each end
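2. Stop the left-to-right scan at an element in the lower half greater than or equal to the pivot, and the right-to-left scan at an element in the upper half less than or equal to the pivot
3. 3 situations can arise:
(a) Pointers not crossed: swap elements and continue
(b) Pointers have crossed: swap pivot with A[j] and stop
(c) Pointers at same point (value equal to pivot): partition is complete with s = i = j; can be handled like (b), since swapping the pivot with A[j] exchanges equal values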
Divide and Conquer: Quick Sort (2)
– Algorithm:
Partition (A[l..r]) {
    //Input: Array A[l..r] where l and r are indices of lower and upper bounds of a partition
    //Output: Final position of the pivot
    p <- A[l]
    i <- l
    j <- r + 1
    repeat
        repeat
            i++
        until (A[i] >= p or i >= r) //index check keeps i inside A[l..r]
        repeat
            j--
        until (A[j] <= p)
        swap (A[i], A[j])
    until (i >= j)
    swap (A[i], A[j]) //undoes last swap
    swap (A[l], A[j])
    return j
}
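A minimal Python sketch of the same two-pointer partition and the recursive sort, assuming an in-place sort of a Python list. It leaves the loop before swapping once the pointers meet or cross, so the "undo" swap of the pseudocode is not needed; that restructuring and the names `partition` and `quick_sort` are choices made here.

```python
def partition(a, l, r):
    """Partition a[l..r] around the pivot a[l]; return the pivot's final index."""
    p = a[l]
    i, j = l, r + 1
    while True:
        i += 1
        while i < r and a[i] < p:   # left-to-right scan, bounded by r
            i += 1
        j -= 1
        while a[j] > p:             # right-to-left scan; a[l] == p stops it
            j -= 1
        if i >= j:                  # pointers met or crossed
            break
        a[i], a[j] = a[j], a[i]
    a[l], a[j] = a[j], a[l]         # put the pivot in its final position
    return j

def quick_sort(a, l=0, r=None):
    """Sort list a in place and return it."""
    if r is None:
        r = len(a) - 1
    if l < r:
        s = partition(a, l, r)
        quick_sort(a, l, s - 1)
        quick_sort(a, s + 1, r)
    return a

print(quick_sort([5, 3, 1, 9, 8, 2, 4, 7]))  # [1, 2, 3, 4, 5, 7, 8, 9]
```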
• Analysis:
1. Best case:
– Partitions are equal with n/2 elements
– Number of comparisons (in partitioning) = n + 1 if pointers cross, n otherwise
– $C_b(n) = 2C_b(n/2) + n$, $n > 1$
– $C_b(1) = 0$
– By the Master Theorem, $C_b(n) \in \Theta(n \log_2 n)$
2. Worst case:
– One partition always empty:
∗ Perform n+ 1 comparisons
∗ Swap A[0] with itself
∗ Recursively call QS with partitions of 0 and n− 1 elements
– $C_w(n) = (n + 1) + n + \cdots + 3 = \frac{(n+1)(n+2)}{2} - 3 \in \Theta(n^2)$
3. Average case:
– Assume split can occur at any position $s$, $0 \le s \le n - 1$, with same probability $1/n$
– $C_{avg}(n) = \frac{1}{n} \sum_{s=0}^{n-1} [(n + 1) + C_{avg}(s) + C_{avg}(n - 1 - s)]$, $n > 1$
– $C_{avg}(0) = C_{avg}(1) = 0$
– $C_{avg}(n) \approx 2n \ln n \approx 1.39\, n \log_2 n$
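The average-case recurrence can be evaluated numerically as a sanity check. The sketch below uses the fact that the two recursive terms range over the same values, so the sum equals (n + 1) + (2/n) times the sum of the earlier values; `c_avg_table` is a hypothetical helper name.

```python
import math

def c_avg_table(n_max):
    """Return a list c with c[n] = C_avg(n) for 0 <= n <= n_max."""
    c = [0.0] * (n_max + 1)
    running = 0.0                 # sum of c[0] .. c[n-1]
    for n in range(1, n_max + 1):
        if n > 1:
            c[n] = (n + 1) + 2.0 * running / n
        running += c[n]
    return c

c = c_avg_table(1000)
print(c[2], c[3])                          # 3.0 6.0
print(c[1000] / (2 * 1000 * math.log(1000)))
```

The printed ratio is below 1 and approaches 1 only slowly as n grows, because of the lower-order terms hidden by the approximation.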
Divide and Conquer: Binary Search
• Algorithm:
Alg BS (A[0..n - 1], k) {
    //Input: Sorted array A and key k
    //Output: Index of k in A, or -1 if k is not in A
    l <- 0
    r <- n - 1
    while (l <= r) {
        m <- floor((l + r) / 2)
        if (k = A[m])
            return m
        else if (k < A[m])
            r <- m - 1
        else
            l <- m + 1
    }
    return -1
}
• Analysis:
– Basic operation is comparison
– Assume relation between k and A[m] determined by 1 comparison
– Worst case:
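∗ $k \notin A$ or k is last element checked
∗ $C_w(n) = C_w(\lfloor n/2 \rfloor) + 1$, $n > 1$
∗ $C_w(1) = 1$
∗ Assuming $n = 2^k$, $C_w(2^k) = k + 1 = \log_2 n + 1$
∗ For arbitrary n, $C_w(n) = \lfloor \log_2 n \rfloor + 1 = \lceil \log_2 (n + 1) \rceil$
· Proof:
1. Let $n = 2i$, $i > 0$
$C_w(n) = \lfloor \log_2 n \rfloor + 1$
$= \lfloor \log_2 2i \rfloor + 1$
$= \lfloor \log_2 2 + \log_2 i \rfloor + 1$
$= 1 + \lfloor \log_2 i \rfloor + 1$
$= \lfloor \log_2 i \rfloor + 2$
2. From earlier:
$C_w(\lfloor n/2 \rfloor) + 1 = C_w(\lfloor 2i/2 \rfloor) + 1$
$= C_w(i) + 1$
$= (\lfloor \log_2 i \rfloor + 1) + 1$
$= \lfloor \log_2 i \rfloor + 2$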
Divide and Conquer: Binary Search (2)
– Avg case:
∗ $C_{avg}(n) \approx \log_2 n$
– Binary search is optimal for searches based on key comparisons
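A minimal Python sketch of the algorithm, assuming a sorted list; it also returns the number of three-way comparisons so the worst-case formula can be checked. The name `binary_search` and the tuple return value are assumptions made for illustration.

```python
def binary_search(a, k):
    """Return (index of k in sorted list a, or -1; three-way comparisons used)."""
    l, r, comparisons = 0, len(a) - 1, 0
    while l <= r:
        m = (l + r) // 2
        comparisons += 1          # one three-way comparison of k with a[m]
        if k == a[m]:
            return m, comparisons
        if k < a[m]:
            r = m - 1
        else:
            l = m + 1
    return -1, comparisons

# A key larger than every element is a worst case:
# the count should equal floor(log2 n) + 1, which is n.bit_length().
for n in range(1, 200):
    assert binary_search(list(range(n)), n) == (-1, n.bit_length())

print(binary_search([1, 3, 5, 7, 9], 7))  # (3, 2)
```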
Divide and Conquer: Binary Tree Height
• Height of binary tree is length of longest path from root to leaf
• Height of empty tree = -1
• Algorithm:
Alg BTH (T) {
    //Input: Binary tree T
    //Output: Height of T
    if (T == NULL)
        return -1
    else
        return max(BTH(T_left), BTH(T_right)) + 1
}
• Analysis:
– Size: $n(T)$ - number of nodes in tree
– Basic op: addition or comparison
– $A(n(T)) = A(n(T_L)) + A(n(T_R)) + 1$, $n(T) > 0$; $A(0) = 0$
– Analysis based on extended tree
∗ Represent empty subtree with special node called external node
∗ Extended tree consists of internal (regular) nodes and external nodes
– Algorithm executes one addition for each internal node
∗ Since n internal nodes, A(n) = n
– Most frequently executed operation is comparison (if (T == NULL))
∗ Execute 1 comparison for each internal node and for each external node
Divide and Conquer: Binary Tree Height (2)
– How many external nodes for tree with n internal nodes?
∗ By examination, appears $x = n + 1$
∗ Proof:
1. Base case: When n = 0, x = 1; trivial
2. Hypothesis: For any tree with $j \le k$ internal nodes, $x = j + 1$
3. Inductive part:
(a) Consider tree T with k + 1 internal nodes
(b) T consists of a root r, and left and right subtrees $T_L$ and $T_R$
(c) Let $k_L$ and $k_R$ be the numbers of internal nodes of $T_L$ and $T_R$
(d) $k_L + k_R = k$
(e) $k_L \le k$; $k_R \le k$
(f) By hypothesis, $x_L = k_L + 1$ and $x_R = k_R + 1$
(g) Number of external nodes in tree must be $x = x_L + x_R = k_L + 1 + k_R + 1 = (k + 1) + 1$
4. Given n + 1 external nodes in tree with n internal nodes, comparisons = n + (n + 1) = 2n + 1
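The two counts can be observed with a small sketch, assuming a hypothetical `Node` class in which `None` stands for the empty tree (the external nodes of the extended tree). The `counts` dictionary is added only for instrumentation.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def height(t, counts):
    """Return the height of tree t, tallying comparisons and additions."""
    counts["comparisons"] += 1    # the test for the empty tree
    if t is None:
        return -1
    counts["additions"] += 1      # the "+ 1" executed for an internal node
    return max(height(t.left, counts), height(t.right, counts)) + 1

# A chain of n = 3 internal nodes.
tree = Node(left=Node(left=Node()))
counts = {"comparisons": 0, "additions": 0}
print(height(tree, counts), counts)
# 2 {'comparisons': 7, 'additions': 3}
```

For n = 3 this gives 2n + 1 = 7 comparisons and n = 3 additions, matching the analysis.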
Divide and Conquer: Integer Multiplication
• Consider multiplication of 2 2-digit decimal numbers, c = a ∗ b:
$$\begin{array}{rrr} & a_1 & a_0 \\ \times & b_1 & b_0 \\ \hline & a_1 b_0 & a_0 b_0 \\ a_1 b_1 & a_0 b_1 & \\ \hline a_1 b_1 & (a_1 b_0 + a_0 b_1) & a_0 b_0 \end{array}$$
• Expressing as sums of powers of 10:
$a * b = c = (a_1 10^1 + a_0 10^0) * (b_1 10^1 + b_0 10^0) = a_1 b_1 10^2 + (a_1 b_0 + a_0 b_1) 10^1 + a_0 b_0 10^0$
• Requires 4 multiplications and 3 additions (excluding multiplication by $10^i$)
• If n = number of digits, multiplication $\in \Theta(n^2)$
• Let $c_0 = a_0 b_0$, $c_2 = a_1 b_1$
• Note that $a_1 b_0 + a_0 b_1 = (a_1 + a_0)(b_1 + b_0) - (a_0 b_0 + a_1 b_1) = c_1$
• Then $c = c_2 10^2 + c_1 10^1 + c_0 10^0$
– Since $a_0 b_0$ and $a_1 b_1$ are already calculated as $c_0$ and $c_2$, this requires 1 fewer multiplication, at the expense of more addition/subtraction
– Requires 3 multiplications and 6 additions/subtractions
• Recursive approach using this algorithm:
– Let $n = 2^k$
– $a = a_1 10^{n/2} + a_0$, $b = b_1 10^{n/2} + b_0$, where $a_1$ represents the n/2 leftmost digits of a, $a_0$ the rightmost n/2 digits
– Then $c = a * b = c_2 10^n + c_1 10^{n/2} + c_0$ where $c_i$ as defined above
Divide and Conquer: Integer Multiplication (2)
• Example
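$1234 \times 3215 = (12 \times 10^2 + 34) \times (32 \times 10^2 + 15)$
$= (12 \times 32) 10^4 + [(12 + 34) \times (32 + 15) - (12 \times 32 + 34 \times 15)] 10^2 + 34 \times 15$
$= (12 \times 32) 10^4 + [(46 \times 47) - (12 \times 32 + 34 \times 15)] 10^2 + 34 \times 15$
$= 384 \times 10^4 + (2162 - (384 + 510)) 10^2 + 510$
$= 3840000 + 126800 + 510$
$= 3967310$

$12 \times 32 = (1 \times 10 + 2) \times (3 \times 10 + 2)$
$= (1 \times 3) 10^2 + [(1 + 2) \times (3 + 2) - (1 \times 3 + 2 \times 2)] 10 + (2 \times 2)$
$= 300 + (3 \times 5 - 7) 10 + 4$
$= 300 + 80 + 4$
$= 384$

$34 \times 15 = (3 \times 10 + 4) \times (1 \times 10 + 5)$
$= (3 \times 1) 10^2 + [(3 + 4) \times (1 + 5) - (3 \times 1 + 4 \times 5)] 10 + 4 \times 5$
$= 300 + (7 \times 6 - 23) 10 + 20$
$= 300 + 190 + 20$
$= 510$

$46 \times 47 = (4 \times 10 + 6) \times (4 \times 10 + 7)$
$= (4 \times 4) 10^2 + [(4 + 6) \times (4 + 7) - (4 \times 4 + 6 \times 7)] 10 + 6 \times 7$
$= 1600 + (10 \times 11 - 58) 10 + 42$
$= 1600 + 520 + 42$
$= 2162$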
Divide and Conquer: Integer Multiplication (3)
• Analysis
– $M(n) = 3M(n/2)$, $n > 1$; $M(1) = 1$
– Using backwards substitution, $M(2^k) = 3^i M(2^{k-i}) = 3^k M(2^{k-k}) = 3^k$
– $k = \log_2 n$, so $M(n) = 3^{\log_2 n} = n^{\log_2 3}$
– Since $\log_2 3 < 2$, this is more efficient than brute force approach
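The three-multiplication scheme can be sketched for non-negative Python integers as follows. The notes assume the digit count is a power of 2; this sketch instead splits at half of the longer operand's digit count, which is an assumption made here so that arbitrary lengths work. The function name `karatsuba` is also an assumption (the notes do not name the method).

```python
def karatsuba(a: int, b: int) -> int:
    """Multiply non-negative integers using 3 recursive multiplications."""
    if a < 10 or b < 10:
        return a * b                      # single-digit base case
    half = max(len(str(a)), len(str(b))) // 2
    base = 10 ** half
    a1, a0 = divmod(a, base)              # a = a1 * 10^half + a0
    b1, b0 = divmod(b, base)              # b = b1 * 10^half + b0
    c2 = karatsuba(a1, b1)
    c0 = karatsuba(a0, b0)
    c1 = karatsuba(a1 + a0, b1 + b0) - (c2 + c0)
    return c2 * base * base + c1 * base + c0

print(karatsuba(1234, 3215))  # 3967310
```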
Divide and Conquer: Strassen’s Algorithm
• Consider 2 n× n matrices A and B
• Let C = A ∗B
• Then $c_{ij} = \sum_{k=0}^{n-1} a_{ik} * b_{kj}$
• For a 2× 2 matrix, this requires 8 multiplications and 4 additions
• Strassen’s algorithm improves on this
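• $\begin{bmatrix} c_{00} & c_{01} \\ c_{10} & c_{11} \end{bmatrix} = \begin{bmatrix} a_{00} & a_{01} \\ a_{10} & a_{11} \end{bmatrix} * \begin{bmatrix} b_{00} & b_{01} \\ b_{10} & b_{11} \end{bmatrix} = \begin{bmatrix} m_1 + m_4 - m_5 + m_7 & m_3 + m_5 \\ m_2 + m_4 & m_1 + m_3 - m_2 + m_6 \end{bmatrix}$
• where
1. $m_1 = (a_{00} + a_{11}) * (b_{00} + b_{11})$
2. $m_2 = (a_{10} + a_{11}) * b_{00}$
3. $m_3 = a_{00} * (b_{01} - b_{11})$
4. $m_4 = a_{11} * (b_{10} - b_{00})$
5. $m_5 = (a_{00} + a_{01}) * b_{11}$
6. $m_6 = (a_{10} - a_{00}) * (b_{00} + b_{01})$
7. $m_7 = (a_{01} - a_{11}) * (b_{10} + b_{11})$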
• This requires only 7 multiplications, but 18 additions/subtractions
• For large arrays, can apply this approach recursively:
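– Let $n = 2^k$
– Divide each matrix into 4 submatrices, each of size $\frac{n}{2} \times \frac{n}{2}$
– Then $\begin{bmatrix} C_{00} & C_{01} \\ C_{10} & C_{11} \end{bmatrix} = \begin{bmatrix} A_{00} & A_{01} \\ A_{10} & A_{11} \end{bmatrix} * \begin{bmatrix} B_{00} & B_{01} \\ B_{10} & B_{11} \end{bmatrix}$
– Each $C_{ij}$ can be computed using $M_i$, where the $M_i$ correspond to the $m_i$ in Strassen's algorithm, as applied to the submatrices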
Divide and Conquer: Strassen’s Algorithm (2)
• Analysis:
– $M(n) = 7M(n/2)$, $n > 1$; $M(1) = 1$
– Let $n = 2^k$
– Using backwards substitution, $M(n) = \ldots = 7^i M(2^{k-i}) = \ldots = 7^k M(2^{k-k}) = 7^k = 7^{\log_2 n} = n^{\log_2 7} < n^3$
– Must also consider additions
– $A(n) = 7A(n/2) + 18(n/2)^2$, $n > 1$; $A(1) = 0$
– By the Master Theorem (a = 7, b = 2, d = 2), $A(n) \in \Theta(n^{\log_2 7})$
– This is same growth as M(n)
Divide and Conquer: Strassen’s Algorithm (3)
• Example:
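$C = A \times B = \begin{bmatrix} 3 & 1 & 0 & 4 \\ 2 & 1 & 1 & 2 \\ 1 & 3 & 1 & 2 \\ 1 & 1 & 1 & 1 \end{bmatrix} \times \begin{bmatrix} 2 & 2 & 2 & 2 \\ 0 & 1 & 2 & 1 \\ 2 & 1 & 0 & 3 \\ 1 & 3 & 3 & 1 \end{bmatrix}$

$A_{00} = \begin{bmatrix} 3 & 1 \\ 2 & 1 \end{bmatrix}$, $A_{01} = \begin{bmatrix} 0 & 4 \\ 1 & 2 \end{bmatrix}$, $A_{10} = \begin{bmatrix} 1 & 3 \\ 1 & 1 \end{bmatrix}$, $A_{11} = \begin{bmatrix} 1 & 2 \\ 1 & 1 \end{bmatrix}$

$B_{00} = \begin{bmatrix} 2 & 2 \\ 0 & 1 \end{bmatrix}$, $B_{01} = \begin{bmatrix} 2 & 2 \\ 2 & 1 \end{bmatrix}$, $B_{10} = \begin{bmatrix} 2 & 1 \\ 1 & 3 \end{bmatrix}$, $B_{11} = \begin{bmatrix} 0 & 3 \\ 3 & 1 \end{bmatrix}$

$M_1 = \begin{bmatrix} 4 & 3 \\ 3 & 2 \end{bmatrix} \times \begin{bmatrix} 2 & 5 \\ 3 & 2 \end{bmatrix} = \begin{bmatrix} 17 & 26 \\ 12 & 19 \end{bmatrix}$

$M_2 = \begin{bmatrix} 2 & 5 \\ 2 & 2 \end{bmatrix} \times \begin{bmatrix} 2 & 2 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 4 & 9 \\ 4 & 6 \end{bmatrix}$

$M_3 = \begin{bmatrix} 3 & 1 \\ 2 & 1 \end{bmatrix} \times \begin{bmatrix} 2 & -1 \\ -1 & 0 \end{bmatrix} = \begin{bmatrix} 5 & -3 \\ 3 & -2 \end{bmatrix}$

$M_4 = \begin{bmatrix} 1 & 2 \\ 1 & 1 \end{bmatrix} \times \begin{bmatrix} 0 & -1 \\ 1 & 2 \end{bmatrix} = \begin{bmatrix} 2 & 3 \\ 1 & 1 \end{bmatrix}$

$M_5 = \begin{bmatrix} 3 & 5 \\ 3 & 3 \end{bmatrix} \times \begin{bmatrix} 0 & 3 \\ 3 & 1 \end{bmatrix} = \begin{bmatrix} 15 & 14 \\ 9 & 12 \end{bmatrix}$

$M_6 = \begin{bmatrix} -2 & 2 \\ -1 & 0 \end{bmatrix} \times \begin{bmatrix} 4 & 4 \\ 2 & 2 \end{bmatrix} = \begin{bmatrix} -4 & -4 \\ -4 & -4 \end{bmatrix}$

$M_7 = \begin{bmatrix} -1 & 2 \\ 0 & 1 \end{bmatrix} \times \begin{bmatrix} 2 & 4 \\ 4 & 4 \end{bmatrix} = \begin{bmatrix} 6 & 4 \\ 4 & 4 \end{bmatrix}$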
Divide and Conquer: Strassen’s Algorithm (4)
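$C_{00} = M_1 + M_4 - M_5 + M_7 = \begin{bmatrix} 10 & 19 \\ 8 & 12 \end{bmatrix}$

$C_{01} = M_3 + M_5 = \begin{bmatrix} 20 & 11 \\ 12 & 10 \end{bmatrix}$

$C_{10} = M_2 + M_4 = \begin{bmatrix} 6 & 12 \\ 5 & 7 \end{bmatrix}$

$C_{11} = M_1 + M_3 - M_2 + M_6 = \begin{bmatrix} 14 & 10 \\ 7 & 7 \end{bmatrix}$

$C = \begin{bmatrix} C_{00} & C_{01} \\ C_{10} & C_{11} \end{bmatrix} = \begin{bmatrix} 10 & 19 & 20 & 11 \\ 8 & 12 & 12 & 10 \\ 6 & 12 & 14 & 10 \\ 5 & 7 & 7 & 7 \end{bmatrix}$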
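The recursive scheme can be sketched in plain Python for square matrices whose size is a power of 2, stored as lists of rows. The helper names (`add`, `sub`, `split`, `strassen`) are assumptions for illustration, and the recursion goes all the way down to 1 × 1 blocks, which a practical implementation would probably avoid.

```python
def add(x, y):
    return [[p + q for p, q in zip(r, s)] for r, s in zip(x, y)]

def sub(x, y):
    return [[p - q for p, q in zip(r, s)] for r, s in zip(x, y)]

def split(m):
    """Return the four (n/2 x n/2) quadrants m00, m01, m10, m11."""
    h = len(m) // 2
    return ([row[:h] for row in m[:h]], [row[h:] for row in m[:h]],
            [row[:h] for row in m[h:]], [row[h:] for row in m[h:]])

def strassen(a, b):
    """Multiply n x n matrices (n a power of 2) with 7 recursive products."""
    if len(a) == 1:
        return [[a[0][0] * b[0][0]]]
    a00, a01, a10, a11 = split(a)
    b00, b01, b10, b11 = split(b)
    m1 = strassen(add(a00, a11), add(b00, b11))
    m2 = strassen(add(a10, a11), b00)
    m3 = strassen(a00, sub(b01, b11))
    m4 = strassen(a11, sub(b10, b00))
    m5 = strassen(add(a00, a01), b11)
    m6 = strassen(sub(a10, a00), add(b00, b01))
    m7 = strassen(sub(a01, a11), add(b10, b11))
    c00 = add(sub(add(m1, m4), m5), m7)
    c01 = add(m3, m5)
    c10 = add(m2, m4)
    c11 = add(sub(add(m1, m3), m2), m6)
    top = [l + r for l, r in zip(c00, c01)]      # glue quadrants row by row
    bottom = [l + r for l, r in zip(c10, c11)]
    return top + bottom

A = [[3, 1, 0, 4], [2, 1, 1, 2], [1, 3, 1, 2], [1, 1, 1, 1]]
B = [[2, 2, 2, 2], [0, 1, 2, 1], [2, 1, 0, 3], [1, 3, 3, 1]]
print(strassen(A, B))
# [[10, 19, 20, 11], [8, 12, 12, 10], [6, 12, 14, 10], [5, 7, 7, 7]]
```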
Divide and Conquer: Closest Pair Problem
• Basic approach:
1. Divide points into 2 equal sets
2. Find closest pair in each set
3. Return smaller of 2 values
• Assume points are ordered by x coordinate
• Let c be median x value
• All points to left of x = c lie in subset $S_1$, those to right in $S_2$
• Let $d_1$ be smallest distance in $S_1$, $d_2$ in $S_2$
• Then $d = \min(d_1, d_2)$ is distance between closest pair within $S_1$ or within $S_2$, but not necessarily the closest distance between 2 points in S
• 2 points, one on each side of x = c, could be closer than d
• To determine if there is such a pair, we need only consider points whose x coordinate is within c ± d
Divide and Conquer: Closest Pair Problem (2)
• For such a point P(x, y) (assume on side $S_1$), there can be at most 6 points in $S_2$ within y ± d of P
• Therefore, to combine solutions to recursive call:
1. Find $d = \min(d_1, d_2)$
2. Find points in $S_1$ and $S_2$ within c ± d
3. Based on y coordinates, walk through one set (say those from $S_1$), finding distance between each point and those within y ± d from $S_2$
4. Compare each distance to d
• Analysis:
– $T(n) = 2T(n/2) + C(n)$
– C(n) is cost to combine results
– $C(n) \in \Theta(n)$
– By the Master Theorem (a = 2, b = 2, d = 1), $T(n) \in O(n \log n)$
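A simplified Python sketch of the approach, assuming points are (x, y) tuples and at least two are given. It differs from the notes in two ways, both chosen here for brevity: the strip is re-sorted by y at every level, which costs an extra log factor compared with the Θ(n) combine step above, and it scans the merged strip rather than walking $S_1$ against $S_2$. The function names are hypothetical.

```python
import math

def closest_pair(points):
    """Return the smallest distance between any two of the given points."""
    if len(points) < 2:
        raise ValueError("need at least two points")
    return _closest(sorted(points))          # order by x coordinate

def _brute(pts):
    return min(math.dist(pts[i], pts[j])
               for i in range(len(pts)) for j in range(i + 1, len(pts)))

def _closest(px):
    n = len(px)
    if n <= 3:
        return _brute(px)
    mid = n // 2
    c = px[mid][0]                           # median x value
    d = min(_closest(px[:mid]), _closest(px[mid:]))
    # Only points within d of the line x = c can form a closer pair.
    strip = sorted((p for p in px if abs(p[0] - c) < d), key=lambda p: p[1])
    for i in range(len(strip)):
        j = i + 1
        # Compare only with points less than d above strip[i].
        while j < len(strip) and strip[j][1] - strip[i][1] < d:
            d = min(d, math.dist(strip[i], strip[j]))
            j += 1
    return d

print(closest_pair([(0, 0), (3, 4), (1, 1), (10, 10), (1.5, 1)]))  # 0.5
```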
Divide and Conquer: Convex Hull (Quick Hull)
• Given n points, ordered by x coordinate (and secondarily by y)
• Leftmost and rightmost points ($P_1$ and $P_n$) lie on convex hull
• Let line $P_1 P_n$ divide points into sets $S_1$ (to left/above) and $S_2$ (to right/below)
• To recursively generate convex hull (applied to both $S_1$ and $S_2$):
– Find point $P_{max}$ farthest from $P_1 P_n$
– If a tie, choose point that maximizes $\angle P_1 P_{max} P_n$
– Partition $S_1$ into 3 sets:
1. $S_{11}$: points to left of $P_1 P_{max}$
2. $S_{12}$: points to left of $P_{max} P_n$
3. Those points in $\triangle P_1 P_{max} P_n$
– $P_{max}$ lies on convex hull
– Recurse on $S_{11}$ and $S_{12}$
• To determine whether a point lies to left or right of a line:
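– Calculate the determinant $\begin{vmatrix} x_1 & y_1 & 1 \\ x_2 & y_2 & 1 \\ x_3 & y_3 & 1 \end{vmatrix}$, where $P_i = (x_i, y_i)$
– $P_3$ is to left of $P_1 P_2$ if determinant > 0
• Worst case $\in \Theta(n^2)$
• Best case $\in \Theta(n \log n)$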
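A minimal Python sketch of Quick Hull built on the determinant test, assuming points are (x, y) tuples. The determinant is proportional to the distance of $P_3$ from the line, so it also serves to pick the farthest point. Ties are not broken by angle here, so points lying on a hull edge may be reported; the function names are hypothetical.

```python
def cross(p1, p2, p3):
    """Expanded 3x3 determinant: positive when p3 lies left of the line p1 -> p2."""
    return ((p2[0] - p1[0]) * (p3[1] - p1[1])
            - (p2[1] - p1[1]) * (p3[0] - p1[0]))

def _hull_side(p1, pn, pts):
    """Hull vertices strictly left of p1 -> pn, ordered from p1 towards pn."""
    left = [p for p in pts if cross(p1, pn, p) > 0]
    if not left:
        return []
    pmax = max(left, key=lambda p: cross(p1, pn, p))   # farthest from the line
    # Points inside triangle p1, pmax, pn are dropped by the two recursive filters.
    return _hull_side(p1, pmax, left) + [pmax] + _hull_side(pmax, pn, left)

def quickhull(points):
    """Return hull vertices starting at the leftmost point, in clockwise order."""
    pts = sorted(set(points))                # order by x, then by y
    if len(pts) < 3:
        return pts
    p1, pn = pts[0], pts[-1]
    upper = _hull_side(p1, pn, pts)          # S1: left of p1 -> pn
    lower = _hull_side(pn, p1, pts)          # S2: left of pn -> p1
    return [p1] + upper + [pn] + lower

pts = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 2), (1, 3), (3, 1)]
print(quickhull(pts))  # [(0, 0), (0, 4), (4, 4), (4, 0)]
```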