introduction to algorithms lecture 3 – divide and conquer

Introduction to Algorithms

Lecture 3 – Divide and Conquer

The Methodology
• Divide-and-conquer is a useful paradigm.
• Main idea:
– Divide the problem into k ≥ 2 sub-problems.
– Solve the sub-problems, usually recursively.
– Combine the solutions of the sub-problems to arrive at a solution to the original problem.

The Methodology

• When recursion is used to solve the subproblems, the running-time analysis is also natural.
• Let the size of the original problem be n, and let T(n) be the time taken to solve it.
• D(n): time taken to divide the problem. Let there be k subproblems of sizes n_1, n_2, …, n_k.
• To solve the i-th subproblem, the time taken would be T(n_i).
• C(n): time taken to combine the solutions.
• The resulting recurrence is

$$T(n) = \sum_{i=1}^{k} T(n_i) + D(n) + C(n)$$

Examples
• Merge sort
– Division is easy: D(n) = O(1). We get two subproblems of size n/2 each.
– Each subproblem takes T(n/2) time.
– The combine (merge) step takes O(n) time.
– The recurrence relation is T(n) = 2T(n/2) + O(n).
– The solution, by the Master theorem, is T(n) = O(n log n).
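The three steps of merge sort can be sketched in Python as follows. This is a minimal illustration rather than a tuned implementation; the function name `merge_sort` is chosen here. List slicing copies the halves, so the divide step costs O(n) in this sketch instead of O(1), which leaves the recurrence unchanged.

```python
def merge_sort(items):
    """Return a new sorted list; the running time follows T(n) = 2T(n/2) + O(n)."""
    if len(items) <= 1:              # base case: nothing to sort
        return list(items)
    mid = len(items) // 2            # divide: split at the middle index
    left = merge_sort(items[:mid])   # solve the first subproblem, T(n/2)
    right = merge_sort(items[mid:])  # solve the second subproblem, T(n/2)
    # Combine: merge the two sorted halves in O(n) time.
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])          # at most one of these two is non-empty
    merged.extend(right[j:])
    return merged
```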

Examples
• Quick sort (assuming a "good" pivot)
– Divide: uses the partition algorithm, which takes O(n) time.
– Solving the two subproblems takes 2T(n/2) time.
– Combine: nothing to do here, so O(1).
– Recurrence relation: T(n) = 2T(n/2) + O(n).
– The solution, again by the Master theorem, is T(n) = O(n log n).

Examples
• The selection algorithm from the previous lecture.
– The subproblem sizes are not equal.
– There is no combine step, but the divide step takes O(n) time.

A Complete Example
• Let us consider the multiplication of two square matrices A and B.
• Suppose we only know how to multiply scalar values.
• And we know the standard scheme of matrix multiplication.

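Matrix Multiplication
• Can use divide and conquer as follows.
• Divide each of the n x n matrices A, B, and C = A x B into four blocks of size n/2 x n/2.

[Figure: A x B = C, with A split into blocks A00, A01, A10, A11, B into B00, B01, B10, B11, and C into C00, C01, C10, C11]

• Writing Xij for the block in block-row i and block-column j:
C00 = A00 x B00 + A01 x B10
C01 = A00 x B01 + A01 x B11
C10 = A10 x B00 + A11 x B10
C11 = A10 x B01 + A11 x B11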

Matrix Multiplication
• The recurrence relation can be obtained as follows.
• Divide time is O(1).
• There are eight subproblems (block products), each of size n/2.
• Combine time is O(n^2), for the four block additions.
• The recurrence relation is T(n) = 8T(n/2) + O(n^2).
• The solution is T(n) = O(n^3).

A Better Algorithm
• Surprisingly, you can do better than O(n^3).
• The best algorithms known today run in roughly O(n^2.37) time.
• In fact, the exact complexity of the problem is still open.
• We will get somewhere close to that.

A Slight Detour - Motivation
• Consider two complex numbers a+ib and c+id that we need to multiply.
• The result is (ac-bd) + i(ad+bc).
– This requires four multiplications and two additions/subtractions.
• Suppose that multiplications are expensive compared to additions.
• Can we save one multiplication?
– We may use more additions or subtractions.

Motivation
• Consider the products P1 = ac, P2 = bd, and P3 = (a+b)(c+d).
• Now, the result of the multiplication is (P1-P2) + i(P3-P2-P1).
• We have 3 multiplications and 5 additions/subtractions.
• A similar idea can be used for matrix multiplication too.
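The three-product trick can be checked with a few lines of Python. This is only a sketch; the function name `multiply_complex` and the tuple return value are choices made here for illustration.

```python
def multiply_complex(a, b, c, d):
    """Return (real, imag) of (a + ib)(c + id) using three multiplications."""
    p1 = a * c                 # P1 = ac
    p2 = b * d                 # P2 = bd
    p3 = (a + b) * (c + d)     # P3 = ac + ad + bc + bd
    # Real part: ac - bd.  Imaginary part: P3 - P2 - P1 = ad + bc.
    return p1 - p2, p3 - p2 - p1
```

For example, `multiply_complex(1, 2, 3, 4)` gives `(-5, 10)`, which matches (1 + 2i)(3 + 4i) = -5 + 10i.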

Motivation
• Question: Verify quickly with two complex numbers of your choice.

Motivation
• Is there a practical reason to use the new approach even if multiplication is as expensive as an addition/subtraction?

A Better Algorithm
• Notice that matrix addition is easier than matrix multiplication.
• So, we can trade off some matrix multiplications for matrix additions and subtractions.

A Better Algorithm
• Let A11, A12, A21, and A22 be the four submatrices of A, each of size n/2 x n/2.
• Let B11, B12, B21, and B22 be the four submatrices of B, each of size n/2 x n/2.
• We need:
C11 = A11 x B11 + A12 x B21
C12 = A11 x B12 + A12 x B22
C21 = A21 x B11 + A22 x B21
C22 = A21 x B12 + A22 x B22

A Better Algorithm
• Strassen computes the four blocks Cij, which together involve eight products Aik x Bkj, using linear combinations of the following seven matrix products.
• P1 = A11(B12 − B22)
• P2 = (A11 + A12)B22
• P3 = (A21 + A22)B11
• P4 = A22(B21 − B11)
• P5 = (A11 + A22)(B11 + B22)
• P6 = (A12 − A22)(B21 + B22)
• P7 = (A11 − A21)(B11 + B12)

Strassen’s Algorithm
• Question: Verify that
• C11 = P5 + P4 − P2 + P6
• C12 = P1 + P2
• C21 = P3 + P4
• C22 = P1 + P5 − P3 − P7
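The seven products and the four combinations can be written out directly in Python. The sketch below assumes square matrices stored as lists of lists whose size n is a power of two; the helper names `add`, `sub`, `split` and `strassen` are chosen here for illustration. It recurses all the way down to 1 x 1 blocks, which is simple but not how a practical implementation would choose its base case.

```python
def add(X, Y):
    """Entry-wise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def sub(X, Y):
    """Entry-wise difference of two equally sized matrices."""
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def split(M):
    """Split a square matrix of even size into blocks M11, M12, M21, M22."""
    h = len(M) // 2
    return ([r[:h] for r in M[:h]], [r[h:] for r in M[:h]],
            [r[:h] for r in M[h:]], [r[h:] for r in M[h:]])

def strassen(A, B):
    """Multiply two n x n matrices (n a power of two) with seven recursive products."""
    if len(A) == 1:
        return [[A[0][0] * B[0][0]]]     # scalar multiplication
    A11, A12, A21, A22 = split(A)
    B11, B12, B21, B22 = split(B)
    P1 = strassen(A11, sub(B12, B22))
    P2 = strassen(add(A11, A12), B22)
    P3 = strassen(add(A21, A22), B11)
    P4 = strassen(A22, sub(B21, B11))
    P5 = strassen(add(A11, A22), add(B11, B22))
    P6 = strassen(sub(A12, A22), add(B21, B22))
    P7 = strassen(sub(A11, A21), add(B11, B12))
    C11 = add(sub(add(P5, P4), P2), P6)
    C12 = add(P1, P2)
    C21 = add(P3, P4)
    C22 = sub(sub(add(P1, P5), P3), P7)
    # Stitch the four blocks back into one matrix.
    top = [r1 + r2 for r1, r2 in zip(C11, C12)]
    bottom = [r1 + r2 for r1, r2 in zip(C21, C22)]
    return top + bottom
```

For example, `strassen([[1, 2], [3, 4]], [[5, 6], [7, 8]])` returns `[[19, 22], [43, 50]]`.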

A Better Algorithm
• Asymptotically, Strassen’s algorithm works better than the standard O(n^3) algorithm.
• In reality, notice that with the formulas above one has to perform 18 matrix additions/subtractions and seven matrix products at each level of the recursion.
• So, in practice, the method may be slow due to the large constants involved in the O notation.

A Better Algorithm
• Strassen saves one multiplication by introducing more additions.
• Thus, the recurrence relation would be T(n) = 7T(n/2) + O(n^2).
• Question: Solve for T(n) using the Master theorem.
• Using the Master theorem, the solution is T(n) = O(n^(log2 7)).
• Notice that log2 7 ≈ 2.81 < 3. Hence, the algorithm beats the O(n^3) time of the earlier algorithm.
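The application of the Master theorem to this recurrence can be spelled out as follows. The symbols a, b, f and ε are the usual names from the statement of the theorem and are not used elsewhere in these slides; ε = 0.8 is one convenient choice among many.

```latex
\begin{align*}
T(n) &= a\,T(n/b) + f(n), \qquad a = 7,\quad b = 2,\quad f(n) = O(n^2) \\
n^{\log_b a} &= n^{\log_2 7} \approx n^{2.807} \\
f(n) &= O\!\left(n^{\log_2 7 - \varepsilon}\right)
        \quad \text{for } \varepsilon = 0.8, \text{ since } 2 \le 2.807 - 0.8 \\
\Rightarrow\quad T(n) &= \Theta\!\left(n^{\log_2 7}\right)
\end{align*}
```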

Yet Another Example
• We will consider sorting again.
• But this time as a network/hardware for sorting.
• There are some nice principles along the way.

Comparators
• A comparator is a hardware element that takes two inputs, x and y, and produces two outputs, min(x, y) and max(x, y).
• We assume that a comparator operates in O(1) time.
• Comparators can be connected to form a network.

[Figure: a comparator with inputs x and y and outputs Min(x,y) and Max(x,y)]

[Figure: A Network]

Sorting Network
• A sorting network is a comparator network that takes n inputs and produces a sorted rearrangement of the inputs.

Sorting Network
• Size of a network: the number of comparators used.
• Depth of a network is defined as follows.
– Let the depth of an input wire be 0.
– The depth of a comparator whose inputs have depths dx and dy is max(dx, dy) + 1.
– The depth of a network is the maximum depth of a comparator in the network.
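The definitions of size and depth can be made concrete with a small Python sketch. The representation is an assumption made here: a network is a list of pairs (i, j) of wire indices, applied in order, where each comparator puts the smaller value on wire i and the larger on wire j. The function names are hypothetical.

```python
def network_size_and_depth(num_wires, comparators):
    """Return (size, depth) of a comparator network given as a list of (i, j) pairs."""
    wire_depth = [0] * num_wires               # every input wire has depth 0
    for i, j in comparators:
        d = max(wire_depth[i], wire_depth[j]) + 1
        wire_depth[i] = wire_depth[j] = d      # both outputs leave at depth d
    return len(comparators), max(wire_depth, default=0)

def run_network(values, comparators):
    """Apply the comparators in order: min goes to wire i, max to wire j."""
    out = list(values)
    for i, j in comparators:
        if out[i] > out[j]:
            out[i], out[j] = out[j], out[i]
    return out

# A small 4-input example: five comparators arranged in three layers.
FOUR_INPUT_NETWORK = [(0, 1), (2, 3), (0, 2), (1, 3), (1, 2)]
```

Here `network_size_and_depth(4, FOUR_INPUT_NETWORK)` returns `(5, 3)`, and `run_network([3, 1, 4, 2], FOUR_INPUT_NETWORK)` returns `[1, 2, 3, 4]`.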

How to Build a Sorting Network
• Bitonic sequence: a sequence that monotonically increases and then monotonically decreases, or a cyclic shift of such a sequence.
– More formally, there are two indices i and j such that A(i : j-1) is monotonically increasing and A(j : i-1) is monotonically decreasing, with indices taken cyclically.
– The indices correspond to a cyclic shift of A that yields the intuitive definition.
• Example: 11, 16, 20, 23, 34, 17, 9, 5, 1
• We first construct a sorting network that can sort input sequences that are bitonic.
• We then show how to use such a network to sort arbitrary sequences.

Sorting Network – Bitonic Sequences
• Suppose that a sequence X = x_0, x_1, …, x_{n-1} is bitonic.
• Let L(X) = {min{x_i, x_{n/2+i}}, 0 ≤ i ≤ n/2-1}.
• Let R(X) = {max{x_i, x_{n/2+i}}, 0 ≤ i ≤ n/2-1}.
• Then both L(X) and R(X) are bitonic, and every element of L(X) has a value at most the value of any element of R(X).

How to Use the Above Theorem
• The above theorem says that a bitonic sequence can be “divided” into two bitonic sequences of equal length, with
– every element of one sequence at most every element of the other.
• The theorem can be applied recursively to divide the sequences further into smaller subsequences.
• Eventually, this sorts the input bitonic sequence.
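The recursive use of the theorem can be simulated in software. The sketch below assumes the input is bitonic and that its length is a power of two; the function name `bitonic_merge` is chosen here. Each list comprehension corresponds to one layer of n/2 comparators.

```python
def bitonic_merge(seq):
    """Sort a bitonic sequence whose length is a power of two."""
    n = len(seq)
    if n <= 1:
        return list(seq)
    half = n // 2
    # One layer of comparators: wire i is compared with wire i + n/2.
    left = [min(seq[i], seq[i + half]) for i in range(half)]    # L(X)
    right = [max(seq[i], seq[i + half]) for i in range(half)]   # R(X)
    # L(X) and R(X) are bitonic and every element of L(X) is at most
    # every element of R(X), so sorting each half sorts the whole.
    return bitonic_merge(left) + bitonic_merge(right)
```

For example, on the 8-element bitonic input `[11, 16, 20, 23, 17, 9, 5, 1]` the first layer produces L(X) = `[11, 9, 5, 1]` and R(X) = `[17, 16, 20, 23]`, and the final result is `[1, 5, 9, 11, 16, 17, 20, 23]`.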

In Pictures

• Each of min(a1,a5), min(a2,a6), min(a3,a7), min(a4,a8) is no larger than each of max(a1,a5), max(a2,a6), max(a3,a7), max(a4,a8).

Put Them Together..

Bitonic Sorter
• Has a depth of O(log n).
• Has O(n log n) comparators.
• But does not work for arbitrary sequences.
• Also, we have used a property that is yet to be proved.

Sorting Network – Bitonic Sequences
• Recall the claim: if X is bitonic, then both L(X) and R(X) are bitonic, and every element of L(X) is at most every element of R(X).
• This can be shown using the unique crossover property.

Unique Crossover Property
• Let X be a bitonic sequence. The crossover property states that there exists an index i such that:
1. for any a in {x_0, x_1, …, x_{i-1}} and for any b in {x_{n/2}, x_{n/2+1}, …, x_{n/2+i-1}}, it holds that a ≤ b,
2. for any a in {x_i, x_{i+1}, …, x_{n/2-1}} and for any b in {x_{n/2+i}, x_{n/2+i+1}, …, x_{n-1}}, it holds that a > b,
3. for any a in {x_0, x_1, …, x_{i-1}} and for any b in {x_i, x_{i+1}, …, x_{n/2-1}}, it holds that a ≤ b, and
4. for any a in {x_{n/2}, x_{n/2+1}, …, x_{n/2+i-1}} and for any b in {x_{n/2+i}, x_{n/2+i+1}, …, x_{n-1}}, it holds that a ≥ b.

Unique Crossover Property
• The proof is much simpler than the statement.
• Let i be the smallest index such that x_i > x_{n/2+i}, with 0 ≤ i ≤ n/2-1.
• It then holds that x_j ≤ x_{n/2+j} for any 0 ≤ j ≤ i-1, and that x_0 ≤ x_1 ≤ ... ≤ x_{i-1} ≤ x_{n/2+i-1} ≤ x_{n/2+i-2} ≤ ... ≤ x_{n/2}.
• Question: Why?
• So property (1) holds.
• Question: Verify the other three properties.

From Bitonic to Arbitrary
• Recall that every sequence of one or two elements is also bitonic.
• So, if we can create longer bitonic sequences from these shorter bitonic sequences, we can use the earlier network.
• Question: If X and Y are two sorted sequences, how can we create a bitonic sequence with the elements of X and Y?
• Answer: Concatenate X with Reverse(Y).

The Overall Sorting Network
• Built in a bottom-up fashion as follows.
• Consider sorting n/2 bitonic sequences of length 2 each.
– Each uses the network designed earlier.
– Each pair of sorted outputs, with one of them reversed, produces a bitonic sequence of length 4.
• Now, sort the bitonic sequences of length 4, and pair up the results in the same way.
• Continue building up until there are two bitonic sequences of length n/2 each, and sort each of them.
• Combine the two sorted sequences into a single bitonic sequence of n elements.
• Finally, use the bitonic sort network.
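The whole construction can be simulated in Python. The slides describe the network bottom-up; the sketch below is the equivalent recursive (top-down) formulation, and it assumes the input length is a power of two. The names `bitonic_merge` and `bitonic_sort` are chosen here, and the merge step is repeated so that the block runs on its own.

```python
def bitonic_merge(seq):
    """Sort a bitonic sequence whose length is a power of two."""
    n = len(seq)
    if n <= 1:
        return list(seq)
    half = n // 2
    left = [min(seq[i], seq[i + half]) for i in range(half)]
    right = [max(seq[i], seq[i + half]) for i in range(half)]
    return bitonic_merge(left) + bitonic_merge(right)

def bitonic_sort(seq):
    """Sort an arbitrary sequence whose length is a power of two."""
    n = len(seq)
    if n <= 1:
        return list(seq)
    half = n // 2
    x = bitonic_sort(seq[:half])        # sorted sequence X
    y = bitonic_sort(seq[half:])        # sorted sequence Y
    bitonic = x + y[::-1]               # X followed by Reverse(Y) is bitonic
    return bitonic_merge(bitonic)       # finish with the bitonic sorter
```

For example, `bitonic_sort([3, 7, 4, 8, 6, 2, 1, 5])` returns `[1, 2, 3, 4, 5, 6, 7, 8]`. In hardware the reversal costs nothing, since it is only a matter of how the wires are connected.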

Another Example – The Closest Pair
• Consider a set of n points in k dimensions.
• Let the distance between any two points be their Euclidean distance.
• The closest pair problem is to find a pair of points whose distance is the smallest.

The Closest Pair Problem
• For any k, one can always compute the n(n-1)/2 pairwise distances and take the smallest.
• This takes time O(kn^2).
– The factor k is the time taken to compute one distance in a k-dimensional space.
• We will show today, using divide and conquer, that better solutions can be designed.

In One Dimension
• If k = 1, all points are on a line.
• One solution is to sort the points and compute the distances between adjacent points.
• This takes O(n log n) time.
• It turns out that this is also the best one can achieve.
• Unfortunately, this solution does not extend to more than one dimension.

In One Dimension
• Let us design a generic algorithm using the divide and conquer strategy that also extends to any dimension (with minimal changes).
• The divide step intuitively seems to be to find the closest pair in the first n/2 points and in the next n/2 points.
• Since the subproblems can be solved recursively, we just have to focus on the combine step.

In One Dimension
• Let dl be the shortest distance in the first subproblem.
• Let dr be the shortest distance in the second subproblem.
• The closest pair is either the pair at distance dl from the first subproblem, or the pair at distance dr from the second subproblem,
• or a pair that crosses the subproblems at a distance less than min{dl, dr}.
• How can we quickly find whether such a pair exists?

In One Dimension
• In the one-dimensional case, this is rather easy.
• Find the largest point from the first subproblem and the smallest point from the second subproblem. Call these x and y.
• If y – x < min{dl, dr}, then x and y are the closest pair.
• Otherwise, the closest pair is the one with mutual distance min{dl, dr}.

In One Dimension
• The algorithm is as follows:

Algorithm ClosestPair(A)
Begin
 1. Find the median point m.
 2. Find the closest pair among the points that are less than m, recursively.
 3. Find the closest pair among the points that are more than m, recursively.
 4. Compute dl, dr from the solutions to the problems in Steps 2 and 3 respectively.
 5. Find the closest pair, and its distance d, with one point less than m and the other more than m.
 6. Return the closest pair from Steps 2, 3, and 5.
End.
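The six steps can be sketched in Python as below. One simplification is an assumption of this sketch and not part of the slides: the points are sorted once up front so that the median is simply the middle element, instead of running a linear-time selection at every level. The names `closest_pair_1d` and `_closest` are chosen here.

```python
import math

def _closest(sorted_pts):
    """Return (distance, pair) for the closest pair in a sorted list of numbers."""
    n = len(sorted_pts)
    if n < 2:
        return math.inf, None                    # no pair exists
    mid = n // 2                                 # Step 1: the median splits the list
    left, right = sorted_pts[:mid], sorted_pts[mid:]
    best_left = _closest(left)                   # Step 2: gives dl
    best_right = _closest(right)                 # Step 3: gives dr
    best = min(best_left, best_right, key=lambda r: r[0])   # Step 4
    x, y = left[-1], right[0]                    # Step 5: largest on the left,
    cross = (y - x, (x, y))                      #         smallest on the right
    return min(best, cross, key=lambda r: r[0])  # Step 6

def closest_pair_1d(points):
    """Closest pair among numbers on a line, by divide and conquer."""
    return _closest(sorted(points))
```

For example, `closest_pair_1d([10, 2, 3])` returns `(1, (2, 3))`.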

In One Dimension
• To analyze, notice that each subproblem takes time T(n/2).
• The combine step takes O(n) time.
– Why?
• The recurrence relation is T(n) = 2T(n/2) + O(n).
• The solution is Θ(n log n), according to the Master theorem.

Extend to Two Dimensions
• Problem 1: How to divide so that each subproblem has n/2 points?
• Problem 2: How to combine the solutions to the subproblems?

Two Dimensions

• Can pick the median m of the x-coordinates and draw the vertical line x = m.

• Points to the left and to the right of this line define the subproblems.

Two Dimensions

• Can now find the closest points in each subproblem and denote the distances by dl and dr as earlier.

Two Dimensions

• How many pairs do we have to consider during the combine step?

• Ans: n^2/4

Two Dimensions
• So, the recurrence now becomes T(n) = 2T(n/2) + O(n^2),
• with T(n) = Θ(n^2).
• No savings accrue from the divide and conquer approach,
• unless there is a clever combine step.

Two Dimensions
• Notice that while there may be as many as n^2/4 pairs that are yet uncompared, there is some structure in them.
• Also, dl and dr provide additional information.
• We need to consider only pairs that are at a distance less than d* = min{dl, dr}.
• Intuition: As we move along increasing y-coordinates of points on one side, the distance from a fixed point P increases.

Two Dimensions
• Fix a point P on one side of the line x = m.
• How many points on the other side can pair up with P so that each is at a distance at most d* from P, while their pairwise distances are at least d*?
• Why is the second condition true?

Two Dimensions

• Can approximate (upper bound) the answer as follows.

• Consider a rectangle of height 2d* and width d*; at most six points with pairwise distance at least d* fit in it, placed on its boundary.
– Actually, six is quite an overcount.

Two Dimensions
• Call these the friends of P.
• For each P to the left of the line x = m, there are at most six friends.
• Can find these six distances in O(n) time.
• With some care, we can actually find these friends for all P in O(n) time.
• So, the recurrence now is T(n) = 2T(n/2) + O(n), with a solution of Θ(n log n).

Two Dimensions
• The clever step explained:
• Project all points onto the line x = m.
• Sort the points to the left of the line and to the right of the line by their y-coordinate.
• For each P on the left, find all the points on the right whose projection onto x = m is at a distance at most d* from that of P.
• To get these sorted lists, can pre-sort once.
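A compact Python sketch of the two-dimensional algorithm follows. It is a common variant, not necessarily the exact procedure of the slides: instead of pairing each left point with its friends on the right, it scans all points within d* of the dividing line in y-order, and it obtains the y-sorted lists by merging the results of the two recursive calls rather than by a separate pre-sort. The names `closest_pair_distance` and `_solve` are chosen here, and points are assumed to be (x, y) tuples. `math.dist` requires Python 3.8 or later.

```python
import heapq
import math

def _solve(px):
    """px is sorted by x. Return (smallest distance, the same points sorted by y)."""
    n = len(px)
    if n <= 3:
        # Small case: compare every pair directly.
        best = min(
            (math.dist(px[i], px[j]) for i in range(n) for j in range(i + 1, n)),
            default=math.inf,
        )
        return best, sorted(px, key=lambda p: p[1])
    mid = n // 2
    m = px[mid][0]                        # the dividing vertical line is x = m
    dl, left_y = _solve(px[:mid])
    dr, right_y = _solve(px[mid:])
    d = min(dl, dr)                       # d* in the slides
    # Merge the two y-sorted halves in O(n) instead of re-sorting.
    by_y = list(heapq.merge(left_y, right_y, key=lambda p: p[1]))
    # Only points closer than d* to the line can form a better crossing pair.
    strip = [p for p in by_y if abs(p[0] - m) < d]
    for i, p in enumerate(strip):
        j = i + 1
        # Stop as soon as the vertical gap alone reaches d.
        while j < len(strip) and strip[j][1] - p[1] < d:
            d = min(d, math.dist(p, strip[j]))
            j += 1
    return d, by_y

def closest_pair_distance(points):
    """Smallest Euclidean distance between two of the given 2-D points."""
    return _solve(sorted(points))[0]
```

For example, `closest_pair_distance([(0, 0), (3, 4), (1, 1), (7, 7), (1.5, 1)])` returns `0.5`, the distance between (1, 1) and (1.5, 1).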

Higher Dimensions
• The same solution extends to higher dimensions.
• Notice that the combine step operates in one dimension smaller than the input dimension.
• However, the runtime will be O(n log^(k-1) n) in k dimensions.
• Can be improved to O(n log n) with more ideas.

Another Example – Skyline Points

• A point P dominates a point Q if the x- and the y-coordinates of P are larger than those of Q.

• Points that are NOT dominated by any other point are said to be maximal points, or skyline points.

Skyline Points
• Given a set of n points in a two-dimensional space, find all the skyline (maximal) points.
• Use divide and conquer:
– Divide the points into two equal-sized sets.
– Solve each part recursively.
– Combine the solutions (skylines).

Skyline Points
• Slight asymptotic improvements are possible.
• Solve the right subproblem first.
• Filter out the points in the left subproblem that are dominated by skyline points in the right subproblem.
• Solve only the remaining part of the left subproblem.
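The basic divide, solve, combine scheme for skyline points can be sketched as follows. The sketch makes two assumptions that the slides do not state: the points are split by x-coordinate after one initial sort, and all x-coordinates and all y-coordinates are distinct, so ties need no special handling. It implements the plain version, not the filtered improvement. The names `skyline` and `_skyline` are chosen here.

```python
def _skyline(pts):
    """pts is sorted by x. Return its maximal points, still sorted by x."""
    if len(pts) <= 1:
        return list(pts)
    mid = len(pts) // 2                  # divide into two equal-sized sets
    left = _skyline(pts[:mid])           # solve each part recursively
    right = _skyline(pts[mid:])
    # Combine: every right point has a larger x than every left point, so a
    # left skyline point survives only if it is higher than all right points.
    top_right = max(y for _, y in right)
    return [p for p in left if p[1] > top_right] + right

def skyline(points):
    """Maximal (skyline) points of a set with distinct x and distinct y values."""
    return _skyline(sorted(points))
```

For example, `skyline([(4, 1), (2, 3), (5, 2), (1, 5), (3, 4)])` returns `[(1, 5), (3, 4), (5, 2)]`.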
