Amit Shekhar
ID: 108266469
NetID: ashekhar
CSE 548 – Analysis of Algorithms
Solution to Assignment #2
Due Thursday, October 27, 2011
Discussed at Office hours and with Ayon Chakraborty and Amit Arya
Problem 1
In class we saw how to construct circuits for addition of two n-bit numbers with depths of
O(log n) and O(1). In this problem we look at how to design circuits for multiplication.
Note: You may assume that the size of an addition circuit (both for standard and nonunique
representation) is Θ(n).
(a) Design a naïve circuit that has depth O(log^2 n). What is the size of this circuit? Explain
how to compute the size.
Solution:
From class, we know that a circuit for addition of two n-bit numbers has depth O(log n)
using the standard representation.
The recurrence relations for n-bit addition:
Depth: T(n) = T(n/2) + 1 =⇒ O(log n)
Size: S(n) = 3S(n/2) + n =⇒ (3/2)^{log n} · n =⇒ O(n^{log 3})
where the n term is the work done at the multiplexer, and the factor 3 comes from the decision to
compute the more significant half twice: once for the case C_{n/2} = 1 and once for the case
C_{n/2} = 0.
Here we seek the product of two n-bit numbers, which reduces to adding n numbers of 2n bits
each. This mirrors the simple grade-school multiplication: each bit of one number is multiplied
with the multiplicand, and each intermediate result is shifted left by one more position than the
previous one. Since we multiply by n bits, the last intermediate result is shifted up to n positions
to the left, making the length of each number to be added at most 2n. Please note the shorter
numbers can be zero-padded.
Note that we have already established that a circuit for addition of two n-bit numbers has depth
O(log n).
Method:
Now we pair up the intermediate results and add each pair, and repeat this process until we get
the final value. This is analogous to the MERGE part of a divide and conquer algorithm, giving
a tree of additions of depth log n. Since each addition itself has depth O(log n),
the depth of the entire circuit becomes O(log^2 n).
Size Computation:
By the note in the problem statement, we may take the size of each addition circuit to be Θ(n)
rather than O(n^{log 3}). The addition tree has O(n) nodes, and each node is an addition circuit
of size Θ(n), so the total size is O(n^2).
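The pairwise-addition tree can be mirrored in software as a sanity check; the following is a minimal sketch (the function name and interface are ours, not part of the assignment):

```python
def multiply_naive(x: int, y: int, n: int) -> int:
    """Multiply two n-bit numbers by summing shifted partial products pairwise.

    Mirrors the circuit: n partial products, then a log(n)-deep tree of additions.
    """
    # One partial product (up to 2n bits) per bit of y; zero when the bit is 0.
    partials = [(x << i) if (y >> i) & 1 else 0 for i in range(n)]
    # Pairwise addition tree: log(n) levels, analogous to the MERGE step.
    while len(partials) > 1:
        if len(partials) % 2:
            partials.append(0)  # pad so every value has a partner
        partials = [partials[i] + partials[i + 1]
                    for i in range(0, len(partials), 2)]
    return partials[0] if partials else 0
```

Each `while` iteration is one level of the addition tree; there are ⌈log n⌉ levels, matching the depth analysis above.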
(b) Design a better circuit that has depth O(log n). What is the size of this circuit? Explain
how to compute the size.
Solution:
From class, we know that a circuit for addition of two n-bit numbers has constant depth
using the non-unique representation.
The recurrence relations for n-bit addition:
Depth: T(n) = O(1)
Size: S(n) = O(n)
As in part (a), multiplying two n-bit numbers reduces to adding n numbers of 2n bits each:
the shifted, zero-padded partial products.
We have already established that a circuit for addition of two n-bit numbers has constant depth.
Method:
Now we pair up the intermediate results and add each pair, and repeat this process until we get
the final value. This is analogous to the MERGE part of a divide and conquer algorithm, giving
an addition tree of depth log n. Since each addition now has constant depth,
the depth of the entire circuit becomes O(log n).
Size Computation: with O(n) addition nodes, each of size Θ(n), we once again get a circuit size
of O(n^2).
Problem 2
In this problem, we continue working on multiplication circuits. The objective is to reduce the
overall size of the multiplication circuits to something that is o(n2).
Hint: please use Problem 5 of Problem Set 1 as a guide.
(a) Please describe your construction algorithm and describe the depth of the circuit.
Solution:
The numbers can be written as X = 2^m·a + b and Y = 2^m·c + d, with m = n/2.
Multiplication requires:
Z = 2^{2m}·ac + 2^m·(bc + ad) + bd
We see that (bc + ad) can be computed as (a + b)(c + d) − ac − bd. This reduces the
number of multiplication sub-circuits from four to three, as the multiplication can be performed
using circuits that compute (a + b)(c + d), ac, and bd only.
The image below shows the abstracted circuit of multiplication in this way:
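The three-multiplication decomposition can also be sketched in code; this is a hedged illustration of the recursion, with names of our choosing:

```python
def karatsuba(x: int, y: int, n: int) -> int:
    """Karatsuba multiplication, splitting n-bit operands at m = n // 2.

    Uses three recursive multiplications instead of four:
    x*y = 2^(2m)*ac + 2^m*((a+b)(c+d) - ac - bd) + bd.
    """
    if n <= 2:
        return x * y  # constant-size base case
    m = n // 2
    a, b = x >> m, x & ((1 << m) - 1)  # x = 2^m * a + b
    c, d = y >> m, y & ((1 << m) - 1)  # y = 2^m * c + d
    ac = karatsuba(a, c, m)
    bd = karatsuba(b, d, m)
    # (a + b) and (c + d) can be one bit wider than m, hence m + 1.
    cross = karatsuba(a + b, c + d, m + 1) - ac - bd
    return (ac << (2 * m)) + (cross << m) + bd
```

The splitting identity holds for operands of any size, so the result is exact; the three recursive calls are what produce the T(n) = 3T(n/2) + O(n) recurrence in part (b).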
(b) What is the recurrence relation for the size of your circuit?
Solution:
The recurrence relation:
T(n) = 3T(n/2) + O(n)
(c) Please solve this recurrence relation.
Solution:
From the recursion tree:
Work at level 0 ⇒ n
Work at level 1 ⇒ 3^1 × n/2^1
......
Work at level i ⇒ 3^i × n/2^i
Depth of the tree = log2 n
Therefore, total work = Σ n · (3/2)^i ∀ i = 0 . . . log2 n
The terms form an increasing geometric series, hence the last term dominates. This also
means the contribution from the leaves dominates. At the leaf level the contribution is T(1) = Θ(1)
(by each leaf).
The number of leaves is 3^{log2 n} = n^{log2 3}
Therefore, T(n) = Θ(n^{log2 3})
Alternatively,
this recurrence relation falls under case 1 of the Master Theorem,
where T(n) = aT(n/b) + f(n),
with a = 3, b = 2, and f(n) = n = O(n^{log2 3 − ε}) for any ε ≤ log2 3 − 1 ≈ 0.585.
The solution is: T(n) = Θ(n^{log_b a})
= Θ(n^{log2 3})
Problem 3
Imagine an infinitely large array A[0 . . .∞]. In this array,
A[0] = A[1] = . . . = A[n− 1] = 0,
and
A[n] = A[n+ 1] = A[n+ 2] = . . . = 1.
That is, the first n elements of the array are 0 and all subsequent elements are 1. In constant
time we can query the array to find the value of an array element A[i].
We do not know the value of n, the index of the first nonzero element.
A = 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 . . .
(1) Describe an algorithm to find the value of n. This algorithm should make at most O(log n)
queries.
Solution:
We query elements at indices 2^i for i = 0, 1, 2, . . ., until we bump into the part where there
are only 1s; this happens after at most log n + 1 queries, at which point the first 1 satisfies
2^{i−1} < n ≤ 2^i. We then return to the immediately previous query point 2^{i−1} and binary
search the range (2^{i−1}, 2^i] for the first 1, which takes another O(log n) queries.
At most O(log n) queries in total.
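A minimal sketch of the doubling-then-binary-search procedure, assuming a `query(i)` callback (our name) that returns A[i]:

```python
def find_n(query) -> int:
    """Find the index n of the first 1 using O(log n) queries.

    `query(i)` returns A[i] (0 or 1). Gallops to bracket n, then binary searches.
    """
    if query(0) == 1:
        return 0
    # Galloping phase: double the index until we see a 1.
    hi = 1
    while query(hi) == 0:
        hi *= 2
    lo = hi // 2 + 1  # A[hi // 2] == 0, so n > hi // 2
    # Binary search for the first 1 in (hi // 2, hi].
    while lo < hi:
        mid = (lo + hi) // 2
        if query(mid) == 1:
            hi = mid
        else:
            lo = mid + 1
    return lo
```

Both phases make O(log n) queries, so the total stays within the required bound.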
(2) Describe an algorithm for finding any index m, where A[m] = 1 and m < nc for some constant
c > 1. Show that this algorithm requires O(log log n) array queries.
Solution:
We solve this problem in the same way as above, except that we move forward in the array in
steps of 2^{2^i}, where i ranges over 0 . . . log log n.
Proof for the number of queries required:
Stepping by 2^{2^i}, we reach the part where there are all 1s after at most log log n + 1 steps,
querying once at each step. When we first see a 1, at index m = 2^{2^i}, the previous probe
gives 2^{2^{i−1}} < n, so m = (2^{2^{i−1}})^2 < n^2; hence m < n^c with c = 2.
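A sketch of the same idea with doubly-exponential probes (assuming n ≥ 2, so that an index m < n^2 with A[m] = 1 exists; `query` is our name for the array access):

```python
def find_square_bound(query) -> int:
    """Return an index m with A[m] == 1 and m < n*n, in O(log log n) queries.

    Probes indices 2**(2**i) for i = 0, 1, 2, ...  (assumes n >= 2).
    """
    i = 0
    while query(2 ** (2 ** i)) == 0:
        i += 1
    # The previous probe 2**(2**(i-1)) saw a 0, so it is < n, and the
    # current index is its square, hence < n**2.
    return 2 ** (2 ** i)
```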
(3) Describe an algorithm for finding any m, where A[m] = 1. How fast can you make this
algorithm?
Solution:
Once again, we can raise the stepping rate further. The i-th probe can be a tower of
exponentials 2^{2^{2^{···}}} of height i, so the level of the step grows with the value of n:
the larger n is, the higher the tower of the probe that first lands in the 1s. Since a tower of
height log* n already exceeds n, we reach the 1s in O(log* n) probes, plus some additional
queries where we return back and check for the first 1.
Problem 4
Consider the knapsack problem discussed in class: For a set S = {s1, s2, . . . , sn} of integers and a
target integer K, is there a subset T ⊆ S such that Σ_{ti ∈ T} ti = K?
(1) Recall the algorithm from class. Provide the subproblems, then present pseudocode from class.
Solution:
Sub problem:
For every intermediate target value k from 0 . . . K and every prefix {s1, s2, . . . , si} of S, we
check whether some subset of the prefix sums to exactly k. The DP technique lets us compute
each larger subproblem from the solutions to smaller ones:
for all k = 0 . . . K
  for all i = 1 . . . n
    knapsack problem for set {s1, s2, . . . , si} with target k
Pseudocode:
Initialization:
for all k = 1 . . . K
  B[0, k] = FALSE
B[0, 0] = TRUE
Loop Assignment:
for all i = 1 . . . n
  for all k = 0 . . . K
    B[i, k] = B[i − 1, k] OR B[i − 1, k − si]   (the second term only when k ≥ si)
Goal Cell:
Return B[n, K]
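The pseudocode above corresponds to the following sketch (names are ours):

```python
def subset_sum(S, K):
    """Decide whether some subset of S sums to K (the table B from the pseudocode).

    B[i][k] is True iff a subset of the first i elements sums to k.
    """
    n = len(S)
    B = [[False] * (K + 1) for _ in range(n + 1)]
    B[0][0] = True  # the empty subset sums to 0
    for i in range(1, n + 1):
        s = S[i - 1]
        for k in range(K + 1):
            # Either skip s_i, or use it (when it is small enough).
            B[i][k] = B[i - 1][k] or (k >= s and B[i - 1][k - s])
    return B[n][K]
```

The two nested loops over i and k give the O(nK) running time of the class algorithm.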
(2) The algorithm from (1) only answers the question with a yes or no. What part of the algorithm
would we modify so that we can compute the set T? What are the modifications?
Solution:
We add a second table TAB that records whether element si was used to reach value k, and
trace back from the goal cell.
Pseudocode:
Initialization:
for all k = 1 . . . K
  B[0, k] = FALSE
  TAB[0, k] = 0        ← table for encoding decisions
for all i = 0 . . . n
  TAB[i, 0] = 0
B[0, 0] = TRUE
for all i = 1 . . . n
  for all k = 0 . . . K
    if (B[i − 1, k])
      B[i, k] = TRUE
      TAB[i, k] = 0
    else if (k ≥ si and B[i − 1, k − si])
      B[i, k] = TRUE
      TAB[i, k] = 1
The construction of the subset from the TAB matrix:
for all i = n . . . 1
  if (TAB[i, K] == 1)
    Print the i-th element of the set
    K = K − si
Return B[n, K]
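A sketch of the modified algorithm, with the TAB table (here called `used`, our name) and the traceback:

```python
def subset_sum_with_set(S, K):
    """Return a subset of S summing to K, or None if none exists.

    Extends the Boolean table with a second table marking whether s_i was used.
    """
    n = len(S)
    B = [[False] * (K + 1) for _ in range(n + 1)]
    used = [[False] * (K + 1) for _ in range(n + 1)]
    B[0][0] = True
    for i in range(1, n + 1):
        s = S[i - 1]
        for k in range(K + 1):
            if B[i - 1][k]:
                B[i][k] = True          # s_i not needed for this value
            elif k >= s and B[i - 1][k - s]:
                B[i][k] = True
                used[i][k] = True       # s_i is part of the subset
    if not B[n][K]:
        return None
    subset, k = [], K
    for i in range(n, 0, -1):           # trace the decisions backwards
        if used[i][k]:
            subset.append(S[i - 1])
            k -= S[i - 1]
    return subset
```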
Problem 5 Problem from Steve Skiena
Consider the problem of storing n books on shelves in a library. The order of the books is fixed
by the cataloging system and so cannot be rearraged. Therefore, we can speak of a book bi, where
1 ≤ i ≤ n, that has a thickness ti and height hi. The length of each bookshelf at this library is L.
Suppose all the books have the same height h (i.e. h = hi = hj for all i, j) and the shelves are
all separated by a distance of greater than h, so any book fits on any shelf. The greedy algorithm
would fill the first shelf with as many books as we can until we get the smallest i such that bi does
not fit, and then repeat with subsequent shelves. Show that the greedy algorithm always finds the
optimal shelf placement, and analyze the time complexity.
Solution:
Problem : Show that the greedy algorithm always finds the optimal shelf placement
Proof by contradiction.
There are n books to be stored. Book bi has a thickness ti, where 1 ≤ i ≤ n. All the books
have the same height h (i.e. h = hi = hj for all i, j). The length of each bookshelf at the library is
L, and the shelves are all separated by a distance greater than h, so any book fits on any shelf.
Assume, for contradiction, that there is an optimal arrangement different from the greedy one
that uses fewer shelves. Since the heights are constant and the order of the books cannot be
changed, each shelf holds a contiguous run of books, and only the number of shelves matters.
Greedy places the maximal possible run on the first shelf: it fills the shelf until the smallest i
such that bi does not fit, and then repeats with subsequent shelves. So no arrangement can start
its second shelf earlier in the sequence than greedy does, and by induction on the shelves, each
greedy shelf ends no earlier in the sequence than the corresponding shelf of the alternate solution.
Hence the alternate solution cannot use fewer shelves, a contradiction.
Hence we conclude that the greedy method matches any optimal arrangement.
The recurrence relation is as follows:
T(n) = T(n − 1) + c
     = (T(n − 2) + c) + c
     = ((T(n − 3) + c) + c) + c
     ...
     = T(n − k) + k · c
T(n) is constant for n = 1, reached after a depth of n − 1.
Hence, the running time = O(n)
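The greedy method itself is a single left-to-right pass; a minimal sketch (names are ours):

```python
def greedy_shelves(thicknesses, L):
    """Pack books (in fixed order) onto shelves of length L greedily.

    Returns the list of shelves, each a list of book thicknesses.
    O(n) time: each book is examined exactly once.
    """
    shelves, current, used = [], [], 0
    for t in thicknesses:
        if used + t > L:          # b_i does not fit: start a new shelf
            shelves.append(current)
            current, used = [], 0
        current.append(t)
        used += t
    if current:
        shelves.append(current)
    return shelves
```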
Problem 6 Problem from Steve Skiena
This is a generalization of the previous problem. Now consider the case where the height of the
books is not constant, but we have the freedom to adjust the height of each shelf to that of the
tallest book on the shelf. Thus the cost of a particular layout is the sum of the heights of the largest
book on each shelf.
(1) Give an example to show that the greedy algorithm of stuffing each shelf as full as possible does
not always give the minimum overall height.
Solution:
Consider books a, b, c, d with thicknesses 5, 5, 5, 5 and heights 1, 10, 10, 1 respectively. Let the
length of the shelf be 10. In such a case, the greedy algorithm will place the books as:
Shelf 1: a, b → max height = 10
Shelf 2: c, d → max height = 10
Total shelf height = 20.
However, if we place the books as:
Shelf 1: a → max height = 1
Shelf 2: b, c → max height = 10
Shelf 3: d → max height = 1
the total height of all shelves = 12, which is better than the answer given by the greedy
algorithm.
Thus, the greedy algorithm does not always give the optimal solution for minimizing the overall
height of the shelves.
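The counterexample can be checked mechanically; the brute force below tries every feasible set of shelf boundaries (helper names are ours):

```python
from itertools import combinations

def shelf_cost(heights, breaks):
    """Total cost for given shelf boundaries: sum of each shelf's tallest book."""
    cost, start = 0, 0
    for end in list(breaks) + [len(heights)]:
        cost += max(heights[start:end])
        start = end
    return cost

def greedy_cost(widths, heights, L):
    """Cost of the stuff-each-shelf-full greedy layout."""
    cost, used, tallest = 0, 0, 0
    for w, h in zip(widths, heights):
        if used + w > L:
            cost += tallest
            used, tallest = 0, 0
        used += w
        tallest = max(tallest, h)
    return cost + tallest

def best_cost(widths, heights, L):
    """Brute force: try every feasible placement of shelf boundaries."""
    n = len(widths)
    best = float("inf")
    for r in range(n):
        for breaks in combinations(range(1, n), r):
            bounds = [0] + list(breaks) + [n]
            if all(sum(widths[a:b]) <= L for a, b in zip(bounds, bounds[1:])):
                best = min(best, shelf_cost(heights, breaks))
    return best
```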
(2) What technique should we use to solve this problem?
Solution:
Using Technique of dynamic programming, we can solve this problem.
(3) What are the subproblems?
Solution:
A subproblem asks for the minimum shelf height needed to place a contiguous run of books on
shelves. Since the order is fixed, each shelf holds a contiguous run, and such a run can start
anywhere in the list of books. So, a subproblem concerns a list of books bi, bi+1, . . . , bj for all
i < j with i, j ∈ {1, . . . , n}.
(4) How many subproblems are there?
Solution:
An upper limit on the number of subproblems is nL: for each of the n possible starting books,
at most L further books can share its shelf.
(5) Give an algorithm for this problem, and analyze its time complexity.
Solution:
Pseudo code:
Initialize:
Let OPT[][] be the DP memoization matrix, initialized with a very large value.
Loop, solving each subproblem using the previously solved smaller ones:
for all l = 0 . . . L
  for all k = 1 . . . n − 1
    if ((sum of the thicknesses of books k . . . k + l) ≤ L, the bookshelf length)
      OPT[k, k + l] = max{OPT[k, k + l − 1], h_{k+l}}
    else
      the current run does not fit on one shelf: OPT[k, k + l] = very large value
Goal Cell:
OPT[1, n]
Time complexity: O(nL)
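As an alternative packaging of the same idea, here is a sketch of a prefix DP (best[j] = minimum total height to shelve the first j books); this is a standard reformulation under our naming, not the pseudocode above verbatim:

```python
def min_total_height(widths, heights, L):
    """Minimum total shelf height for books in fixed order (prefix DP sketch).

    best[j] = min over i of best[i] + (max height of books i..j-1),
    taken over all runs i..j-1 that fit on one shelf of length L.
    """
    n = len(widths)
    INF = float("inf")
    best = [0] + [INF] * n
    for j in range(1, n + 1):
        width, tallest = 0, 0
        # Extend the last shelf backwards from book j-1 while it still fits.
        for i in range(j - 1, -1, -1):
            width += widths[i]
            if width > L:
                break
            tallest = max(tallest, heights[i])
            best[j] = min(best[j], best[i] + tallest)
    return best[n]
```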
Problem 7
Suppose you are given three strings of characters: X, Y, and Z, where
|X| = n, |Y| = m, and |Z| = n+m.
Z is said to be a shuffle of X and Y iff Z can be formed by interleaving the characters from X and
Y in a way that maintains the left-to-right ordering of the characters from each string.
(a) Show that cchocohilaptes is a shuffle of chocolate and chips, but chocochilatspe is not.
Solution:
cchocohilaptes splits as c + choco + hi + la + p + te + s, where c, hi, p, s in order spell chips
and choco, la, te in order spell chocolate, so it is a valid shuffle.
chocochilatspe is not a shuffle: the letters p and s can only come from chips, in which p precedes
s, but in chocochilatspe the s appears before the p, so the left-to-right order of the characters of
chips cannot be maintained.
(b) Give an efficient dynamic-programming algorithm that determines whether Z is a shuffle of X
and Y.
Solution:
Consider the underlying DAG (Directed Acyclic Graph) of the problem: node (i, j) represents
the state in which the first i characters of X and the first j characters of Y have been matched
against the first i + j characters of Z. Node (i, j) is reachable either from (i − 1, j) (if X[i]
supplied the next character of Z) or from (i, j − 1) (if Y[j] supplied it). If, starting from (0, 0),
we can reach (n, m) without violating any rule, then Z is a shuffle; otherwise it is not.
Pseudo code:
A is the Boolean matrix that stores solutions to previously computed smaller subproblems;
A[i][j] is TRUE iff Z[1 . . . i + j] is a shuffle of X[1 . . . i] and Y[1 . . . j].
A[0][0] = TRUE
for all i = 0 . . . n and j = 0 . . . m (not both zero)
  A[i][j] = FALSE
  if (i > 0 and X[i] == Z[i + j])
    A[i][j] = A[i][j] OR A[i − 1][j]
  if (j > 0 and Y[j] == Z[i + j])
    A[i][j] = A[i][j] OR A[i][j − 1]
if (A[n][m]) Return A SHUFFLE!
else Return NOT A SHUFFLE!
Hint: The values the dynamic programming matrix you construct should be Boolean, not numeric.
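The Boolean DP suggested by the hint can be sketched as follows (0-indexed strings; the function name is ours):

```python
def is_shuffle(X, Y, Z):
    """Return True iff Z is an interleaving of X and Y preserving each order.

    A[i][j] is True iff Z[:i+j] is a shuffle of X[:i] and Y[:j].
    """
    n, m = len(X), len(Y)
    if len(Z) != n + m:
        return False
    A = [[False] * (m + 1) for _ in range(n + 1)]
    A[0][0] = True
    for i in range(n + 1):
        for j in range(m + 1):
            # Next character of Z can come from X ...
            if i > 0 and X[i - 1] == Z[i + j - 1] and A[i - 1][j]:
                A[i][j] = True
            # ... or from Y; either way suffices.
            if j > 0 and Y[j - 1] == Z[i + j - 1] and A[i][j - 1]:
                A[i][j] = True
    return A[n][m]
```

Trying both predecessors is what handles positions where X and Y offer the same character, which a greedy single-path scan would get wrong.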
Problem 8 Fun, challenging problem
There are N soldiers lined up, about to execute a hapless prisoner. The lieutenant, who begins
the firing process, is located at one end of the line. The soldiers all want to fire simultaneously.
Unfortunately, the soldiers can only talk to their immediate neighbors on the left and right. In
addition, these soldiers are not very intelligent and only have constant memory, independent of N.
Notice that constant memory is very little. It is not enough even to count up to N; this requires
log(N) bits. It is not enough to have a name, because unique names also require log(N) bits.
Devise an algorithm that allows these soldiers to fire in unison. The algorithm should have running
time O(n).
Solution:
The soldiers can only talk to their immediate neighbors on the left and right, so we must devise
a scheme in which every soldier receives the signal to shoot at exactly the same moment.
First, we need a way for the leftmost soldier to identify the soldier in the middle of the row.
Here we can use a basic reflection property. The soldier on the left sends two signals toward the
right, one traveling 3x faster than the other. The faster signal reflects off the right end of the
row, and when the slower signal reaches the midpoint, the reflected faster signal arrives there at
the same moment (the slow signal has covered n/2 while the fast one has covered n + n/2 = 3n/2).
The soldier at which the two signals meet therefore knows he is the midpoint, for a row of any
length.
Next, we have two sub-rows of equal size, divided by the midpoint we just found. By treating
each sub-row the same way as the original row, we find the midpoints of the sub-rows, and so on
recursively. When the recursion reaches sub-rows in which each soldier talks only to his immediate
neighbors, every soldier learns he is at a boundary at the same instant, and all soldiers can fire
in unison.
Now the running cost analysis:
The first level of communication requires 2n time.
The next level requires 2 · (n/2) time.
...
The l-th level requires 2 · (n/2^l) time.
Hence the total work, Σ 2n/2^l ≤ 4n, still remains linear, i.e., O(n).
Note that they move and interact at exactly the same speed; i.e., they are synchronized. Hint: Use
divide and conquer. You should assume soldiers all operate at the same speed.
Problem 9
In class we saw how to analyze divide and conquer matrix multiplication of two N ×N matrices.
Now we’ll show how to lay out the matrix to optimize for memory transfers (both using blocking
and divide and conquer). We’ll use the same DAM model that we described in class and in the last
problem set.
(a) Suppose that we divide the matrix into blocks of size B (i.e., √B by √B). Prove that we
can achieve O(N^3/B^{3/2}) memory transfers for a multiplication.
Solution:
The recurrence relation for this method is:
T(n) = 8T(n/2) + O(n^2)
Solving the above recurrence relation, we get
T(n) = O(n^3)
Now, a block of size B can hold √B × √B elements, and a block can be considered a square
sub-matrix.
For a matrix of size N × N, there are N^2/B such blocks, since B is the unit of transfer when
counting transfers. This also means the N × N matrix can now be considered an (N/√B) × (N/√B)
matrix of blocks. Therefore, as per our solution to the recurrence relation, the required number of
block multiplications is (N/√B)^3.
In Big-Oh notation this is O(N^3/B^{3/2}).
(b) Suppose that we divide the matrix into blocks of size M (i.e., √M by √M). Prove that we
can achieve O(N^3/(B√M)) memory transfers for a multiplication.
Solution:
Here, a block of size M can hold √M × √M elements, and the total number of block
multiplications is (N/√M)^3.
Each group of √M × √M elements occupies M/B transfer blocks, so the number of memory
transfers is O((N/√M)^3 · M/B).
Upon simplification this becomes O(N^3/(B√M)).
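The blocked layout of part (b) corresponds to a tiled multiplication loop; a minimal sketch (tile size and names are ours):

```python
def blocked_matmul(A, B, block):
    """Blocked (tiled) N x N matrix multiply, a sketch of the layout above.

    Each (block x block) tile of A, B, and C is touched as a unit, mirroring
    how a tile of size M = block*block is loaded with M/B memory transfers.
    """
    N = len(A)
    C = [[0] * N for _ in range(N)]
    for i0 in range(0, N, block):
        for j0 in range(0, N, block):
            for k0 in range(0, N, block):
                # Multiply one pair of tiles into the corresponding C tile.
                for i in range(i0, min(i0 + block, N)):
                    for j in range(j0, min(j0 + block, N)):
                        s = 0
                        for k in range(k0, min(k0 + block, N)):
                            s += A[i][k] * B[k][j]
                        C[i][j] += s
    return C
```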
Problem 10 (challenging problem, but good practice for understanding divide and conquer well)
Now let’s explore divide and conquer solutions for matrix multiplication (as we discussed in
class). We want to multiply A and B (each of size N ×N) and the product will be stored in C:
A = [ A1 A2 ]    B = [ B1 B2 ]    C = [ C1 C2 ]
    [ A3 A4 ]        [ B3 B4 ]        [ C3 C4 ]
The divide and conquer solution from class is:
[ A1 A2 ] [ B1 B2 ]   [ A1B1 + A2B3   A1B2 + A2B4 ]
[ A3 A4 ] [ B3 B4 ] = [ A3B1 + A4B3   A3B2 + A4B4 ]
To ensure good memory locality, we’ll store the matrix in the same divide and conquer order, e.g.,
A1, A2, A3, A4. Thus the matrix is stored in a zig-zag manner in memory.
(a) What is the recursion relation for the cost of a multiplication?
Solution:
The recurrence relation for the cost of multiplication:
T(n) = 8T(n/2) + O(n^2)
(b) Solve the recursion relation and prove that the divide-and-conquer multiplication achieves
O(N3/B√M) memory transfers.
Solution:
Solving the recurrence relation, we get
T(n) = O(n^3)
Proof:
Now, a block of size B can hold √B × √B elements, and a block can be considered a square
sub-matrix.
For a matrix of size N × N, there are N^2/B such blocks, since B is the unit of transfer when
counting transfers. This also means the N × N matrix can now be considered an (N/√B) × (N/√B)
matrix of blocks. Therefore, as per our solution to the recurrence relation, the required number of
block multiplications is (N/√B)^3, i.e., O(N^3/B^{3/2}).
Similarly, a block of size M can hold √M × √M elements, and the total number of block
multiplications is (N/√M)^3. Each group of √M × √M elements occupies M/B transfer blocks,
so the number of memory transfers is O((N/√M)^3 · M/B).
Upon simplification this becomes O(N^3/(B√M)).
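The quadrant recursion can be sketched directly (assuming N is a power of two; helper names are ours):

```python
def dc_matmul(A, B):
    """Divide-and-conquer multiply of two N x N matrices (N a power of two).

    Recurses on quadrants: C1 = A1B1 + A2B3, C2 = A1B2 + A2B4, etc.
    """
    N = len(A)
    if N == 1:
        return [[A[0][0] * B[0][0]]]
    h = N // 2
    def quad(M, r, c):  # extract an h x h quadrant starting at (r, c)
        return [row[c:c + h] for row in M[r:r + h]]
    def add(X, Y):      # elementwise sum of two h x h matrices
        return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]
    A1, A2, A3, A4 = quad(A, 0, 0), quad(A, 0, h), quad(A, h, 0), quad(A, h, h)
    B1, B2, B3, B4 = quad(B, 0, 0), quad(B, 0, h), quad(B, h, 0), quad(B, h, h)
    C1 = add(dc_matmul(A1, B1), dc_matmul(A2, B3))
    C2 = add(dc_matmul(A1, B2), dc_matmul(A2, B4))
    C3 = add(dc_matmul(A3, B1), dc_matmul(A4, B3))
    C4 = add(dc_matmul(A3, B2), dc_matmul(A4, B4))
    # Stitch the four quadrants back together.
    return [C1[i] + C2[i] for i in range(h)] + [C3[i] + C4[i] for i in range(h)]
```

The eight recursive calls and the O(n^2) quadrant additions are exactly the T(n) = 8T(n/2) + O(n^2) recurrence above; storing each quadrant contiguously (the zig-zag layout) is what makes the base-case blocks fit in single transfers.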