
Algorithms and Data Structures (IBC027)

January 19, 2017

You are allowed to answer in Dutch. Whenever an algorithm is required, it can be given in pseudocode or plain English (or Dutch), and its running time and correctness must always be justified (even informally, but in a clear way!). The grade equals the sum of the scores for the six problems below plus 1 (plus possibly a bonus score for the homework assignments). Success!

1 Quiz (1.5 points)

Specify, for each of the following statements, whether it is true or false, together with an explanation:

1. n100 = Ω(n).

2. If the DFS finishing time f[u] > f[v] for two vertices u and v in a directed graph G, and u and v are in the same DFS tree in the DFS forest, then u is an ancestor of v in the DFS tree.

3. Dijkstra’s algorithm works on any graph without negative weight cycles.

4. The sequence 〈20, 15, 18, 7, 9, 5, 12, 3, 6, 2〉 is a max-heap.

5. Given a hash table with more slots than keys, and collision resolution by chaining, the worst-case running time of a lookup is constant time.
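As a rough aid for statement 4, the max-heap property of an array can be checked mechanically. The sketch below assumes the usual level-order array layout with 0-based indices, so the parent of index i is (i - 1) // 2; the function name `is_max_heap` is illustrative and not part of the exam.

```python
def is_max_heap(a):
    """Return True if list a, read in level order, satisfies the max-heap property."""
    for i in range(1, len(a)):
        parent = (i - 1) // 2  # 0-based parent index (assumed array layout)
        if a[parent] < a[i]:
            return False  # a child exceeds its parent
    return True
```

The check touches every element once, so it runs in time linear in the length of the sequence.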

2 Graph algorithms (1.5 points)

Some of your friends study biology and do a research project in the Millingerwaard. They return with specimens of butterflies, but it is very difficult for them to tell how many distinct species they’ve caught, thanks to the fact that many species look very similar to one another.

They have collected n butterflies and believe that each belongs to one of two different species, which we’ll call A and B for purposes of this discussion. They’d like to divide the n specimens into two groups (those that belong to A and those that belong to B), but it’s very hard for them to directly label any one specimen. So they decide to adopt the following approach.

For each pair of specimens i and j, they study them carefully side-by-side; and if they’re confident enough in their judgment, then they label the pair (i, j) either “same” (meaning they believe them both to come from the same species) or “different” (meaning they believe them to come from opposite species). They also have the option of rendering no judgment on a given pair, in which case we’ll call the pair ambiguous.

So now they have the collection of n specimens, as well as a collection of m judgments (either “same” or “different”) for the pairs that were not declared to be ambiguous. They’d like to know if this data is consistent with the idea that each butterfly is from one of species A or B; so more concretely, we’ll declare the m judgments to be consistent if it is possible to label each specimen either A or B in such a way that for each pair (i, j) labeled “same,” it is the case that i and j have the same label; and for each pair (i, j) labeled “different,” it is the case that i and j have opposite labels. They’re in the middle of tediously working out whether their judgments are consistent, when one of them realizes that you probably have an algorithm that would answer this question right away.

Give an algorithm with running time O(m + n) that determines whether the m judgments are consistent.
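One possible approach, sketched here under assumptions rather than given as the intended answer, is to treat specimens as vertices and judgments as edges, then propagate labels with breadth-first search and report a contradiction when an already labeled vertex disagrees. The input encoding (specimens numbered 0 to n - 1, judgments as `(i, j, same)` triples) and the name `judgments_consistent` are illustrative choices.

```python
from collections import deque

def judgments_consistent(n, judgments):
    """judgments: iterable of (i, j, same) with 0 <= i, j < n and same a bool."""
    adj = [[] for _ in range(n)]
    for i, j, same in judgments:
        flip = 0 if same else 1          # 1 means the two labels must differ
        adj[i].append((j, flip))
        adj[j].append((i, flip))
    label = [None] * n                   # 0 stands for species A, 1 for species B
    for start in range(n):
        if label[start] is not None:
            continue                     # already labeled via an earlier component
        label[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v, flip in adj[u]:
                expected = label[u] ^ flip
                if label[v] is None:
                    label[v] = expected
                    queue.append(v)
                elif label[v] != expected:
                    return False         # two judgments contradict each other
    return True
```

Each vertex is enqueued at most once and each judgment is inspected twice, which gives the O(m + n) bound under this adjacency-list representation.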

3 Data structures (1.5 points)

As discussed in class, one way to represent a finite set is as a directed tree. Nodes of the tree are elements of the set, arranged in no particular order, and each node has parent pointers that eventually lead up to the root of the tree. This root is a convenient representative of the set, and distinguished from the other elements by the fact that its parent pointer is a self-loop. The union-find data structure uses this representation of trees, and supports three operations: Makeset(x), which builds a set with a single element x, Find(x), which finds the representative of the set containing x, and Union(x, y), which builds the union of the sets containing x and y.

We consider two implementations of the union-find data structure. In the union-by-weight implementation, whenever we Union two sets, the representative of the smaller set becomes a child of the representative of the larger set (breaking ties arbitrarily):

    Makeset(x):
        parent(x) = x
        weight(x) = 1

    Find(x):
        while x != parent(x)
            x = parent(x)
        return x

    Union(x,y):
        rx = Find(x)
        ry = Find(y)
        if weight(rx) > weight(ry)
            parent(ry) = rx
            weight(rx) = weight(rx) + weight(ry)
        else
            parent(rx) = ry
            weight(ry) = weight(rx) + weight(ry)

In the union-by-rank implementation, we make the root of the tree with smaller height point to the root of the tree with larger height (again breaking ties arbitrarily):

    Makeset(x):
        parent(x) = x
        rank(x) = 0

    Find(x):
        while x != parent(x)
            x = parent(x)
        return x

    Union(x,y):
        rx = Find(x)
        ry = Find(y)
        if rank(rx) > rank(ry)
            parent(ry) = rx
        else
            parent(rx) = ry
            if rank(rx) = rank(ry)
                rank(ry) = rank(ry) + 1
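For experimenting with the two variants, the pseudocode can be transcribed into Python roughly as follows. This is a sketch, not part of the exam: the class name `UnionFind`, the `by` switch, and the shared `score` dictionary (holding either weights or ranks) are illustrative, and the early return when both elements already share a root is an added assumption that the pseudocode does not contain.

```python
class UnionFind:
    """Parent-pointer forest; by="weight" or by="rank" selects the linking rule."""

    def __init__(self, by="weight"):
        self.by = by
        self.parent = {}
        self.score = {}  # weight (set size) or rank, depending on self.by

    def makeset(self, x):
        self.parent[x] = x                       # a root points to itself
        self.score[x] = 1 if self.by == "weight" else 0

    def find(self, x):
        while x != self.parent[x]:               # follow parent pointers to the root
            x = self.parent[x]
        return x

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:                             # added guard, not in the pseudocode
            return
        if self.score[rx] > self.score[ry]:
            self.parent[ry] = rx
            if self.by == "weight":
                self.score[rx] += self.score[ry]
        else:                                    # ties link rx under ry, as in the pseudocode
            self.parent[rx] = ry
            if self.by == "weight":
                self.score[ry] += self.score[rx]
            elif self.score[rx] == self.score[ry]:
                self.score[ry] += 1
```

A typical use is to call `makeset` for each element, apply a sequence of `union` calls, and then inspect `parent` to see the resulting forest.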

1. Give the tree (or array) that results from the following sequence of operations, starting from singleton sets 0, ..., 12

Union(1,2) Union(3,4) Union(3,5) Union(1,7) Union(3,12)

Union(0,9) Union(8,10) Union(8,9) Union(7,4) Union(2,9)

using first the union-by-weight and then the union-by-rank implementation.

2. Consider the union-by-rank implementation. Let n be a power of 2. Give a sequence of m Makeset, Find and Union operations, including n Makeset operations, that take Ω(m log n) time.

3. Consider the union-by-weight implementation. Let x be contained in a set with weight n that is constructed by a series of Makeset and Union operations. Prove that the worst-case running time of Find(x) is O(log n).

4 Greedy algorithms (1.5 points)

Consider a container with capacity $W$ together with a collection of $n$ different materials. The value per unit of the $i$-th material is $v_i$, and the total available quantity of the $i$-th material is $w_i$. The problem is to choose an amount $x_i \le w_i$ of each material such that it fits in the container,

$$\sum_i x_i \le W,$$

while maximizing the total value

$$\sum_i x_i v_i.$$

(We assume all quantities are nonnegative.) Give an O(n log n) algorithm that solves this problem.
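One natural greedy strategy, sketched here as an illustration and without the correctness argument the exam asks for, is to sort materials by value per unit and fill the container with the most valuable material first. The function name `fill_container` and the `(value_per_unit, quantity)` input format are assumptions made for this example.

```python
def fill_container(capacity, materials):
    """materials: list of (value_per_unit, quantity). Returns (total_value, amounts)."""
    # Indices sorted by decreasing value per unit: O(n log n).
    order = sorted(range(len(materials)), key=lambda i: materials[i][0], reverse=True)
    amounts = [0] * len(materials)
    remaining = capacity
    total = 0
    for i in order:
        if remaining <= 0:
            break                        # container is full
        value, quantity = materials[i]
        take = min(quantity, remaining)  # take as much as is available and still fits
        amounts[i] = take
        total += take * value
        remaining -= take
    return total, amounts
```

The sort dominates the running time; the fill loop itself is linear in n.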

5 Randomized algorithms (1.5 points)

Suppose we have a system with n processes. Certain pairs of processes are in conflict, meaning that they both require access to a shared resource. In a given time interval, the goal is to schedule a large subset S of the processes to run (the rest will remain idle) so that no two conflicting processes are both in the scheduled set S. We’ll call such a set S conflict-free.

One can picture this process in terms of a graph G = (V, E) with a node representing each process and an edge joining pairs of processes that are in conflict. Finding a maximum-size conflict-free set S for an arbitrary conflict graph G is difficult (this is an NP-hard optimization problem, a concept that will be discussed in the course Complexiteit). Nevertheless, we can still look for heuristics that find a reasonably large conflict-free set. Moreover, we’d like a simple method for achieving this without centralized control: each process should communicate with only a small number of other processes, and then decide whether or not it should belong to the set S.

We will suppose for purposes of this question that each node has exactly d neighbors in the graph G. (That is, each process is in conflict with exactly d other processes.)

1. Consider the following simple protocol.

Each process $P_i$ independently picks a random value $x_i$; it sets $x_i$ to 1 with probability $1/2$ and sets $x_i$ to 0 with probability $1/2$. It then decides to enter the set S if and only if it chooses the value 1, and each of the processes with which it is in conflict chooses the value 0.

Prove that the set S resulting from the execution of this protocol is conflict-free. Also, give a formula for the expected size of S in terms of n (the number of processes) and d (the number of conflicts per process).

2. The choice of the probability $1/2$ in the protocol above was fairly arbitrary, and it’s not clear that it should give the best system performance. A more general specification of the protocol would replace the probability $1/2$ by a parameter p between 0 and 1, as follows:

Each process $P_i$ independently picks a random value $x_i$; it sets $x_i$ to 1 with probability $p$ and sets $x_i$ to 0 with probability $1 - p$. It then decides to enter the set S if and only if it chooses the value 1, and each of the processes with which it is in conflict chooses the value 0.

In terms of the parameters of the graph G, give a value of p so that the expected size of the resulting set S is as large as possible. Give a formula for the expected size of S when p is set to this optimal value.
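The protocol is simple enough to simulate, which can help sanity-check a derived formula for the expected size of S against an empirical average. The sketch below is an illustration under assumptions: the helper names `run_protocol`, `cycle_graph` and `average_size` are invented here, and the ring graph (where d = 2) is just one convenient d-regular test case.

```python
import random

def run_protocol(adj, p, rng):
    """One round of the protocol; adj maps each process to its list of conflicts."""
    x = {v: 1 if rng.random() < p else 0 for v in adj}
    # A process enters S iff it chose 1 and every conflicting process chose 0.
    return {v for v in adj if x[v] == 1 and all(x[u] == 0 for u in adj[v])}

def cycle_graph(n):
    """Hypothetical test graph: n processes arranged in a ring, so d = 2."""
    return {v: [(v - 1) % n, (v + 1) % n] for v in range(n)}

def average_size(adj, p, rounds, seed=0):
    """Empirical mean of |S| over the given number of independent rounds."""
    rng = random.Random(seed)
    return sum(len(run_protocol(adj, p, rng)) for _ in range(rounds)) / rounds
```

Calling `average_size(cycle_graph(1000), p, 200)` for a few values of p gives a rough picture of how the expected size depends on p.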

6 Dynamic programming (1.5 points)

You are given an n-by-n grid, where each square (i, j) contains c(i, j) gold coins. Assume that c(i, j) ≥ 0 for all squares. You must start in the upper-left corner and end in the lower-right corner, and at each step you can only travel one square down or right. When you visit any square, including your starting or ending square, you may collect all of the coins on that square. Give an algorithm to find the maximum number of coins you can collect if you follow the optimal path.
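A table-filling sketch of one possible dynamic program is shown below; it assumes the grid is given as a list of n lists of n nonnegative integers with c[0][0] the upper-left square, and the name `max_coins` is illustrative. Each entry `best[i][j]` holds the most coins collectable on any path from the start to square (i, j).

```python
def max_coins(c):
    """c: n-by-n list of lists of nonnegative coin counts. Returns the best total."""
    n = len(c)
    best = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == 0 and j == 0:
                prev = 0                          # starting square has no predecessor
            elif i == 0:
                prev = best[i][j - 1]             # top row: only reachable from the left
            elif j == 0:
                prev = best[i - 1][j]             # left column: only reachable from above
            else:
                prev = max(best[i - 1][j], best[i][j - 1])
            best[i][j] = c[i][j] + prev
    return best[n - 1][n - 1]
```

Every square is processed once with constant work, so this sketch runs in O(n^2) time and space.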