probability




  • CS648 : Randomized Algorithms

    Semester II, 2014-15, CSE, IIT Kanpur

    Assignment - 5 (due on 27th March 6:00 PM)

    Note: Be very rigorous in providing any mathematical detail in support of your arguments. Also mention any Lemma/Theorem you use.

    1. Non-Dominated points

    Let $P = \{(x_i, y_i) \mid 1 \le i \le n\}$ be a set of $n$ points in the plane. Assume all coordinates are different. A point $(x_i, y_i)$ is said to dominate another point $(x_j, y_j)$ if $x_i > x_j$ and $y_i > y_j$. A point in $P$ is said to be a non-dominated point if there is no point in $P$ which dominates it. See Figure 1 for a better understanding of non-dominated points.

    Figure 1: The non-dominated points appear in a staircase pattern
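To make the definition concrete, here is a minimal brute-force sketch in Python. It is not the randomized incremental algorithm the problem asks for; it is only an O(n^2) reference that could be used to check answers on small inputs. The names `dominates` and `non_dominated_brute_force` and the sample points are illustrative assumptions, not part of the assignment.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def dominates(p: Point, q: Point) -> bool:
    """Return True if p dominates q, i.e. p is strictly larger in both coordinates."""
    return p[0] > q[0] and p[1] > q[1]


def non_dominated_brute_force(points: List[Point]) -> List[Point]:
    """O(n^2) reference: keep every point that no point of the set dominates."""
    return [q for q in points if not any(dominates(p, q) for p in points)]


# Hypothetical sample input with pairwise different coordinates.
pts = [(1, 5), (2, 3), (4, 4), (3, 1), (5, 2)]
staircase = sorted(non_dominated_brute_force(pts))
print(staircase)  # [(1, 5), (4, 4), (5, 2)]
```

Sorted by x, the surviving points have decreasing y, which is the staircase shape the figure caption describes.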

    You have to design an expected $O(n \log n)$ time randomized algorithm to compute all non-dominated points in $P$. The algorithm has to be based on the Randomized Incremental Construction technique. Provide complete details of the conflict graph data structure used for this problem. Provide the details of each incremental step and analyze its time complexity using backward analysis only.

    Note: This problem has many efficient deterministic algorithms. However, the aim of this problem in the assignment is to test your skills of randomized incremental construction and the conflict graph data structure.

    2. A probability gem as an application of backward analysis

    If we select $k$ points independently at random from the interval $[0, 1]$, we get a partition of the interval into $k + 1$ sub-intervals. In class we showed that each sub-interval has an identical probability distribution and that its expected size is $1/(k + 1)$. Now we shall try to calculate the expected value of the smallest of the $k + 1$ intervals. This is a problem in continuous probability, and since most of you are not so comfortable with continuous probability, we shall solve its discrete version:

    Given a set of $n$ integers $1, 2, \ldots, n$, we select $k$ integers ($k$ is much smaller than $n$) from this set one after another, independently and without replacement. In this way, the set gets partitioned into $k + 1$ intervals defined by the sampled points. For example, if $n = 10$ and we select two numbers, 4 and 8, then there are three intervals $[1, 4]$, $[4, 8]$, $[8, 10]$. The smallest interval would be $[8, 10]$ and its length is 2.
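The discrete setup can be explored empirically with a short Python sketch. It assumes, following the example above, that 1 and n act as fixed boundaries and that an interval's length is the difference of its endpoints. How to treat a sample equal to 1 or n is not specified in the text, so the sketch simply lets such a sample add no new boundary. The function names and the Monte Carlo estimator are illustrative assumptions, not part of the assignment.

```python
import random
from typing import Iterable, List


def interval_lengths(n: int, sample: Iterable[int]) -> List[int]:
    """Lengths of the intervals cut out of [1, n] by the sampled integers.

    Assumption: 1 and n are fixed boundaries; a sample equal to 1 or n
    coincides with a boundary and creates no extra interval.
    """
    cuts = sorted({1, n, *sample})
    return [b - a for a, b in zip(cuts, cuts[1:])]


def smallest_interval(n: int, sample: Iterable[int]) -> int:
    """Length of the smallest interval (requires n >= 2)."""
    return min(interval_lengths(n, sample))


def estimate_expected_smallest(n: int, k: int, trials: int = 2000, seed: int = 0) -> float:
    """Monte Carlo estimate of the expected smallest interval length."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        # random.sample draws k distinct integers, i.e. without replacement.
        total += smallest_interval(n, rng.sample(range(1, n + 1), k))
    return total / trials


print(interval_lengths(10, [4, 8]))   # [3, 4, 2]
print(smallest_interval(10, [4, 8]))  # 2
```

Calling `estimate_expected_smallest(n, k)` for a few values of k gives a rough numerical check on any closed-form bound derived in the exercise.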


  • Let the random variable $X_i$ denote the length of the smallest interval when $i$ integers are selected in the way described above. Observe that $X_{i+1}$ is either equal to $X_i$ or smaller than $X_i$. We shall show that the expected length of the smallest interval is $\Theta(n/k^2)$ by calculating, in two different ways, the probability of the event $E_{i+1}$: $X_{i+1} < X_i$. Here is a sketch.

    (a) Using backward analysis, compute the probability of event $E_{i+1}$.

    (b) Conditioned on $X_i = r$, show that the probability of event $E_{i+1}$ is at least $(i+1)(r-1)/n$ and at most $2(i+1)(r-1)/n$. In other words,

    $$\frac{(i+1)(r-1)}{n} \le P[X_{i+1} < X_i \mid X_i = r] \le \frac{2(i+1)(r-1)}{n} \qquad (1)$$

    (c) Use (a) and (b) carefully to conclude that

    $$E[X_i] = \Theta\left(\frac{n}{i^2}\right)$$

    There is a hint given at the end of the assignment sheet. Try not to use this hint. At least first make a sincere attempt to solve without using the hint.

    3. Estimating all-pairs distances

    Note: This is a variant of the 5th problem of the mid-semester exam. Hardly 2 or 3 students could solve the problem. I discussed the solution in the class. This problem is an opportunity to see if you really internalized the solution.

    Consider an undirected, unweighted graph $G$ on $n$ vertices. For simplicity, assume that $G$ is connected. We are also given a partial distance matrix $M_c$ for some $c < 1$: for a pair of vertices $i, j$, the entry $M_c[i, j]$ stores the exact distance if $i$ and $j$ are separated by distance at most $cn$; otherwise it stores the symbol # to indicate that the distance between vertex $i$ and vertex $j$ is greater than $cn$. Unfortunately, there are $\Theta(n^2)$ # entries in $M_c$, i.e., for $\Theta(n^2)$ pairs of vertices the distance is not known. Design a Monte Carlo algorithm to compute the exact distance matrix for $G$ in $O(n^2 \log n)$ time. (Each entry of the distance matrix has to be correct with probability exceeding $1 - 1/n^2$.)
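To make the input format concrete, the following Python sketch builds a partial distance matrix for a small test graph by running a depth-limited BFS from every vertex. It is not the Monte Carlo algorithm the problem asks for and makes no attempt to meet the required running time. The function name `partial_distance_matrix`, the adjacency-list representation, and the 0-based vertex numbering are assumptions made for illustration.

```python
from collections import deque
from typing import List, Union

Entry = Union[int, str]


def partial_distance_matrix(adj: List[List[int]], c: float) -> List[List[Entry]]:
    """Build M_c: exact distances up to c*n, and '#' for all farther pairs.

    Assumption: adj is a 0-indexed adjacency list of an undirected,
    unweighted, connected graph.
    """
    n = len(adj)
    limit = c * n
    matrix: List[List[Entry]] = [["#"] * n for _ in range(n)]
    for source in range(n):
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            if dist[u] + 1 > limit:
                continue  # neighbours of u would lie beyond distance c*n
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        for v, d in dist.items():
            matrix[source][v] = d
    return matrix


# Hypothetical example: the path 0 - 1 - 2 - 3 with c = 0.5, so c*n = 2.
path = [[1], [0, 2], [1, 3], [2]]
for row in partial_distance_matrix(path, 0.5):
    print(row)
```

On this path only the pair (0, 3), at distance 3, is reported as #.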

    Hint for 2nd problem: Using Inequality (1), try to get an upper and a lower bound on the unconditional probability of event $E_{i+1}$ in terms of the expected value of $X_i$.
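One way to read the hint is as an application of the law of total probability. The following is a sketch of that single step only, taking Inequality (1) as given; the shorthand $p_r = P[X_i = r]$ is notation introduced here and does not appear in the assignment.

```latex
\begin{align*}
P[E_{i+1}] &= \sum_{r} p_r \cdot P[X_{i+1} < X_i \mid X_i = r],
  \qquad p_r = P[X_i = r], \\
\frac{i+1}{n} \sum_{r} p_r\,(r-1) &\le P[E_{i+1}] \le \frac{2(i+1)}{n} \sum_{r} p_r\,(r-1), \\
\frac{(i+1)\,(E[X_i]-1)}{n} &\le P[E_{i+1}] \le \frac{2(i+1)\,(E[X_i]-1)}{n}.
\end{align*}
```

The last line uses $\sum_r p_r (r-1) = E[X_i] - 1$. Combining it with the answer to part (a) is left as in the original exercise.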
