Randomized Algorithms and Probabilistic Analysis of Algorithms


Slide 1/65: Randomized Algorithms and Probabilistic Analysis of Algorithms

    5MD20 Design Automation

    Phillip Stanley-Marbell

Slide 2/65: Lecture Outline

    Motivation

    Probability Theory Refresher

    Example Randomized Algorithm and Analysis

    Tail Distribution Bounds

    Example Application of Tail Bounds

    Chernoff Bounds

    The Probabilistic Method

    Hashing

    Summary of Key Ideas

Slide 3/65: What Are Randomized Algorithms and Analyses?

    Randomized algorithms

    Algorithms that make random decisions during their execution. Example: Quicksort with a random pivot

    Probabilistic analysis of algorithms: using probability theory to analyze the behavior of (randomized or deterministic) algorithms

    Example: determining the probability of a collision of a hash function

    Probability and Computation

    [Diagram: Probability and Computation divides into Randomized Algorithms and Probabilistic Analysis of Algorithms; Randomized Algorithms divide into Monte Carlo algorithms (may fail or return an incorrect answer) and Las Vegas algorithms (always return the right answer)]

Slide 4/65: Why Randomized Algorithms and Analyses?

    Why randomized algorithms? Many NP-hard problems may be easy to solve for typical inputs

    One approach is to use heuristics to deal with pathological inputs

    Another approach is to use randomization (of inputs, or of the algorithm) to reduce the chance of worst-case behavior


Slide 5/65: Why Randomized Algorithms and Analyses?

    Why probabilistic analysis of algorithms? Naturally, if an algorithm makes random decisions, its performance is not deterministic

    Also, deterministic algorithm behavior may vary with inputs

    Probabilistic analysis also lets us estimate bounds on behavior; we'll talk about such bounds today


Slide 6/65: Theoretical Foundations

    Probability theory (things you covered in 2S610, 2nd year):

    Probability spaces

    Events

    Random variables

    Characteristics of random variables

    Combinatorics & number theory (some things you might have seen in 2D...)

    Many relations come in handy in simplifying analysis

    Algorithm analysis

    We will review relevant material in the next half hour

Slide 7/65: Lecture Outline

    Motivation

    Probability Theory Refresher

    Example Randomized Algorithm and Analysis

    Tail Distribution Bounds

    Example Application of Tail Bounds

    Chernoff Bounds

    The Probabilistic Method

    Hashing

    Summary of Key Ideas

Slide 8/65: Probability Theory Refresher

    Probability space, (Ω, F, Pr), defines

    The possible occurrences (simple events), sets of occurrences (subsets of Ω), and the likelihood of occurrences

    Sample space, Ω: composed of all the basic events we are concerned with

    Example: for a coin toss, Ω = {H, T}

    Sigma algebra, F:

    Possible occurrences we can build out of Ω. Example: for a coin toss, F = {∅, Ω, {H}, {T}}. Events are members of F

    Probability measure, Pr: a mapping from F to [0, 1]

    Assigns a probability (a real number p ∈ [0, 1]) to events

    One example of a probability measure is a probability mass function

Slide 9/65: Notation

    Event sets: we will start today by representing events with sets, using letters early in the alphabet, e.g., A, B, ...

    Events may be unitary elements or subsets of Ω

    Probability: the probability of event A will be written as Pr{A}

    [Figure: sample space Ω containing simple events e1, e2, ..., e8]

Slide 10/65: Independence, Disjoint Events, and Union

    Two events, A and B, are said to be independent iff

    Occurrence of A does not influence the outcome of B: Pr{A ∩ B} = Pr{A}·Pr{B}

    Note that this is different from events being mutually exclusive. If two events A and B are mutually exclusive, then Pr{A ∩ B} = 0

    For any two events E1 and E2:

    Pr{E1 ∪ E2} = Pr{E1} + Pr{E2} - Pr{E1 ∩ E2}

    Union bound (often comes in handy in probabilistic analysis)

    \Pr\left\{\bigcup_{i \ge 1} E_i\right\} \le \sum_{i \ge 1} \Pr\{E_i\}

Slide 11/65: Conditional Probability

    Probability of event B occurring, given that A has occurred: Pr{B | A}

    Pr{B | A} = Pr{B ∩ A} / Pr{A}

    If events A and B are independent, Pr{B ∩ A} = Pr{B}·Pr{A}, so

    Pr{B | A} = Pr{B ∩ A} / Pr{A} = Pr{B}·Pr{A} / Pr{A} = Pr{B}

Slide 12/65: Events and Random Variables

    So far, we have talked about probability and independence of events

    Rather than work with sets, we can map events to real values

    Random variables: a random variable is a function on the elements of the sample space, Ω, used to identify elements of Ω with real values.

    Definition: A random variable, X, on a sample space Ω is a real-valued function on Ω; i.e., X: Ω → ℝ.

    We will only deal with discrete random variables, which take on a finite or countably infinite number of values

    Random variables define events: the occurrence of a random variable taking on a specific value defines an event

    Example: coin toss. Let X be a random variable defining the number of heads resulting from a coin toss

    Sample space Ω = {H, T}; sigma algebra of subsets of Ω, F = {∅, Ω, {H}, {T}}; X: Ω → {0, 1}

    Events: {X = 0}, {X = 1}

    In general, an event defined on a random variable X is of the form {s ∈ Ω | X(s) = x}

Slide 13/65: Notation

    We will represent random variables with uppercase letters late in the alphabet

    Example: X, Y, Z. We will use the abbreviation rvar for random variable

    Events: events correspond to a random variable, say, X (uppercase), taking on a specific value, say, x (lowercase)

    The probability of rvar X taking on the specific value x is written as Pr{X = x} or fX(x)

    Example: coin toss. Let X be an rvar representing the number of heads; Pr{X = 0} = fX(0) = 1/2 (for a fair coin)

Slide 14/65: Random Variables Intuition

    So far, we've presented a lot of notation; can we gain more intuition?

    Imagine a phenomenon that can be represented with real values. Example: the result of rolling a die

    Let X and Y be functions mapping the result of rolling the die to a number

    e.g., X = die result: {1, 2, 3, 4, 5, 6}, or Y = 2·(die result) + 1: {3, 5, 7, 9, 11, 13}

    X and Y are two different functions (random variables) defined on the same set of events

    Each time X takes on a specific value is an event. For the above die-rolling example, with rvars X and Y: Pr{X = 1} = Pr{Y = 3}, Pr{X = 4} = Pr{Y = 9}, and so on

Slide 15/65: Characteristics of Random Variables

    Random variables and events:

    1. We first talked about random phenomena (events) in terms of sets

    2. We then introduced rvars, to let us represent events with real numbers

    3. When representing events with rvars, we can then look at some measures or characteristics of event phenomena

    Link to randomized algorithms and analyses; we will reason about:

    Randomized algorithms in terms of rvars characterizing actions of the algorithm

    Probabilistic analysis of algorithms in terms of rvars characterizing properties of the algorithm's behavior given inputs


Slide 16/65: Characteristics of Random Variables

    Expectation or expected value, E[X], of an rvar X:

    E[X] = \sum_{x} x f_X(x), or E[X] = \sum_{i} i \Pr\{X = i\}

    Properties of E[X]:

    Linearity: E\left[\sum_{i=1}^{n} X_i\right] = \sum_{i=1}^{n} E[X_i]

    Constant multiplier: E[cX] = c E[X]

    Question: what is E[X - E[X]]?
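    As a quick check of these properties (a minimal simulation sketch, assuming two fair six-sided dice as the rvars; not from the slides):

        import random

        # Empirical check of linearity and the constant-multiplier rule.
        # Note the answer to the question above: E[X - E[X]] = E[X] - E[X] = 0.
        trials = 100_000
        x = [random.randint(1, 6) for _ in range(trials)]
        y = [random.randint(1, 6) for _ in range(trials)]

        e_x = sum(x) / trials
        e_y = sum(y) / trials
        e_sum = sum(a + b for a, b in zip(x, y)) / trials   # E[X + Y]
        e_cx = sum(3 * a for a in x) / trials               # E[3X]

        print(e_sum, e_x + e_y)   # both ~= 7.0:  E[X + Y] = E[X] + E[Y]
        print(e_cx, 3 * e_x)      # both ~= 10.5: E[cX] = c E[X]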

Slide 17/65: Common Discrete Distributions

    Uniform discrete: all values in range equally likely

    Ω = {a, ..., b}, F = 2^Ω, Pr: Pr{X = x} = 1/|Ω|

    Bernoulli or indicator random variable: success or failure in a single trial

    Ω = {0, 1}, F = 2^{{0,1}} = {∅, {0}, {1}, {0, 1}}, Pr: Pr{X = 1} = p, Pr{X = 0} = 1 - p; E[X] = p, Var[X] = p(1 - p)

    Binomial: number of successes in n trials

    Ω = {0, 1, 2, ..., n} (|Ω| = n + 1), F = 2^Ω, Pr: f_X(k) = \binom{n}{k} p^k (1-p)^{n-k}; E[X] = np, Var[X] = np(1 - p)

    Geometric: number of trials up to and including the first success

    Ω = {1, 2, ...}, F = 2^Ω, Pr: f_X(k) = p(1-p)^{k-1}; E[X] = 1/p, Var[X] = (1 - p)/p²
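    A small simulation sketch (an illustration, not from the slides) of the last entry: sample a Geometric(p) rvar as the number of trials up to and including the first success, and compare the empirical mean and variance to E[X] = 1/p and Var[X] = (1 - p)/p²:

        import random

        def geometric(p):
            # Number of trials up to and including the first success.
            k = 1
            while random.random() >= p:   # this trial failed (probability 1 - p)
                k += 1
            return k

        p, trials = 0.25, 100_000
        samples = [geometric(p) for _ in range(trials)]
        mean = sum(samples) / trials
        var = sum((s - mean) ** 2 for s in samples) / trials
        print(mean, 1 / p)              # ~= 4.0
        print(var, (1 - p) / p ** 2)    # ~= 12.0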

Slide 18/65: Useful Mathematical Results

    Some useful results from number theory and combinatorics we'll use later:

    \sum_{i \ge 0} r^i = \frac{1}{1-r} \quad \text{(for } |r| < 1\text{)}

    \sum_{i \ge 1} r^i = \frac{r}{1-r}

    \sum_{i=0}^{m} r^i = \frac{1 - r^{m+1}}{1-r}

    \left(1 - \frac{k}{n}\right) \approx e^{-k/n}, \quad \text{when } k \text{ is small compared to } n

    For any y, 1 + y \le e^{y}

    \sum_{i=1}^{n} \frac{1}{i} = \ln(n) + O(1)
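    A quick numerical check (not from the slides) of the last identity; the O(1) term tends to the Euler-Mascheroni constant, ~0.5772:

        import math

        # H(n) = sum_{i=1..n} 1/i should equal ln(n) + O(1).
        for n in (10, 1_000, 100_000):
            h = sum(1.0 / i for i in range(1, n + 1))
            print(n, h, h - math.log(n))   # the difference settles near 0.577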

Slide 19/65: Lecture Outline

    Motivation

    Probability Theory Refresher

    Example Randomized Algorithm and Analysis

    Tail Distribution Bounds

    Example Application of Tail Bounds

    Chernoff Bounds

    The Probabilistic Method

    Hashing

    Summary of Key Ideas

Slide 20/65: Quicksort

    Input: A list S = {x1, ..., xn} of n distinct elements over a totally ordered universe
    Output: The elements of S in sorted order

    1. If S has one or zero elements, return S. Otherwise continue.
    2. Choose an element of S as a pivot; call it x.
    3. Compare every other element of S to x in order to divide the other elements into two sublists:
       a. S1 has all the elements of S that are less than x;
       b. S2 has all those that are greater than x.
    4. Apply Quicksort to S1 and S2.
    5. Return the list S1, x, S2.

    Probabilistic Analysis of Quicksort

    Worst-case performance is Θ(n²). E.g., if the input list is in decreasing order and the pivot choice rule is "pick first element"

    On the other hand, if the pivot always splits S into lists of approximately equal size, performance is O(n log n)

    Question: assuming we use the "pick first element" pivot choice, and input elements are chosen from a uniform discrete distribution on a range of values, what is the expected number of comparisons?

    i.e., let X be an rvar denoting the number of comparisons; what is E[X]? (See the sketch below.)
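    Below is a runnable sketch of the Quicksort above with the "pick first element" pivot rule, instrumented to count comparisons (the rvar X), fed a uniformly random permutation as the question assumes:

        import random

        def quicksort(s):
            # Returns (sorted list, number of comparisons made).
            if len(s) <= 1:
                return s, 0
            x, rest = s[0], s[1:]            # step 2: pivot = first element
            comparisons = len(rest)          # step 3: compare all others to x
            s1 = [e for e in rest if e < x]
            s2 = [e for e in rest if e > x]
            sorted1, c1 = quicksort(s1)      # step 4
            sorted2, c2 = quicksort(s2)
            return sorted1 + [x] + sorted2, comparisons + c1 + c2

        s = random.sample(range(10_000), 10_000)   # random permutation input
        print(quicksort(s)[1])   # compare to 2n ln n ~= 184,206 for n = 10,000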

Slide 21/65: Probabilistic Analysis of Quicksort

    Theorem.

    If the first list element is always chosen as pivot, and the input is chosen uniformly at random from all possible permutations of values in the input support set, then the expected number of comparisons made by Quicksort is 2n ln n + O(n).

    Proof.

    Given an input set x1, x2, ..., xn chosen uniformly at random from the possible permutations, let y1, y2, ..., yn be the same values sorted in increasing order.

    Let Xij, for i < j, be an indicator rvar that takes on value 1 if yi and yj are compared at any point in the algorithm, and 0 otherwise. The total number of comparisons is the total number of times Xij = 1.

    Let X be an rvar denoting the total number of comparisons of Quicksort. Then,

    X = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} X_{ij} \quad \text{and} \quad E[X] = E\left[\sum_{i=1}^{n-1} \sum_{j=i+1}^{n} X_{ij}\right] = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} E[X_{ij}],

    where we've used the linearity property introduced on slide 16.

Slide 22/65: Probabilistic Analysis of Quicksort

    Theorem.

    If the first list element is always chosen as pivot, and the input is chosen uniformly at random from all possible permutations of values in the input support set, then the expected number of comparisons made by Quicksort is 2n ln n + O(n).

    Proof. (cont'd)

    Since Xij is an indicator rvar, E[Xij] is the probability that Xij = 1 (from slide 17). But recall that Xij = 1 is the event that the two elements yi and yj are compared.

    Two elements yi and yj are compared iff either of them is the first pivot selected by Quicksort from the set Yij = {yi, yi+1, ..., yj}. This is because if any other item in Yij were chosen as a pivot, since that item would lie between yi and yj, it would place yi and yj in different sublists (and they would never be compared to each other).

    Now, the order in the sublists is the same as in the original list (we are in the process of sorting). From the theorem statement, we always choose the first element as pivot; since the input is chosen uniformly at random from all possible permutations, any element of the ordering Yij is equally likely to be first in the (randomly ordered) input sublist.

    Thus the probability that yi or yj is selected as pivot, which is the probability that yi and yj are compared, which is the probability that Xij = 1, which is E[Xij], is (from the definition of the discrete uniform distribution on slide 17) 2/(j - i + 1).

Slide 23/65: Probabilistic Analysis of Quicksort

    Theorem.

    If the first list element is always chosen as pivot, and the input is chosen uniformly at random from all possible permutations of values in the input support set, then the expected number of comparisons made by Quicksort is 2n ln n + O(n).

    Proof. (cont'd)

    Substituting E[Xij] = 2/(j - i + 1) into the expression for E[X] from slide 21:

    E[X] = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \frac{2}{j-i+1}

    = \sum_{i=1}^{n-1} \sum_{k=2}^{n-i+1} \frac{2}{k} \quad \text{(substituting } k = j - i + 1\text{)}

    = \sum_{k=2}^{n} \sum_{i=1}^{n+1-k} \frac{2}{k} \quad \text{(swapping the order of summation)}

    = \sum_{k=2}^{n} (n + 1 - k)\,\frac{2}{k}

    = 2(n+1) \sum_{k=1}^{n} \frac{1}{k} - 4n

    = 2n \ln n + O(n), \quad \text{from slide 18.}
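    As an empirical check (this assumes the quicksort() comparison-counting sketch from slide 20 is in scope), average the count over random permutations and compare with the leading 2n ln n term:

        import math
        import random

        # Assumes quicksort() from the slide-20 sketch.
        n, runs = 2_000, 50
        avg = sum(quicksort(random.sample(range(n), n))[1] for _ in range(runs)) / runs
        print(avg, 2 * n * math.log(n))   # the gap is the (negative) O(n) term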

Slide 24/65: Randomized Quicksort

    What if the inputs are not uniformly random selections of permutations?

    How do we avoid pathological inputs? Pick a random pivot!

    The analysis of the number of comparisons is similar to the foregoing analysis

    Theorem.

    Suppose that, whenever a pivot is chosen for Randomized Quicksort, it is chosen independently and uniformly at random over all possible choices. Then, for any input, the expected number of comparisons made by Randomized Quicksort is 2n ln n + O(n).

    The proof is almost identical to the proof of the expected number of comparisons for deterministic Quicksort with randomized inputs.

    Try doing this proof yourself as an exercise.

Slide 25/65: Lecture Outline

    Motivation

    Probability Theory Refresher

    Example Randomized Algorithm and Analysis

    Tail Distribution Bounds

    Example Application of Tail Bounds

    Chernoff Bounds

    The Probabilistic Method

    Hashing

    Summary of Key Ideas

Slide 26/65: Tail Distribution Bounds

    We've seen one example of a measure for characterizing a distribution: the expectation, E[X], gives us an idea of the average value taken on by an rvar

    Another important characteristic is the tail distribution

    The tail distribution is the probability that an rvar takes on values far from its expectation

    Useful in estimating the probability of failure of randomized algorithms

    Intuitively, one may think of it as Pr{|X - E[X]| ≥ a}

    We will now look at a few different bounds on the tail distribution. Loose bounds don't tell us much; they are, however, often easier to calculate

    Tight(er) bounds give us a narrower range of values, but often require more information

    [Figure: pmf Pr{X = x} versus x, with the tail probability Pr{X ≥ a} shaded to the right of a]

Slide 27/65: Markov's Inequality

    A loose bound that is easy to calculate is Markov's inequality. We can easily calculate Pr{X ≥ a} knowing only the expectation of X

    This, however, often doesn't tell us much!

    We will use a similar argument in the Probabilistic Method later today

    Theorem [Markov's Inequality].

    Let X be a random variable that assumes only nonnegative values. Then, for all a > 0, Pr{X ≥ a} ≤ E[X]/a.

    Proof.

    For a > 0, let I be a Bernoulli/indicator random variable, with I = 1 if X ≥ a, and 0 otherwise. Since X is nonnegative, I ≤ X/a. From slide 17, E[I] = Pr{I = 1} = Pr{X ≥ a}; thus Pr{X ≥ a} ≤ E[X/a] = E[X]/a (from slide 16).

Slide 28/65: Moments

    To derive tighter bounds, we will need the idea of moments of an rvar

    Definition: kth moment. The kth moment of an rvar X is E[X^k];

    k = 1 is termed the first moment, and so on

    Definition: variance. The variance of an rvar X is defined as Var[X] = E[(X - E[X])²]

    Exercise: show that Var[X] = E[X²] - (E[X])²

    Definition: standard deviation. The standard deviation of an rvar X is σ[X] = √Var[X]

Slide 29/65: Chebyshev's Inequality

    Now that we know about Var[X], we can introduce a tighter bound on the tail

    Theorem [Chebyshev's Inequality].

    For any a > 0, Pr{|X - E[X]| ≥ a} ≤ Var[X]/a².

    Proof.

    Pr{|X - E[X]| ≥ a} = Pr{(X - E[X])² ≥ a²}. Since (X - E[X])² is a nonnegative rvar, we can apply Markov's inequality to yield:

    Pr{(X - E[X])² ≥ a²} ≤ E[(X - E[X])²]/a² = Var[X]/a².
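    To see what these bounds look like on a concrete tail (an illustration, not from the slides), take X ~ Binomial(n = 100, p = 1/2) and bound Pr{X ≥ 75} both ways:

        import math

        n, p, a = 100, 0.5, 75
        mean, var = n * p, n * p * (1 - p)    # E[X] = 50, Var[X] = 25

        markov = mean / a                     # Pr{X >= a} <= E[X]/a  ~= 0.67
        chebyshev = var / (a - mean) ** 2     # Pr{|X - 50| >= 25} <= 0.04
        exact = sum(math.comb(n, k) * p ** k * (1 - p) ** (n - k)
                    for k in range(a, n + 1))
        print(markov, chebyshev, exact)       # the exact tail is far smaller still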

Slide 30/65: Lecture Outline

    Motivation

    Probability Theory Refresher

    Example Randomized Algorithm and Analysis

    Tail Distribution Bounds

    Example Application of Tail Bounds

    Chernoff Bounds

    The Probabilistic Method

    Hashing

    Summary of Key Ideas

Slide 31/65: Randomized Algorithm for Median, RM

    Idea: find two nearby elements d and u, spanning a small set C, by sampling S

    Since |C| is o(n/log n), we can sort it in o(n) time using an algorithm that is O(k log k) for k elements

    The check in step 7 is to validate that the set C is indeed small, so that the above assumption holds

    Randomized Median Algorithm

    Input: A set S of n elements over a totally ordered universe
    Output: The median element of S, denoted m.

    1. Pick a (multi-)set R of n^(3/4) elements of S, chosen independently and uniformly at random, with replacement.
    2. Sort the set R.
    3. Let d be the (½n^(3/4) - √n)th smallest element in the sorted set R.
    4. Let u be the (½n^(3/4) + √n)th smallest element in the sorted set R.
    5. By comparing every element in S to d and u, compute the set C = {x ∈ S: d ≤ x ≤ u} and the numbers ld = |{x ∈ S: x < d}| and lu = |{x ∈ S: x > u}|.
    6. If ld > n/2 or lu > n/2 then FAIL.
    7. If |C| ≤ 4n^(3/4) then sort the set C, otherwise FAIL.
    8. Output the (⌊n/2⌋ - ld + 1)th element in the sorted order of C.
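    A direct sketch of RM (positions approximated by integer truncation; returns None on FAIL):

        import math
        import random

        def randomized_median(s):
            n = len(s)
            size = int(n ** 0.75)
            r = sorted(random.choices(s, k=size))              # steps 1-2
            lo = max(int(size / 2 - math.sqrt(n)), 0)          # step 3: position of d
            hi = min(int(size / 2 + math.sqrt(n)), size - 1)   # step 4: position of u
            d, u = r[lo], r[hi]
            c = sorted(x for x in s if d <= x <= u)            # step 5
            ld = sum(1 for x in s if x < d)
            lu = sum(1 for x in s if x > u)
            if ld > n / 2 or lu > n / 2:                       # step 6: FAIL
                return None
            if len(c) > 4 * n ** 0.75:                         # step 7: FAIL
                return None
            return c[n // 2 - ld]                              # step 8 (0-indexed)

        s = random.sample(range(10 ** 9), 100_001)             # n odd, distinct
        print(randomized_median(s) == sorted(s)[len(s) // 2])  # True, w.h.p.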

Slide 32/65: What Is the Probability That RM Fails?

    What can go wrong? The sample might not be representative in terms of the median:

    e1: Y1 = |{r ∈ R | r ≤ m}| < ½n^(3/4) - √n, i.e., too few elements in the sample smaller than m

    e2: Y2 = |{r ∈ R | r ≥ m}| < ½n^(3/4) - √n, i.e., too few elements in the sample larger than m

    e3: |C| > 4n^(3/4), i.e., the sample picked from S has d and u too far apart

    Pr{RM fails} = Pr{e1 ∪ e2 ∪ e3} ≤ Pr{e1} + Pr{e2} + Pr{e3}, by the union bound (slide 10)

    Let's look at determining the probability of event e1

Slide 33/65: Reminder: Bernoulli/Indicator and Binomial

    Bernoulli or indicator rvar: success or failure in a single trial

    Example: coin toss, with rvar X = 1 when heads, X = 0 when tails

    Ω = {0, 1}

    Pr{X = 1} = p, Pr{X = 0} = 1 - p

    E[X] = p

    Var[X] = p(1 - p)

    Binomial rvar: number of successes in n Bernoulli trials of parameter p

    The sum of n Bernoulli(p) rvars is a Binomial(n, p) rvar

    Ω = {0, 1, 2, ..., n} (|Ω| = n + 1)

    f_X(k) = \binom{n}{k} p^k (1-p)^{n-k}

    E[X] = np

    Var[X] = np(1 - p)

Slide 34/65: Determining Pr{e1}

    Let's define an indicator random variable Xi (below)

    The Xi are independent, since from the definition of RM, sampling is with replacement

    By definition, (n-1)/2 + 1 elements in the input set S to RM are no larger than the median (for n odd)

    So, the probability that a random sample is no larger than the median is ((n-1)/2 + 1)/n

    Y1 is an rvar representing the number of items (in the sample R, of size n^(3/4)) no larger than the median m

    We can therefore write Y1 in terms of the Xi as:

    X_i = \begin{cases} 1 & \text{if the } i\text{th sample is} \le m \\ 0 & \text{otherwise} \end{cases}

    \Pr\{X_i = 1\} = \frac{(n-1)/2 + 1}{n} = \frac{1}{2} + \frac{1}{2n}

    Y_1 = \sum_{i=1}^{n^{3/4}} X_i \quad \text{(by definition of the RM algorithm)}

Slide 35/65: Determining the Distribution of Y1

    Recall (slide 33) that the sum of n Bernoulli(p) rvars is Binomial(n, p), so

    Y_1 = \sum_{i=1}^{n^{3/4}} X_i

    f_{Y_1}(y) = \binom{n^{3/4}}{y} \left(\frac{1}{2} + \frac{1}{2n}\right)^{y} \left(\frac{1}{2} - \frac{1}{2n}\right)^{n^{3/4} - y}

    and

    E[Y_1] = n^{3/4} \left(\frac{1}{2} + \frac{1}{2n}\right), \qquad Var[Y_1] = n^{3/4} \left(\frac{1}{2} + \frac{1}{2n}\right)\left(\frac{1}{2} - \frac{1}{2n}\right)

Slide 36/65: Determining Pr{e1}

    Back to determining Pr{e1} (recall: it's one of the events in which RM fails): Pr{e1} = Pr{Y1 < ½n^(3/4) - √n}

    Even though we can determine the distribution of the rvar Y1, evaluating Pr{Y1 < ½n^(3/4) - √n} directly from the Binomial distribution is cumbersome. Instead, note that E[Y1] ≥ ½n^(3/4), so the event e1 implies |Y1 - E[Y1]| > √n, and Chebyshev's inequality (slide 29) gives Pr{e1} ≤ Var[Y1]/(√n)² < ¼ n^(3/4)/n = ¼ n^(-1/4)
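    A tiny numeric sketch of how this bound shrinks with n (using Var[Y1] from slide 35):

        # Chebyshev bound Var[Y1]/n versus the simplified (1/4) n^(-1/4) form.
        for n in (10 ** 4, 10 ** 6, 10 ** 8):
            p = 0.5 + 1 / (2 * n)
            var_y1 = n ** 0.75 * p * (1 - p)
            print(n, var_y1 / n, 0.25 * n ** -0.25)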

Slide 37/65: Lecture Outline

    Motivation

    Probability Theory Refresher

    Example Randomized Algorithm and Analysis

    Tail Distribution Bounds

    Example Application of Tail Bounds

    Chernoff Bounds

    The Probabilistic Method

    Hashing

    Summary of Key Ideas

Slide 38/65: Chernoff Bounds

    Bounds are useful!

    We saw in the previous example how knowing the Chebyshev inequality helped us to quickly answer questions about the probability of failure of a randomized algorithm

    But how tight are the bounds?

    Not all bounds tell us something useful

    Example: Pr{X = x} ≤ 1 is always true for any rvar X and value x, but it tells you nothing

    Chernoff bounds give us tighter bounds on Pr{|X - E[X]| > a}

    [Figure: pmf Pr{X = x} with two bounds on the tail beyond x = a: a loose bound (near 1.0) and a tighter bound closer to Pr{X ≥ a} (if X is a discrete rvar)]

Slide 39/65: Chernoff Bounds

    Unlike the Markov and Chebyshev inequalities, these are a class of bounds: there are Chernoff bounds for different specific distributions

    Chernoff bounds are, however, all formulated in terms of moment generating functions

    Moment generating function of an rvar X: M_X(t) = E[e^{tX}]; M_X(t) uniquely characterizes the distribution

    We will be most interested in the property that E[X^n] = M_X^{(n)}(0),

    i.e., the nth derivative of M_X(t) at t = 0 yields E[X^n]

    Example: moment generating function for a Bernoulli rvar

    (Recall: coin toss, heads or 1 with probability p, tails or 0 with probability 1 - p):

    M_X(t) = E[e^{tX}] = p \cdot e^{t \cdot 1} + (1-p) \cdot e^{t \cdot 0} = p e^t + (1-p)
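    A quick finite-difference check (an illustration, not from the slides) that derivatives of this MGF at t = 0 recover the moments; for an indicator rvar, X^n = X, so every moment is p:

        import math

        p, h = 0.3, 1e-4
        M = lambda t: p * math.exp(t) + (1 - p)    # Bernoulli MGF from above

        d1 = (M(h) - M(-h)) / (2 * h)              # ~ M'(0)  = E[X]
        d2 = (M(h) - 2 * M(0) + M(-h)) / h ** 2    # ~ M''(0) = E[X^2]
        print(d1, d2, p)                           # all ~= 0.3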

Slide 40/65: Chernoff Bounds

    Chernoff bounds generally make use of the following (from Markov's inequality, slide 27):

    \Pr\{X \ge a\} = \Pr\{e^{tX} \ge e^{ta}\} \le \frac{E[e^{tX}]}{e^{ta}} \quad \text{for } t > 0

    \Pr\{X \le a\} = \Pr\{e^{tX} \ge e^{ta}\} \le \frac{E[e^{tX}]}{e^{ta}} \quad \text{for } t < 0

    For X a sum of independent (but not necessarily i.i.d.) indicator rvars, with μ = E[X], the following Chernoff bounds (which can be derived from the above) exist:

    For 0 < δ ≤ 1: \Pr\{X \ge (1+\delta)\mu\} \le e^{-\mu\delta^2/3}

    For 0 < δ < 1: \Pr\{X \le (1-\delta)\mu\} \le e^{-\mu\delta^2/2}
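    To see how much tighter these are than Chebyshev (a sketch assuming the bound forms stated above), compare the two on the upper tail of X ~ Binomial(n, p), a sum of n independent indicators:

        import math

        n, p, delta = 1_000, 0.5, 0.2
        mu = n * p                                  # E[X] = 500

        chernoff = math.exp(-mu * delta ** 2 / 3)   # Pr{X >= (1+d)mu} bound
        chebyshev = n * p * (1 - p) / (delta * mu) ** 2
        print(chernoff, chebyshev)                  # ~1.3e-3 versus 2.5e-2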

Slide 41/65: Chernoff Bounds: Estimating a Parameter

    Problem: You have been asked to create a model of errors on a real communication interconnect

    At high communication speeds, transmitted data may be subject to bit errors

    You want to estimate the probability of a bit error by measurement (e.g., eye diagrams):

    How many measurement samples do you need? Can you state a precise tradeoff between the accuracy of estimate and # of samples?

    [Figure: superposed bit streams ("0"s and "1"s) yield an "eye diagram" exhibiting jitter and noise; the measurement target is a processing element (MSP430F2274) on a 102 mm × 53 mm board, with the majority of the interconnect routed on the bottom layer of the PCB]

Slide 42/65: Chernoff Bounds: Estimating a Parameter

    Estimating the probability of bit error from n measurements: let p be the probability we are trying to estimate, taking n measurements

    Let X = p̃n be the number of measurements in which we observe bit errors (i.e., p̃ = X/n is the estimate)

    If n is sufficiently large, we expect p̃ to be close to p

    Confidence interval: a 1 - γ confidence interval for a parameter p is an interval [p̃ - ε, p̃ + ε] such that

    Pr{p ∈ [p̃ - ε, p̃ + ε]} ≥ 1 - γ, i.e., Pr{np ∈ [n(p̃ - ε), n(p̃ + ε)]} ≥ 1 - γ

    If the actual p does not lie in the interval, i.e., p ∉ [p̃ - ε, p̃ + ε]:

    If p < p̃ - ε, then X > n(p + ε) (since X = np̃)

    If p > p̃ + ε, then X < n(p - ε)

    We can apply the Chernoff bounds for the Binomial we showed earlier: X = np̃, the number of observed errors in n measurements, is Binomial(n, p) distributed

Slide 43/65: Chernoff Bounds: Estimating a Parameter

    Applying the Chernoff bounds:

    \Pr\{p \notin [\tilde{p} - \varepsilon, \tilde{p} + \varepsilon]\} = \Pr\{X < np(1 - \varepsilon/p)\} + \Pr\{X > np(1 + \varepsilon/p)\}

    \le e^{-n\varepsilon^2/(2p)} + e^{-n\varepsilon^2/(3p)} \quad \text{(applying the Binomial Chernoff bounds with } \delta = \varepsilon/p\text{)}

    \le e^{-n\varepsilon^2/2} + e^{-n\varepsilon^2/3} \quad \text{(since } p \le 1 \text{ by definition of probability)}

    So, the probability γ that the real p is more than ε away from the estimated p̃ can be set by performing an appropriate minimum number of measurements, n

    Example: 1 - γ = 0.95, ε = 0.01 → n ≈ 95,430 measurements
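    The example's arithmetic can be reproduced directly (a sketch of the bound just derived): find the smallest n with e^(-nε²/2) + e^(-nε²/3) ≤ γ:

        import math

        gamma, eps = 0.05, 0.01     # 1 - gamma = 0.95, eps = 0.01
        n = 1
        while math.exp(-n * eps ** 2 / 2) + math.exp(-n * eps ** 2 / 3) > gamma:
            n += 1
        print(n)                    # ~95,430 measurements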

Slide 44/65: Other Applications of Parameter Estimation

    Derive Chernoff bounds for the distribution at hand

    You can't always assume the underlying distribution is Gaussian/normal

    Semiconductor process / device models: an important part of the modern IC design flow

    Diminishing device feature sizes (~100s of atoms per transistor at 45 nm) call for statistical models

    Semiconductor fabrication companies ("fab houses") use test chips to characterize processes

    How many test structures does one need to get a certain confidence in parameter estimates?

    More applications: characterizing the probability of device failures: how many measurements do you need?

Slide 45/65: Characterizing Probability of Device Failures

    [Figure: possible interaction paths leading to circuit state disturbance: α-particles and γ-rays from radioactive decay of ²³⁸U and ²³²Th in device packaging mold resin, and of ²¹⁰Po from PbSn solder (and Al wire); cosmic rays produce thermal neutrons and high-energy neutrons (which can penetrate up to 5 ft of concrete), captured within Si and B in integrated circuits and yielding unstable isotopes (¹²C, Lithium, Magnesium). Secondary ions and energetic particles may generate electron-hole pairs in silicon; these may migrate through the device and aggregate, creating current pulses that lead to changes of logic state, e.g., in a microprocessor's program state, alongside electrical and temperature fluctuations]

Slide 46/65: More Applications of Randomized Algorithms

    Hashing: we can use the basic tools introduced in the last two lectures to:

    Determine the expected number of items in a bin

    Bound the maximum number of items in a bin

    Determine the probability of false positives when using hash functions with fingerprints

    Applicable to many areas of design automation (you will see examples later in this course)

    Approximate set membership: Bloom filters. Use probabilistic analysis to determine the tradeoff between space and false-positive probability

    Hamiltonian cycles: Monte Carlo algorithms (will return a Hamiltonian cycle or failure)

Slide 47/65: Lecture Outline

    Motivation

    Probability Theory Refresher

    Example Randomized Algorithm and Analysis

    Tail Distribution Bounds

    Example Application of Tail Bounds

    Chernoff Bounds

    The Probabilistic Method

    Hashing

    Summary of Key Ideas

Slide 48/65: The Probabilistic Method

    A method for proving the existence of objects

    Why is it relevant? The proofs are of a form that enables them to guide the creation of a randomized algorithm for finding the desired object

    Basic idea: construct a sample space such that the probability of selecting the desired object is > 0 (if the probability of picking the desired element is > 0, then the element must exist)

    Alternatively: an rvar X must take on at least one value ≥ E[X], and at least one value ≤ E[X]

    Other approaches: second moment method, Lovász local lemma

Slide 49/65: The Probabilistic Method: Example

    A multiprocessor module (left) and its logical topology (right). We want a grouping of the hardware into two sets, with a maximum number of connecting links

    [Figure: left, the multiprocessor module: MSP430F2274 processing elements on a 102 mm × 53 mm PCB, with the majority of the interconnect routed on the bottom layer; right, the logical topology: nodes cpu0 through cpu23, each labeled with a four-digit code (e.g., cpu6: 0210, cpu2: 0120), joined by links]

Slide 50/65: The Probabilistic Method: Example

    There may also be restrictions on valid topologies due to layout constraints

    We can reformulate this as finding the Maxcut of the topology graph

    Maxcut: a cut of the graph of maximum weight; an NP-hard problem

    We'll use the probabilistic method to prove that a cut with certain properties exists

    We'll then turn the proof into a randomized algorithm for finding the desired topology

    [Figure: an example partition into Partition A and Partition B; this partitioning does not yield the largest number of links for a cut of the topology]

Slide 51/65: The Probabilistic Method: Example

    How we will approach this problem:

    1. Problem: topology partitioning for fault-tolerance
    2. Restate as a Maxcut problem
    3. Existence proof for a Maxcut of value at least m/2
    4. Conversion of the proof into a simple randomized algorithm

Slide 52/65: Probabilistic Method: Problem → Proof

    Theorem [Maxcut].

    Given any undirected graph G = (V, E), with n vertices and m edges, there is a partition of V into two disjoint sets A and B, such that at least m/2 edges connect a vertex in A to a vertex in B, i.e., there is a cut with value at least m/2.

    Proof.

    Construct sets A and B by randomly and independently assigning each vertex to one of the two sets. Let e1, ..., em be an arbitrary enumeration of the edges in G. For i = 1, ..., m, define Xi such that

    X_i = \begin{cases} 1 & \text{if edge } e_i \text{ connects a vertex in } A \text{ to a vertex in } B \\ 0 & \text{otherwise} \end{cases}

    Pr{edge ei connects a vertex in A to a vertex in B} = 1/2 (since we split the vertices into the two sets randomly). Xi is therefore a Bernoulli/indicator rvar with p = 1/2 and E[Xi] = p = 1/2. Let C(A, B) be an rvar denoting the value of the cut between A and B. Then,

    E[C(A, B)] = E\left[\sum_{i=1}^{m} X_i\right] = \sum_{i=1}^{m} E[X_i] = \frac{m}{2}

    Since E[C(A, B)] = m/2, there must be at least one partition with value C(A, B) ≥ m/2.

Slide 53/65: Probabilistic Method: Proof → Algorithm

    Basic procedure → Monte Carlo or Las Vegas algorithm:

    Repeat the basic procedure a fixed number of times; return the best ≥ m/2 cut or FAIL (Monte Carlo)

    Or, repeat the procedure until we find a ≥ m/2 cut (Las Vegas)

    What is the expected number of tries before we find a cut with value ≥ m/2? We can use this as a guide for the number of times to repeat the basic steps until we find a Maxcut or FAIL (i.e., to direct a Monte Carlo algorithm)

    Randomized Maxcut

    Input: A graph G with n vertices and m edges
    Output: A partition of G into two sets A and B such that at least m/2 edges connect A and B.

    1. Randomly choose a partition. This can be done in linear time by scanning through the vertices and flipping a fair coin to pick the destination set as A or B.
    2. Check whether the selected cut is at least m/2, by counting the edges crossing the cut (polynomial time).
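    A sketch of the Las Vegas variant (steps 1 and 2 above, repeated until success), with the graph given as an edge list over vertices 0, ..., n-1:

        import random

        def randomized_maxcut(n, edges):
            m = len(edges)
            while True:
                in_a = [random.random() < 0.5 for _ in range(n)]       # step 1
                cut = sum(1 for u, v in edges if in_a[u] != in_a[v])   # step 2
                if cut >= m / 2:
                    a = [v for v in range(n) if in_a[v]]
                    b = [v for v in range(n) if not in_a[v]]
                    return a, b, cut

        edges = [(i, j) for i in range(10) for j in range(i + 1, 10)]  # K10, m = 45
        print(randomized_maxcut(10, edges)[2])   # at least ceil(45/2) = 23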

Slide 54/65: Probabilistic Method: Algorithm Performance

    Expected number of tries before we find a cut with value ≥ m/2:

    Let p = Pr{C(A, B) ≥ m/2}

    The value of a cut cannot be more than the number of edges, i.e., C(A, B) ≤ m

    The previous proof showed that E[C(A, B)] = m/2, so

    \frac{m}{2} = E[C(A,B)] = \sum_{i < m/2} i \Pr\{C(A,B) = i\} + \sum_{i \ge m/2} i \Pr\{C(A,B) = i\} \le (1-p)\left(\frac{m}{2} - 1\right) + pm,

    which gives p \ge \frac{1}{m/2 + 1}.

    Recall the geometric probability distribution: number of trials up to and including the first success; Ω = {1, 2, ...}, f_X(k) = p(1-p)^{k-1}, E[X] = 1/p

    The expected number of tries before we find such a cut is 1/p, i.e., at most m/2 + 1

Slide 55/65: The Probabilistic Method: Example Recap

    A method for proving the existence of objects

    Why is it relevant? The proofs can be used to guide the construction of a randomized algorithm

    There are also techniques to turn the proofs into deterministic algorithms: derandomization

    What we just saw:

    1. A problem: topology partitioning for fault-tolerance
    2. Restated as a Maxcut problem
    3. Existence proof for a Maxcut of value at least m/2
    4. Constructed a simple randomized algorithm based on the proof
    5. Analyzed the expected running time of the randomized algorithm

    Question: was the algorithm Monte Carlo or Las Vegas?

Slide 56/65: Lecture Outline

    Motivation

    Probability Theory Refresher

    Example Randomized Algorithm and Analysis

    Tail Distribution Bounds

    Example Application of Tail Bounds

    Chernoff Bounds

    The Probabilistic Method

    Hashing

    Summary of Key Ideas

Slide 57/65: Hashing

    Hash tables: a data structure that enables, on average, O(1) insertion and lookup

    Useful when one would like to maintain a set of items, with fast lookup

    Notation:

    Top-level table/array, T[]

    Element for insertion in the hash table, x, from a set U of possible elements

    Key, k, is an identifier for x; assume we can easily map elements to integer keys

    Hash function h(key[x]) specifies the index in T[] where element x should be stored

    Assumptions: simple uniform hashing, i.e., any element is equally likely to hash to any slot

    That is, h(key[x]) distributes the x elements uniformly at random over the slots in T[]

Slide 58/65: Populating the Hash Table

    Simplest approach: direct addressing. One element in T[] for each hash key, when we can afford the space cost

    May make sense when the number of keys to be stored is approximately the number of possible keys, |U|

    Collisions: we want T[] to have about as many elements as we'll insert, n (not as many as exist, |U|)

    We want h() to map the larger set of |U| elements to m slots

    Since m < |U|, it is possible to have multiple elements hash to the same slot

    We can resolve collisions with two different approaches: chain hashing or open addressing

    Chain hashing: keep items that hash to the same slot in a linked list or chain

    We will now need to search through the chain for insert/delete/lookup

    The ratio α = n/m is called the load

    [Figure: elements x1, x2, ..., x6 = {2, 0, 3, 1, 9, 5}, drawn from U = {0, ..., 9}, hashed into bins or slots 0-9 of T]

Slide 59/65: Expected Search Time in Chain Hashing

    Expected number of comparisons (assume new elements are added to the head of the chain, and simple uniform hashing):

    If the element is not already in the hash table (compare to all elements in bin h(key(x))): Θ(1 + α)

    If the element is in the hash table (stop when we find the element in bin h(key(x))): Θ(1 + α):

    Proof.

    Assume the element we seek is equally likely to be any of the n elements in the table. The number of elements examined in a lookup for element x is Lx = 1 + the number of elements in bin h(key(x)) before x; the elements seen in the chain before x were added after x was.

    Now, we can find the average Lx by calculating the expected value over the n possible elements in the table.

    Let xi denote the ith element inserted into the table, i = 1, ..., n, and ki = key(xi). Define an indicator

    X_{ij} = \begin{cases} 1 & \text{if } h(k_i) = h(k_j)\text{, with probability } 1/m \\ 0 & \text{otherwise} \end{cases}

    and E[Xij] = 1/m. Thus

    E\left[\frac{1}{n}\sum_{i=1}^{n} L_{x_i}\right] = E\left[\frac{1}{n}\sum_{i=1}^{n}\left(1 + \sum_{j=i+1}^{n} X_{ij}\right)\right] = \frac{1}{n}\sum_{i=1}^{n}\left(1 + \sum_{j=i+1}^{n} E[X_{ij}]\right) = 1 + \frac{1}{nm}\sum_{i=1}^{n}(n - i) = 1 + \frac{n-1}{2m} = 1 + \frac{\alpha}{2} - \frac{\alpha}{2n}

    Note: not a constant.
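    A simulation sketch of this setup (head-of-chain insertion, with a uniform-looking hash standing in for simple uniform hashing), against the 1 + α/2 - α/(2n) estimate:

        import random

        n, m = 10_000, 2_000                   # n elements, m slots: alpha = 5
        table = [[] for _ in range(m)]
        keys = random.sample(range(10 ** 9), n)
        for k in keys:
            table[k % m].insert(0, k)          # insert at head of chain

        examined = sum(table[k % m].index(k) + 1 for k in keys)
        alpha = n / m
        print(examined / n, 1 + alpha / 2 - alpha / (2 * n))   # both ~= 3.5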

Slide 60/65: Hash Functions and Universal Hashing

    Universal hashing: at runtime, pick the hash function that will be used at random...

    ... from a family of universal hash functions

    Universal hashing gives good average-case behavior: if key k is in the table, the expected length of the chain containing k is at most 1 + α

    Definition [Universal Hash Functions].

    A finite collection, H, of hash functions that map a given universe U of keys into the range {0, 1, ..., m - 1} is said to be universal if, for each pair of distinct keys k, l ∈ U, the number of hash functions h ∈ H for which h(k) = h(l) is at most |H|/m.
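    One classic such family (the Carter-Wegman construction; an illustration, not from the slides): fix a prime p ≥ |U|, draw a ∈ {1, ..., p-1} and b ∈ {0, ..., p-1} at random, and use h(k) = ((ak + b) mod p) mod m:

        import random

        # Picking (a, b) at random selects one h from the universal family H.
        p, m = 2_147_483_647, 100      # p = 2^31 - 1 is prime; m slots
        a = random.randrange(1, p)
        b = random.randrange(0, p)
        h = lambda k: ((a * k + b) % p) % m

        print(h(42), h(43))   # distinct keys collide for at most ~|H|/m of the draws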

Slide 61/65: Other Forms: Perfect Hashing, Bloom Filters

    Perfect hashing: uses two levels of hashing with universal hash functions; second-level hashing upon collision

    Can guarantee no collisions at the second level

    Unlike other forms of hashing, worst-case performance is O(1)

    Bloom filters: a tradeoff between space and false-positive probability

    Insertion: for each element xi to be inserted, calculate k hashes and set T[h1(xi)] ← 1, ..., T[hk(xi)] ← 1

    Checking: calculate the k hashes of element x; if T[h1(x)] = 1, and ..., and T[hk(x)] = 1, then report x as present

    [Figure: T is a bit array, e.g., T: 0 1 1 0 1 0 0 1 ...]

    After inserting n elements (k hashes each) into a table of m bits, the probability of a given element of T[] being zero is (1 - 1/m)^{kn}

    The probability of a false positive is then \left(1 - \left(1 - \frac{1}{m}\right)^{kn}\right)^{k}
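    Evaluating the false-positive expression above for a few choices of k (n items, m bits, k hash functions):

        def false_positive(n, m, k):
            # (1 - (1 - 1/m)^(kn))^k, the estimate derived above
            return (1 - (1 - 1 / m) ** (k * n)) ** k

        for k in (1, 4, 8):
            print(k, false_positive(n=1_000, m=10_000, k=k))
        # ~0.095, ~0.012, ~0.0085: at 10 bits per item, k near (m/n) ln 2 ~ 7 is best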

Slide 62/65: Other Forms: Open Addressing

    All elements are stored in the top-level table T[] itself: no chaining

    α ≤ 1, since the hash table can get full once its m slots are taken by elements. Upon a collision, the hash function defines the next slot to probe until an empty slot is found

    Advantages: no need for the pointers used in chaining: may have more slots for the same memory usage

    Disadvantages:

    Entry deletion is complicated: we can't simply remove an entry, as doing so will affect probe sequences

    Probe sequence strategies: linear probing, quadratic probing, double hashing
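    A minimal linear-probing sketch (an illustration, not from the slides) of insert and lookup; deletion is omitted for the reason just noted:

        def probe_insert(table, key):
            # Linear probing: on collision, try the next slot, wrapping around.
            m = len(table)
            i = key % m                    # initial slot h(key)
            while table[i] is not None:    # assumes the table is not full
                i = (i + 1) % m
            table[i] = key

        def probe_lookup(table, key):
            m = len(table)
            i = key % m
            while table[i] is not None:
                if table[i] == key:
                    return i
                i = (i + 1) % m
            return None                    # reached an empty slot: key absent

        t = [None] * 8
        for k in (3, 11, 19):              # all hash to slot 3: probed into 3, 4, 5
            probe_insert(t, k)
        print(t, probe_lookup(t, 11))      # 11 is found at slot 4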

Slide 63/65: Lecture Outline

    Motivation

    Probability Theory Refresher

    Example Randomized Algorithm and Analysis

    Tail Distribution Bounds

    Example Application of Tail Bounds

    Chernoff Bounds

    The Probabilistic Method

    Hashing

    Summary of Key Ideas

Slide 64/65: Summary: Why Randomized Algorithms and Analyses?

    Why randomized algorithms and analyses?

    Analysis of algorithms that make use of randomness

    Analysis of algorithms in the presence of random input

    Designing algorithms that avoid pathological behavior using random decisions

    Probability review: probability spaces, events, random variables

    Characteristics of random variables: expectation, moments

    Randomized algorithms and probabilistic analysis

    Tail distribution bounds: Markov inequality, Chebyshev inequality, Chernoff bounds

    The Probabilistic Method: proofs → algorithms

    Hashing example and analysis

Slide 65/65: Probing Further...

    Books:

    Kleinberg & Tardos, chapter 13

    Randomized Algorithms (Motwani and Raghavan)

    Probability and Computing (Mitzenmacher and Upfal)