Probabilistic Analysis and Randomized Algorithm

Upload: nathaniel-lambert

Post on 17-Dec-2015


Worst-case analysis vs. probabilistic analysis

Probabilistic analysis needs knowledge of, or an assumption about, the distribution of the inputs.

Indicator random variables

Given a sample space S and an event A, the indicator random variable I{A} associated with event A is defined as:

$$ I\{A\} = \begin{cases} 1 & \text{if } A \text{ occurs,} \\ 0 & \text{otherwise.} \end{cases} $$

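E.g.: Consider flipping a fair coin. Sample space S = {H, T}. Define random variable Y with Pr{Y=H} = Pr{Y=T} = 1/2. We can define an indicator r.v. X_H associated with the coin coming up heads, i.e. Y = H:

$$ X_H = I\{Y = H\} = \begin{cases} 1 & \text{if } Y = H, \\ 0 & \text{if } Y = T. \end{cases} $$

$$ E[X_H] = E[I\{Y = H\}] = 1 \cdot \Pr\{Y = H\} + 0 \cdot \Pr\{Y = T\} = \Pr\{Y = H\} = \tfrac{1}{2} $$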

Lemma: Given a sample space S and an event A in the sample space S, let X_A = I{A}. Then E[X_A] = Pr{A}.

Proof:

$$ E[X_A] = E[I\{A\}] = 1 \cdot \Pr\{A\} + 0 \cdot \Pr\{\bar{A}\} = \Pr\{A\} $$
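The lemma can be checked on a small finite sample space. The sketch below is illustrative only: the helper names `indicator`, `expectation` and `probability` are assumptions made for this example, and the distribution is represented as a plain dictionary from outcomes to exact probabilities.

```python
from fractions import Fraction


def indicator(event):
    """Return I{event} as a function mapping an outcome to 1 or 0."""
    return lambda outcome: 1 if outcome in event else 0


def expectation(rv, distribution):
    """E[rv]: sum of rv(outcome) * Pr{outcome} over the sample space."""
    return sum(rv(outcome) * p for outcome, p in distribution.items())


def probability(event, distribution):
    """Pr{event}: total probability of the outcomes inside the event."""
    return sum(p for outcome, p in distribution.items() if outcome in event)


# Fair coin from the example: S = {H, T}, each outcome with probability 1/2.
coin = {"H": Fraction(1, 2), "T": Fraction(1, 2)}
heads = {"H"}
x_h = indicator(heads)
expected_x_h = expectation(x_h, coin)
print(expected_x_h, probability(heads, coin))
```

Using `Fraction` keeps the arithmetic exact, so the equality E[X_A] = Pr{A} can be compared directly instead of up to rounding.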

Hire-Assistant(n)

    best = 0
    for i = 1 to n
        interview candidate i
        if candidate i is better than candidate best
            best = i
            hire candidate i

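Let $X_i = I\{\text{candidate } i \text{ is hired}\}$ and $X = X_1 + X_2 + \cdots + X_n$.

Lemma: Assuming that the candidates are presented in a random order, algorithm Hire-Assistant has an average-case total hiring cost of $O(c_h \ln n)$, where $c_h$ is the cost of hiring one candidate.

Proof: Candidate i is hired exactly when it is the best of the first i candidates, so $E[X_i] = \Pr\{\text{candidate } i \text{ is hired}\} = 1/i$. Hence

$$ E[X] = \sum_{i=1}^{n} E[X_i] = 1 + \tfrac{1}{2} + \tfrac{1}{3} + \cdots + \tfrac{1}{n} = \ln n + O(1). $$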
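As a sanity check on the 1 + 1/2 + ... + 1/n count, the following sketch averages the number of hires over every ordering of a small candidate pool. It is not part of the original slides: the function names are assumptions, and candidates are modelled simply as distinct integer ranks where a higher rank is better.

```python
from fractions import Fraction
from itertools import permutations


def hire_assistant(ranks):
    """Count hires when candidates are interviewed in the given order.

    ranks[i] is the quality of the i-th candidate interviewed
    (higher is better, all values assumed distinct).
    """
    best = None
    hires = 0
    for rank in ranks:
        if best is None or rank > best:
            best = rank
            hires += 1
    return hires


def expected_hires_exact(n):
    """Average number of hires over all n! equally likely orderings."""
    orders = list(permutations(range(n)))
    return Fraction(sum(hire_assistant(order) for order in orders), len(orders))


def harmonic(n):
    """H_n = 1 + 1/2 + ... + 1/n as an exact fraction."""
    return sum(Fraction(1, i) for i in range(1, n + 1))


print(expected_hires_exact(5), harmonic(5))
```

The brute-force average is only practical for small n, since it enumerates n! orderings; it is meant to illustrate the lemma, not to replace the analysis.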

Randomized-Hire-Assistant(n)

    randomly permute the list of candidates
    best = 0
    for i = 1 to n
        interview candidate i
        if candidate i is better than candidate best
            best = i
            hire candidate i

Lemma: The expected hiring cost of the algorithm Randomized-Hire-Assistant is $O(c_h \ln n)$.

Permute-By-Sorting(A)

    n = A.length
    Let P[1..n] be a new array
    for i = 1 to n
        P[i] = Random(1, n^3)
    sort A, using P as sort keys

After sorting, if P[i] is the j-th smallest one, then A[i] lies in position j of the output.

Lemma: Procedure Permute-By-Sorting produces a uniform random permutation of the input, assuming that all priorities P[i] are distinct.

Proof: Define event Ei : A[i] receives the i-th smallest priority.

Pr{E1∩E2 ∩…∩En-1 ∩En} =

Pr{E1} Pr{E2|E1} Pr{E3|E1 ∩E2 } … Pr{En|E1 ∩E2 ∩…∩En-1 }

Pr{E1}=1/n, Pr{E2|E1}=1/(n-1)

Pr{Ei|E1 ∩E2 ∩…∩Ei-1 } = 1/(n-i+1)

Pr{E1 ∩ E2 ∩ … ∩ En-1 ∩ En} = 1/n!, which is the probability of obtaining the identity permutation.

The same argument holds for any fixed permutation, so each of the n! permutations is produced with probability 1/n!.
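A minimal Python sketch of Permute-By-Sorting follows. It mirrors the pseudocode but uses 0-based lists, and the optional `rng` parameter is an assumption added here so that a seeded generator can be passed in. Priorities can still collide with small probability, which is the case the lemma excludes.

```python
import random


def permute_by_sorting(a, rng=random):
    """Return a new list: the elements of `a` ordered by random priorities."""
    n = len(a)
    # One priority per element, drawn from 1..n^3 as in the pseudocode.
    priorities = [rng.randint(1, n ** 3) for _ in range(n)]
    # Sort the positions by priority, then read the elements in that order.
    order = sorted(range(n), key=lambda i: priorities[i])
    return [a[i] for i in order]


shuffled = permute_by_sorting(list("abcde"), random.Random(0))
print(shuffled)
```

Sorting costs O(n lg n), which is the reason the slides go on to present an in-place method.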

Randomize-In-Place(A): a better method

    n = A.length
    for i = 1 to n
        swap A[i] with A[Random(i, n)]

Lemma: The above procedure computes a uniform random permutation.
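The lemma can be illustrated by enumerating every possible sequence of random choices for a small n. In the sketch below the `rand_index` parameter and the helper `all_outcomes` are assumptions introduced for the demonstration; indices are 0-based, so `Random(i, n)` becomes `rand_index(i, n - 1)`.

```python
import random
from collections import Counter
from itertools import product


def randomize_in_place(a, rand_index=random.randint):
    """Shuffle list `a` in place; rand_index(lo, hi) returns an index in [lo, hi]."""
    n = len(a)
    for i in range(n):
        j = rand_index(i, n - 1)
        a[i], a[j] = a[j], a[i]
    return a


def all_outcomes(n):
    """Run the procedure once for every possible sequence of choices.

    Step i has n - i equally likely choices, so there are n! sequences,
    each occurring with probability 1/n!.
    """
    counts = Counter()
    choice_ranges = [range(i, n) for i in range(n)]
    for choices in product(*choice_ranges):
        remaining = iter(choices)
        a = list(range(n))
        randomize_in_place(a, lambda lo, hi: next(remaining))
        counts[tuple(a)] += 1
    return counts


counts = all_outcomes(4)
print(len(counts), set(counts.values()))
```

Each of the 24 choice sequences for n = 4 yields a different permutation, which is exactly what uniformity requires.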

The birthday paradox: How many people must there be in a room before there is a 50% chance that two of them were born on the same day of the year?

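(1) Suppose there are k people and there are n days in a year. Let bi be the i-th person's birthday, i = 1,…,k. Birthdays are assumed independent and uniformly distributed.

Pr{bi = r} = 1/n, for i = 1,…,k and r = 1,2,…,n

Pr{bi = r, bj = r} = Pr{bi = r} · Pr{bj = r} = 1/n^2

$$ \Pr\{b_i = b_j\} = \sum_{r=1}^{n} \Pr\{b_i = r,\ b_j = r\} = \sum_{r=1}^{n} \frac{1}{n^2} = \frac{1}{n} $$

Define event Ai : Person i's birthday is different from person j's for all j < i.

$$ B_k = \bigcap_{i=1}^{k} A_i : \text{the event that } k \text{ people have distinct birthdays}, \qquad B_k = A_k \cap B_{k-1} $$

Pr{Bk} = Pr{Bk-1 ∩ Ak} = Pr{Bk-1} Pr{Ak | Bk-1}, where Pr{B1} = Pr{A1} = 1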

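$$
\begin{aligned}
\Pr\{B_k\} &= \Pr\{B_{k-1}\}\,\Pr\{A_k \mid B_{k-1}\} \\
&= \Pr\{B_{k-2}\}\,\Pr\{A_{k-1} \mid B_{k-2}\}\,\Pr\{A_k \mid B_{k-1}\} \\
&\;\;\vdots \\
&= \Pr\{B_1\}\,\Pr\{A_2 \mid B_1\}\,\Pr\{A_3 \mid B_2\} \cdots \Pr\{A_k \mid B_{k-1}\} \\
&= 1 \cdot \left(\frac{n-1}{n}\right)\left(\frac{n-2}{n}\right) \cdots \left(\frac{n-k+1}{n}\right) \\
&= 1 \cdot \left(1-\frac{1}{n}\right)\left(1-\frac{2}{n}\right) \cdots \left(1-\frac{k-1}{n}\right) \\
&\le e^{-1/n}\, e^{-2/n} \cdots e^{-(k-1)/n} \qquad (\text{since } 1 + x \le e^{x}) \\
&= e^{-\sum_{i=1}^{k-1} i/n} = e^{-k(k-1)/2n} \\
&\le \tfrac{1}{2} \qquad \text{when } -k(k-1)/2n \le \ln(1/2).
\end{aligned}
$$

For $k(k-1) \ge 2n \ln 2$, i.e. $k \ge \left(1 + \sqrt{1 + (8 \ln 2)\,n}\right)/2$, we have the prob. that all k birthdays are distinct at most 1/2. For n = 365, k ≥ 23.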
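The threshold can be reproduced numerically. The sketch below computes Pr{B_k} exactly as the product (1 - 1/n)(1 - 2/n)...(1 - (k-1)/n) and compares it with the bound k ≥ (1 + sqrt(1 + (8 ln 2) n))/2. The function names are assumptions made for this illustration.

```python
import math
from fractions import Fraction


def prob_all_distinct(k, n=365):
    """Pr{B_k}: probability that k people all have distinct birthdays."""
    p = Fraction(1)
    for i in range(1, k):
        p *= Fraction(n - i, n)
    return p


def smallest_k_exact(n=365):
    """Smallest k for which Pr{B_k} is at most 1/2."""
    k = 1
    while prob_all_distinct(k, n) > Fraction(1, 2):
        k += 1
    return k


def k_from_bound(n=365):
    """Smallest integer k with k(k-1) >= 2 n ln 2."""
    return math.ceil((1 + math.sqrt(1 + 8 * math.log(2) * n)) / 2)


print(smallest_k_exact(365), k_from_bound(365))
print(float(prob_all_distinct(23)))
```

For n = 365 the exact computation and the exponential bound give the same threshold, although in general the bound is only a sufficient condition.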

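(2) Analysis using indicator random variables

For each pair (i, j) of the k people in the room, define the indicator r.v. Xij, for 1 ≤ i < j ≤ k, by

$$ X_{ij} = I\{\text{person } i \text{ and person } j \text{ have the same birthday}\} = \begin{cases} 1 & \text{if person } i \text{ and person } j \text{ have the same birthday,} \\ 0 & \text{otherwise.} \end{cases} $$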

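$$ E[X_{ij}] = \Pr\{\text{person } i \text{ and person } j \text{ have the same birthday}\} = 1/n $$

Let $X = \sum_{i=1}^{k} \sum_{j=i+1}^{k} X_{ij}$ be the number of pairs with the same birthday.

$$ E[X] = E\left[\sum_{i=1}^{k} \sum_{j=i+1}^{k} X_{ij}\right] = \sum_{i=1}^{k} \sum_{j=i+1}^{k} E[X_{ij}] = \binom{k}{2} \frac{1}{n} = \frac{k(k-1)}{2n} $$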

When k(k-1) ≥ 2n, the expected number of pairs of people with the same birthday is at least 1

$$ k(k-1) \ge 2n \iff k^2 - k - 2n \ge 0 \iff k \ge \frac{1 + \sqrt{1 + 8n}}{2} $$

For n = 365, k ≥ 28: we expect to find at least one matching pair.
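The formula E[X] = k(k-1)/(2n) can be verified by brute force on a tiny instance and then used to recover the threshold for n = 365. This is a sketch with assumed function names; the brute-force routine enumerates all n^k birthday assignments, so it is only usable for very small k and n.

```python
from fractions import Fraction
from itertools import product


def expected_matching_pairs(k, n=365):
    """E[X] = k(k-1) / (2n): expected number of pairs sharing a birthday."""
    return Fraction(k * (k - 1), 2 * n)


def smallest_k_with_expected_pair(n=365):
    """Smallest k whose expected number of matching pairs is at least 1."""
    k = 1
    while expected_matching_pairs(k, n) < 1:
        k += 1
    return k


def brute_force_expected_pairs(k, n):
    """Average the number of matching pairs over all n^k assignments."""
    total = 0
    for birthdays in product(range(n), repeat=k):
        total += sum(
            1
            for i in range(k)
            for j in range(i + 1, k)
            if birthdays[i] == birthdays[j]
        )
    return Fraction(total, n ** k)


print(smallest_k_with_expected_pair(365))
print(brute_force_expected_pairs(3, 4), expected_matching_pairs(3, 4))
```

Note that the two analyses answer slightly different questions: 23 people give a probability of at least 1/2 of some match, while 28 people give an expected count of at least one matching pair.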

Balls and bins problem: Randomly toss identical balls into b bins, numbered 1,2,…,b. The probability that a tossed ball lands in any given bin is 1/b.

(a) How many balls fall in a given bin? If n balls are tossed, the expected number of balls that fall in the given bin is n/b.

(b) How many balls must one toss, on the average, until a given bin contains a ball? The number of tosses follows the geometric distribution with probability 1/b.

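Let e denote the expected number of tosses (here e is not the base of the natural logarithm). Then

$$
\begin{aligned}
e &= 1 \cdot \tfrac{1}{b} + 2\left(1-\tfrac{1}{b}\right)\tfrac{1}{b} + 3\left(1-\tfrac{1}{b}\right)^{2}\tfrac{1}{b} + \cdots \\
\left(1-\tfrac{1}{b}\right) e &= \left(1-\tfrac{1}{b}\right)\tfrac{1}{b} + 2\left(1-\tfrac{1}{b}\right)^{2}\tfrac{1}{b} + \cdots \\
e - \left(1-\tfrac{1}{b}\right) e &= \tfrac{1}{b}\left[1 + \left(1-\tfrac{1}{b}\right) + \left(1-\tfrac{1}{b}\right)^{2} + \cdots\right] = \tfrac{1}{b} \cdot \frac{1}{1-\left(1-\frac{1}{b}\right)} = 1 \\
\tfrac{e}{b} &= 1 \;\Longrightarrow\; e = b
\end{aligned}
$$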

(c) (Coupon collector’s problem) How many balls must one toss until every bin contains at least one ball?

Call a toss in which a ball falls into an empty bin a "hit". We want to know the expected number n of tosses required to get b hits.

The ith stage consists of the tosses after the (i-1)st hit until the ith hit

For each toss during the ith stage, there are i-1 bins that contain balls and b-i+1 empty bins

Thus, for each toss in the ith stage, the probability of obtaining a hit is (b-i+1)/b

Let ni be the number of tosses in the ith stage. Thus the number of tosses required to get b hits is n = ∑_{i=1}^{b} ni.

Each ni has a geometric distribution with probability of success (b-i+1)/b, so E[ni] = b/(b-i+1).

$$ E[n] = E\left[\sum_{i=1}^{b} n_i\right] = \sum_{i=1}^{b} E[n_i] = \sum_{i=1}^{b} \frac{b}{b-i+1} = b \sum_{i=1}^{b} \frac{1}{i} = b\,(\ln b + O(1)) = O(b \ln b) $$
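The stage-by-stage sum is easy to evaluate exactly for small b. The following sketch, with assumed function names, adds up E[n_i] = b/(b-i+1) over the b stages and compares the result with b times the harmonic number H_b.

```python
from fractions import Fraction


def expected_tosses_to_fill(b):
    """E[n]: sum over stages i = 1..b of b / (b - i + 1)."""
    return sum(Fraction(b, b - i + 1) for i in range(1, b + 1))


def harmonic(b):
    """H_b = 1 + 1/2 + ... + 1/b as an exact fraction."""
    return sum(Fraction(1, i) for i in range(1, b + 1))


for bins in (1, 2, 3, 10):
    print(bins, expected_tosses_to_fill(bins), float(expected_tosses_to_fill(bins)))
```

For two bins, for instance, the first toss always hits and the second stage takes two tosses on average, giving three tosses in total.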

Streaks

Flip a fair coin n times. What is the expected length of the longest streak of consecutive heads? Ans: Θ(lg n)

Let Aik be the event that a streak of heads of length at least k begins with the ith coin flip.

For j = 0,1,2,…,n, let Lj be the event that the longest streak of heads has length exactly j, and let L be the length of the longest streak.

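$$ \Pr\{A_{ik}\} = 1/2^{k} $$

For $k = 2\lceil \lg n \rceil$,

$$ \Pr\{A_{i,\,2\lceil \lg n \rceil}\} = 1/2^{2\lceil \lg n \rceil} \le 1/2^{2 \lg n} = 1/n^{2} $$

$$ E[L] = \sum_{j=0}^{n} j \,\Pr\{L_j\} $$

A streak can begin in fewer than n positions, so by the union bound the probability that a streak of heads of length at least $2\lceil \lg n \rceil$ begins anywhere is less than $n \cdot 1/n^{2} = 1/n$.

Note that the events $L_j$ for $j = 0, 1, \ldots, n$ are disjoint. So the probability that a streak of heads of length $2\lceil \lg n \rceil$ or more begins anywhere is

$$ \sum_{j=2\lceil \lg n \rceil}^{n} \Pr\{L_j\} < 1/n . $$

Thus $\sum_{j=2\lceil \lg n \rceil}^{n} \Pr\{L_j\} < 1/n$, while $\sum_{j=0}^{n} \Pr\{L_j\} = 1$. We have $\sum_{j=0}^{2\lceil \lg n \rceil - 1} \Pr\{L_j\} \le 1$.

$$
\begin{aligned}
E[L] &= \sum_{j=0}^{n} j \,\Pr\{L_j\} \\
&= \sum_{j=0}^{2\lceil \lg n \rceil - 1} j \,\Pr\{L_j\} + \sum_{j=2\lceil \lg n \rceil}^{n} j \,\Pr\{L_j\} \\
&< \sum_{j=0}^{2\lceil \lg n \rceil - 1} (2\lceil \lg n \rceil) \Pr\{L_j\} + \sum_{j=2\lceil \lg n \rceil}^{n} n \,\Pr\{L_j\} \\
&= 2\lceil \lg n \rceil \sum_{j=0}^{2\lceil \lg n \rceil - 1} \Pr\{L_j\} + n \sum_{j=2\lceil \lg n \rceil}^{n} \Pr\{L_j\} \\
&< 2\lceil \lg n \rceil \cdot 1 + n \cdot (1/n) = O(\lg n)
\end{aligned}
$$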

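More generally, for $r \ge 1$:

$$ \Pr\{A_{i,\,r\lceil \lg n \rceil}\} = 1/2^{r\lceil \lg n \rceil} \le 1/n^{r} $$

The probability is at most $n/n^{r} = 1/n^{r-1}$ that the longest streak is at least $r\lceil \lg n \rceil$.

Claim: The expected length of the longest streak of heads in n coin flips is $\Omega(\lg n)$.

We look for streaks of length s by partitioning the n flips into approximately n/s groups of s flips each. Take $s = \lfloor (\lg n)/2 \rfloor$.

$$ \Pr\{A_{i,\,\lfloor (\lg n)/2 \rfloor}\} = 1/2^{\lfloor (\lg n)/2 \rfloor} \ge 1/\sqrt{n} $$

The probability that a streak of heads of length $\lfloor (\lg n)/2 \rfloor$ does not begin in position i is at most $1 - 1/\sqrt{n}$.

The groups are formed from mutually exclusive, independent coin flips, so the prob. that every one of the groups fails to be a streak of length $\lfloor (\lg n)/2 \rfloor$ is at most

$$
\begin{aligned}
\left(1 - 1/\sqrt{n}\right)^{\lfloor n/\lfloor (\lg n)/2 \rfloor \rfloor}
&\le \left(1 - 1/\sqrt{n}\right)^{n/\lfloor (\lg n)/2 \rfloor - 1} \\
&\le \left(1 - 1/\sqrt{n}\right)^{2n/\lg n - 1} \\
&\le e^{-(2n/\lg n - 1)/\sqrt{n}} = O(e^{-\lg n}) = O(1/n)
\end{aligned}
$$

Thus, the prob. that the longest streak exceeds $\lfloor (\lg n)/2 \rfloor$ is (Why?)

$$ \sum_{j=\lfloor (\lg n)/2 \rfloor + 1}^{n} \Pr\{L_j\} \ge 1 - O(1/n) $$

$$
\begin{aligned}
E[L] &= \sum_{j=0}^{n} j \,\Pr\{L_j\} \\
&= \sum_{j=0}^{\lfloor (\lg n)/2 \rfloor} j \,\Pr\{L_j\} + \sum_{j=\lfloor (\lg n)/2 \rfloor + 1}^{n} j \,\Pr\{L_j\} \\
&\ge \sum_{j=0}^{\lfloor (\lg n)/2 \rfloor} 0 \cdot \Pr\{L_j\} + \sum_{j=\lfloor (\lg n)/2 \rfloor + 1}^{n} \lfloor (\lg n)/2 \rfloor \Pr\{L_j\} \\
&= \lfloor (\lg n)/2 \rfloor \sum_{j=\lfloor (\lg n)/2 \rfloor + 1}^{n} \Pr\{L_j\} \\
&\ge \lfloor (\lg n)/2 \rfloor \left(1 - O(1/n)\right) = \Omega(\lg n)
\end{aligned}
$$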

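Using indicator r.v.:

Let $X_{ik} = I\{A_{ik}\}$ and let $X = \sum_{i=1}^{n-k+1} X_{ik}$.

$$ E[X] = E\left[\sum_{i=1}^{n-k+1} X_{ik}\right] = \sum_{i=1}^{n-k+1} E[X_{ik}] = \sum_{i=1}^{n-k+1} \Pr\{A_{ik}\} = \sum_{i=1}^{n-k+1} 1/2^{k} = \frac{n-k+1}{2^{k}} $$

If $k = c \lg n$, for some constant $c$,

$$ E[X] = \frac{n - c \lg n + 1}{2^{c \lg n}} = \frac{n - c \lg n + 1}{n^{c}} = \frac{1}{n^{c-1}} - \frac{c \lg n - 1}{n^{c}} = \Theta(1/n^{c-1}) $$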

If c is large, the expected number of streaks of length c lg n is very small.

Therefore, a streak of such a length is very unlikely to occur.

If $c = 1/2$, then we obtain $E[X] = \Theta(1/n^{1/2 - 1}) = \Theta(n^{1/2})$, and we expect that there will be a large number of streaks of length $(1/2) \lg n$.

Conclusion: The length of the longest streak is $\Theta(\lg n)$. ■
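To get a feel for the Θ(lg n) behaviour, one can measure streaks directly. The sketch below is an illustration only: the function names, the 'H'/'T' encoding of flips and the seeded generator are all assumptions, and the simulated average is an estimate rather than an exact value.

```python
import random
from fractions import Fraction


def longest_head_streak(flips):
    """Length of the longest run of consecutive 'H' in an iterable of 'H'/'T'."""
    best = current = 0
    for flip in flips:
        current = current + 1 if flip == "H" else 0
        best = max(best, current)
    return best


def expected_streak_starts(n, k):
    """E[X] = (n - k + 1) / 2^k: expected number of positions where k heads begin."""
    return Fraction(n - k + 1, 2 ** k)


def average_longest_streak(n, trials, rng):
    """Estimate E[L] by averaging the longest streak over random experiments."""
    total = 0
    for _ in range(trials):
        flips = [rng.choice("HT") for _ in range(n)]
        total += longest_head_streak(flips)
    return total / trials


estimate = average_longest_streak(1024, 200, random.Random(1))
print(estimate)  # compare with lg 1024 = 10
print(expected_streak_starts(1024, 5), expected_streak_starts(1024, 20))
```

With n = 1024, the expected number of starting positions for 5 heads in a row is large while for 20 heads it is far below one, matching the c = 1/2 versus large c discussion.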

The on-line hiring problem:

To hire an assistant, an employment agency sends one candidate each day. After interviewing that person you decide to either hire that person or not. The process stops when a person is hired.

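What is the trade-off between minimizing the amount of interviewing and maximizing the quality of the candidate hired?

Strategy: interview and reject the first k applicants, then hire the first applicant thereafter whose score is higher than that of all preceding applicants (hire the last applicant if none qualifies). What is the best k?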

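Let M(j) = max_{1≤i≤j} {score(i)}.

Let S be the event that the best-qualified applicant is chosen.

Let Si be the event that we succeed when the best-qualified applicant is the i-th one interviewed.

The Si are disjoint and we have Pr{S} = ∑_{i=1}^{n} Pr{Si}.

We never succeed when the best-qualified applicant is one of the first k, so Pr{Si} = 0 for i = 1,…,k and thus

Pr{S} = ∑_{i=k+1}^{n} Pr{Si}.

Let Bi be the event that the best-qualified applicant is in position i.

Let Oi denote the event that none of the applicants in positions k+1 through i-1 are chosen.

If Si happens, then Bi and Oi must both happen.

Bi and Oi are independent! Why? Oi depends only on the relative ordering of the scores in positions 1 through i-1, whereas Bi depends only on whether the score in position i is greater than all the other scores.

Pr{Si} = Pr{Bi ∩ Oi} = Pr{Bi} Pr{Oi}.

Clearly, Pr{Bi} = 1/n.

Pr{Oi} = k/(i-1). Why? Oi occurs exactly when the best of the first i-1 applicants is among the first k positions, and that applicant is equally likely to be in any of the i-1 positions.

Thus Pr{Si} = k/(n(i-1)).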

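$$ \Pr\{S\} = \sum_{i=k+1}^{n} \Pr\{S_i\} = \sum_{i=k+1}^{n} \frac{k}{n(i-1)} = \frac{k}{n} \sum_{i=k+1}^{n} \frac{1}{i-1} = \frac{k}{n} \sum_{i=k}^{n-1} \frac{1}{i} $$

$$ \int_{k}^{n} \frac{1}{x}\,dx \;\le\; \sum_{i=k}^{n-1} \frac{1}{i} \;\le\; \int_{k-1}^{n-1} \frac{1}{x}\,dx $$

$$ \frac{k}{n}(\ln n - \ln k) \;\le\; \Pr\{S\} \;\le\; \frac{k}{n}\left(\ln(n-1) - \ln(k-1)\right) $$

Differentiate $\frac{k}{n}(\ln n - \ln k)$ with respect to k. We have $\frac{1}{n}(\ln n - \ln k - 1) = 0$.

Thus $k = n/e$ and $\Pr\{S\} \ge 1/e$.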
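The closed form Pr{S} = (k/n) ∑_{i=k+1}^{n} 1/(i-1) can be cross-checked against a direct enumeration of all orderings for small n. The sketch below assumes the reject-the-first-k strategy with the last applicant hired by default; all function names are hypothetical and scores are modelled as distinct integers.

```python
import math
from fractions import Fraction
from itertools import permutations


def on_line_maximum(scores, k):
    """Reject the first k applicants, then hire the first one who beats them all.

    Returns the index of the applicant hired (the last one if nobody qualifies).
    """
    best_seen = max(scores[:k], default=float("-inf"))
    for i in range(k, len(scores)):
        if scores[i] > best_seen:
            return i
    return len(scores) - 1


def success_probability_formula(n, k):
    """Pr{S} = (k/n) * sum_{i=k+1}^{n} 1/(i-1), for 1 <= k < n."""
    return Fraction(k, n) * sum(Fraction(1, i - 1) for i in range(k + 1, n + 1))


def success_probability_brute_force(n, k):
    """Fraction of the n! orderings in which the best applicant is hired."""
    orders = list(permutations(range(n)))
    wins = sum(1 for order in orders if order[on_line_maximum(order, k)] == n - 1)
    return Fraction(wins, len(orders))


def best_k(n):
    """The k in 1..n-1 that maximizes the formula for Pr{S}."""
    return max(range(1, n), key=lambda k: success_probability_formula(n, k))


print(best_k(100), 100 / math.e)
print(float(success_probability_formula(100, best_k(100))), 1 / math.e)
```

The exact maximizer is an integer close to n/e, and the success probability at that k stays slightly above 1/e, consistent with the bound derived from the integral.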