lecture 6-cs648 randomized algorithms

Randomized Algorithms CS648 Lecture 6 Reviewing the last 3 lectures Application of Fingerprinting Techniques 1-dimensional Pattern matching Preparation for the next lecture. 1

Upload: anshul-yadav

Post on 26-Jun-2015


TRANSCRIPT

Page 1: Lecture 6-cs648 Randomized Algorithms

Randomized Algorithms
CS648

Lecture 6
• Reviewing the last 3 lectures
• Application of Fingerprinting Techniques
• 1-dimensional Pattern matching
• Preparation for the next lecture

Page 2: Lecture 6-cs648 Randomized Algorithms

Randomized Algorithms discussed till now

• Randomized algorithm for Approximate Median: randomly select a sample

• Randomized Quick Sort: randomly permute the array

• Freivalds' algorithm for Matrix Product Verification: randomly select a vector

• Randomized algorithm for Equality of two files: randomly select a prime number

Page 3: Lecture 6-cs648 Randomized Algorithms

Randomized Algorithms

How does one go about designing a randomized algorithm ?


Page 4: Lecture 6-cs648 Randomized Algorithms

Randomized Algorithms

Some random idea is required to design a randomized algorithm.


Page 5: Lecture 6-cs648 Randomized Algorithms

Randomized Algorithms

An idea based on insight into the problem

Difficult/impossible to exploit the idea deterministically

Randomization to materialize the idea

A randomized algorithm

Page 6: Lecture 6-cs648 Randomized Algorithms

RANDOMIZED QUICK SORT


Page 7: Lecture 6-cs648 Randomized Algorithms

Randomized Quick Sort

[Figure: the elements of A arranged in increasing order of value, with positions 𝒏/𝟒 and 𝟑𝒏/𝟒 marked and a pivot indicated]

Page 8: Lecture 6-cs648 Randomized Algorithms

Randomized Quick Sort

Observation: There are many elements in A that are good pivots. Is it possible to select one good pivot efficiently?

(not possible deterministically)

We select the pivot element randomly uniformly.

A randomly selected element is a good pivot (one whose rank lies between n/4 and 3n/4) with probability 1/2.
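
The idea of picking the pivot uniformly at random can be sketched as follows. This is a minimal illustration rather than the lecture's own code; the function name and the three-way split are assumptions made for brevity.

```python
import random

def randomized_quicksort(a, rng=random):
    """Return a sorted copy of `a`, choosing every pivot uniformly at random."""
    if len(a) <= 1:
        return list(a)
    pivot = rng.choice(a)                      # uniformly random pivot
    smaller = [x for x in a if x < pivot]
    equal = [x for x in a if x == pivot]
    larger = [x for x in a if x > pivot]
    return randomized_quicksort(smaller, rng) + equal + randomized_quicksort(larger, rng)
```

The output is always correct; only the running time depends on the random choices.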

Page 9: Lecture 6-cs648 Randomized Algorithms

RANDOMIZED ALGORITHM FOR APPROXIMATE MEDIAN


Page 10: Lecture 6-cs648 Randomized Algorithms

Randomized Algorithm for Approximate median

A sample captures the essence of the original population.


Page 11: Lecture 6-cs648 Randomized Algorithms

Randomized Algorithm for Approximate median

Idea: Is it possible to select a small subset of elements whose median approximates the median ?

(not possible deterministically )

The median of a uniformly random sample will be an approximate median.

A random sample captures the essence of the original population.
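
A small sketch of the sampling idea, assuming the sample is drawn with replacement and that `sample_size` is chosen by the caller (both are assumptions, the slides do not fix them here):

```python
import random

def approximate_median(a, sample_size, rng=random):
    """Return the median of a uniformly random sample (with replacement) of `a`."""
    sample = sorted(rng.choice(a) for _ in range(sample_size))
    return sample[len(sample) // 2]
```

Only the sample is sorted, so the work depends on `sample_size` rather than on the size of `a`.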

Page 12: Lecture 6-cs648 Randomized Algorithms

FREIVALDS' TECHNIQUE
APPLICATION

MATRIX PRODUCT VERIFICATION

Page 13: Lecture 6-cs648 Randomized Algorithms

Freivalds' Algorithm

[Figure: testing whether 𝑨 ⨯ 𝑩 ≟ 𝑪 by multiplying both sides with a vector 𝒙]

Page 14: Lecture 6-cs648 Randomized Algorithms

Freivalds' Algorithm
The key idea

Fact: A linear equation ax + b = 0 with a ≠ 0 has a unique solution, which depends on a and b only.

Problem: Suppose you do not know the values of a and b. Your aim is to select a value for x which does not satisfy the equation.

Idea: Consider any two different values. Surely the equation is not satisfied by at least one of them. Can we select that value deterministically?

Randomization used to exploit the idea: select a value randomly uniformly out of the two.

Page 15: Lecture 6-cs648 Randomized Algorithms

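Freivalds' Algorithm
(Analyzing error probability)

𝑫 = (𝑨 ∙ 𝑩 − 𝑪); the algorithm checks whether 𝑫𝒙 = 0 for a random vector 𝒙 = (x_1, …, x_n).

Suppose 𝑫 ≠ 0, and let (d_1, d_2, …, d_n) be a row of 𝑫 with d_1 ≠ 0 (renumbering the columns if necessary). The check can pass only if

d_1 x_1 + d_2 x_2 + … + d_n x_n = 0

Fixing the values of x_2, …, x_n arbitrarily, this is an equation in x_1 alone with a unique solution, so at most one of the two possible values of x_1 satisfies it. Hence the error probability is at most 1/2.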
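
A minimal sketch of Freivalds' check for square matrices given as lists of lists, assuming each entry of the random vector is drawn uniformly from {0, 1} and the test is repeated `trials` times to reduce the error. The helper names and the default number of trials are illustrative.

```python
import random

def mat_vec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in m]

def freivalds(a, b, c, trials=20, rng=random):
    """Return False if A*B != C is detected, True if every trial passes.

    A True answer may be wrong with probability at most 2**-trials.
    """
    n = len(c)
    for _ in range(trials):
        x = [rng.randint(0, 1) for _ in range(n)]
        # Compare A(Bx) with Cx: three matrix-vector products, O(n^2) each.
        if mat_vec(a, mat_vec(b, x)) != mat_vec(c, x):
            return False
    return True
```

Each trial costs O(n²) arithmetic operations, compared with recomputing the full product.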

Page 16: Lecture 6-cs648 Randomized Algorithms

FINGERPRINTING
APPLICATION

CRYPTOGRAPHY

Page 17: Lecture 6-cs648 Randomized Algorithms

Aim: To determine whether File A is identical to File B by communicating the fewest bits.

[Figure: File A and File B held at two different places]

Page 18: Lecture 6-cs648 Randomized Algorithms

How many primes are less than n?

n          Primes less than n
100        25
1000       168
10000      1229
100000     9592
1000000    78498

Page 19: Lecture 6-cs648 Randomized Algorithms

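Key idea from prime numbers

[Figure: a number line from 1 to 4n² log n. A number d with 1 ≤ d ≤ 2^n has less than n prime factors, whereas the interval [1, 4n² log n] contains many more prime numbers.]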

Page 20: Lecture 6-cs648 Randomized Algorithms

Visualize a file as a binary number

Read the bits of File A as a binary number A, and the bits of File B as a binary number B.

Overview of Protocol:
Let p be a prime number selected randomly uniformly from a suitable range.
If A mod p = B mod p then conclude A = B, else conclude A ≠ B.
Error occurs if "p is one of the prime factors of (A − B)".
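
The protocol can be sketched as below. This is an illustration under stated assumptions: files are Python `bytes` read as big-endian integers, the prime is drawn from [2, t] by rejection sampling with trial division, and the names `random_prime`, `fingerprint`, `probably_equal` and the default `t` are hypothetical.

```python
import random

def is_prime(q):
    """Trial-division primality test (adequate for small illustrative ranges)."""
    if q < 2:
        return False
    i = 2
    while i * i <= q:
        if q % i == 0:
            return False
        i += 1
    return True

def random_prime(t, rng=random):
    """Uniformly random prime in [2, t], by rejection sampling."""
    while True:
        q = rng.randint(2, t)
        if is_prime(q):
            return q

def fingerprint(data, p):
    """The file, read as one big binary number, reduced mod p."""
    return int.from_bytes(data, "big") % p

def probably_equal(file_a, file_b, t=10**6, rng=random):
    """Compare fingerprints only; errs exactly when p divides A - B with A != B."""
    p = random_prime(t, rng)
    return fingerprint(file_a, p) == fingerprint(file_b, p)
```

Only the prime and one residue (each about log t bits) need to be communicated. Identical files are never reported as different.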


Page 21: Lecture 6-cs648 Randomized Algorithms

FINGERPRINTING
APPLICATION 3

PATTERN MATCHING

Page 22: Lecture 6-cs648 Randomized Algorithms

Text: T[1..n]
Pattern: P[1..m]

Pattern P is said to appear in Text T at location i if P[j] = T[i+j−1] for all 1 ≤ j ≤ m.

Problem: Given a Text T and a Pattern P, does P appear anywhere in T?

Deterministic Algorithm
• Trivial algorithm: O(mn) time
• Knuth-Morris-Pratt algorithm: O(m+n) time
Randomized Monte Carlo Algorithm
• O(m+n) time, with a small error probability

[Example: Text 100101100110001101111010101110101010111010000101, Pattern 011110101011101; the Pattern appears at location 17]
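
A minimal sketch of the definition and of the trivial algorithm, assuming 1-based locations and Python strings of '0'/'1' characters; the function names are illustrative. On the text and pattern shown on this slide it reports location 17.

```python
def appears_at(text, pattern, i):
    """True if `pattern` appears in `text` at 1-based location i."""
    m = len(pattern)
    return text[i - 1:i - 1 + m] == pattern

def trivial_match(text, pattern):
    """First 1-based location of `pattern` in `text`, or None; O(mn) time."""
    n, m = len(text), len(pattern)
    for i in range(1, n - m + 2):
        if appears_at(text, pattern, i):   # O(m) comparison per location
            return i
    return None
```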

Page 23: Lecture 6-cs648 Randomized Algorithms

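Motivation
• Simplicity, real time implementation, streaming environment
• Extension to 2-dimensions
• Converting Monte Carlo to Las Vegas algorithm

[Figure: 2-dimensional pattern matching. An m⨯m pattern, for example the 4⨯4 grid with rows 1110, 1101, 1011, 1111, is searched for in an n⨯n text; O() time algorithm]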

Page 24: Lecture 6-cs648 Randomized Algorithms

RANDOMIZED ALGORITHM FOR FINGERPRINTING


Page 25: Lecture 6-cs648 Randomized Algorithms

Checking if P appears in Text T at location k

Text: T[1..n]
Pattern: P[1..m]

Observation: An O(m) time algorithm is obvious.

Question: How to do this task in O(1) time?
Answer: Have a fingerprint.

Question: What properties should the fingerprint possess?
• Small size
• Efficiently computable

[Figure: Text 100101100110001101111010101010101010111010000101 with location 𝒌 marked; Pattern 0111101110110101]

Page 26: Lecture 6-cs648 Randomized Algorithms

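Checking if P appears in Text T at location k

Text: T[1..n]
Pattern: P[1..m]

Let x_k be the number whose binary representation is T[k..k+m−1], and let y be the number whose binary representation is P[1..m].
Let p be a prime number selected randomly uniformly from [1, t].
F(x_k) = x_k mod p.
F(y) = y mod p.

If F(x_k) = F(y) then conclude that P appears at k.
Error occurs if "p is one of the prime factors of (x_k − y)".
Error probability at location k ≤ m/π(t), where π(t) is the number of primes in [1, t], since |x_k − y| < 2^m has fewer than m prime factors.
Fingerprint has size O(log t) bits.

Small size but not efficiently computable.

[Figure: the Text with location 𝒌 marked, and the Pattern below it]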

Page 27: Lecture 6-cs648 Randomized Algorithms

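Checking if P appears in Text T at location k

Text: T[1..n]
Pattern: P[1..m]

x_k = the number whose binary representation is T[k..k+m−1]
x_{k+1} = the number whose binary representation is T[k+1..k+m]

Question: Any relation between x_k and x_{k+1}?
Reading T[k] as the most significant bit of x_k: x_{k+1} = 2(x_k − 2^(m−1)·T[k]) + T[k+m].

Question: Any relation between F(x_k) and F(x_{k+1}), where F(x) = x mod p for the randomly chosen prime p?
F(x_{k+1}) = x_{k+1} mod p
= (2(x_k − 2^(m−1)·T[k]) + T[k+m]) mod p
= (2(x_k mod p) − (2^m mod p)·T[k] + T[k+m]) mod p
= (2·F(x_k) − (2^m mod p)·T[k] + T[k+m]) mod p

[Figure: the Text with location 𝒌 marked, and the Pattern below it]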
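
The rolling update of the fingerprint can be sketched as follows. This is an illustration under stated assumptions, not the lecture's code: text and pattern are strings of '0'/'1' characters, the leftmost bit of a window is its most significant bit, fingerprints are residues modulo a random prime from [2, t], and the helper names and the default `t` are hypothetical.

```python
import random

def is_prime(q):
    """Trial-division primality test (fine for q up to about 10**9)."""
    if q < 2:
        return False
    i = 2
    while i * i <= q:
        if q % i == 0:
            return False
        i += 1
    return True

def random_prime(t, rng=random):
    """Uniformly random prime in [2, t], by rejection sampling."""
    while True:
        q = rng.randint(2, t)
        if is_prime(q):
            return q

def fingerprint_matches(text, pattern, t=10**9, rng=random):
    """1-based locations where window and pattern fingerprints agree.

    Monte Carlo: every true match is reported, but a reported location
    may be a false match when p divides the difference of the two numbers.
    """
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    p = random_prime(t, rng)
    top = pow(2, m - 1, p)                 # 2^(m-1) mod p
    fp_pattern = int(pattern, 2) % p
    fp_window = int(text[:m], 2) % p       # fingerprint of the window at location 1
    matches = []
    for k in range(1, n - m + 2):
        if fp_window == fp_pattern:
            matches.append(k)
        if k + m <= n:
            leading = int(text[k - 1])      # bit T[k], leaving the window
            incoming = int(text[k + m - 1]) # bit T[k+m], entering the window
            # Drop the leading bit, shift left by one, append the new bit.
            fp_window = (2 * (fp_window - top * leading) + incoming) % p
    return matches
```

After the first window, each location costs a constant number of operations on numbers smaller than p.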

Page 28: Lecture 6-cs648 Randomized Algorithms

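Fingerprint function: how good is it?

Text: T[1..n]
Pattern: P[1..m]

F(x_k) = x_k mod p
F(y) = y mod p
(x_k: the number represented by T[k..k+m−1]; y: the number represented by P[1..m]; p: a random prime from [1, t])

Lemma: The fingerprint function
• Occupies O(log t) bits.
• Computing F(x_{k+1}) from F(x_k) takes O(log t) bit operations.
• Error probability for any particular location is at most m/π(t), where π(t) is the number of primes in [1, t].

Question: What is the error probability of the algorithm?

[Figure: the Text with location 𝒌 marked, and the Pattern below it]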

Page 29: Lecture 6-cs648 Randomized Algorithms

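Bounding the error probability of the algorithm

E: event that the algorithm fails
E_k: event that the fingerprint shows a false match at a fixed location k

Can you see some relation between E and the E_k's?
E = ∪_k E_k
P(E) = P(∪_k E_k) ≤ Σ_k P(E_k)
≤ n · m/π(t), since the bound m/π(t) on P(E_k) is the same for each k. Here the prime is drawn from [1, t] and π(t) is the number of primes in that range.

Question: How large should t be to ensure that P(E) is below a desired bound?
Answer: Large enough that π(t) exceeds mn divided by that bound; a t polynomial in n suffices for an inverse-polynomial bound.
Fingerprint size: O(log n) bits.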
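
As a worked example of choosing the range of the prime, the sketch below assumes the per-location error is at most m/π(t) (a difference of two m-bit numbers has fewer than m prime factors) and that there are at most n locations, so the union bound gives n·m/π(t). The function names and the doubling search are illustrative choices, not taken from the slides.

```python
import math

def prime_count(t):
    """pi(t): the number of primes <= t, by the sieve of Eratosthenes."""
    if t < 2:
        return 0
    sieve = bytearray([1]) * (t + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, math.isqrt(t) + 1):
        if sieve[i]:
            # Cross out the multiples of i starting from i*i.
            sieve[i * i::i] = bytearray(len(range(i * i, t + 1, i)))
    return sum(sieve)

def range_for_error(n, m, target):
    """Smallest power of two t with n * m / pi(t) < target."""
    t = 2
    while n * m / prime_count(t) >= target:
        t *= 2
    return t
```

For a polynomially small target the resulting t is polynomial in n, so a fingerprint still fits in O(log n) bits.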


Page 30: Lecture 6-cs648 Randomized Algorithms

Final result

Theorem: There is a Monte Carlo randomized algorithm for detecting any match of P[1..m] in T[1..n] that:
• Fails with only a small (inverse-polynomial in n) error probability.
• Performs O(m+n) operations involving O(log n) bit numbers.

It takes O(1) time on the word-RAM model of computation for an operation involving O(log n) bit numbers. So the time complexity of the algorithm is O(m+n).

Homework: It is possible to convert the above algorithm to Las Vegas. Spend some time thinking over it (we shall discuss it in some class).

Page 31: Lecture 6-cs648 Randomized Algorithms

Probability tool (union theorem)

Suppose there is an event E defined over a probability space (Ω, P).
Aim: to get an upper bound on P(E).

If it is difficult to calculate P(E), try to express E as a union of events E_1, …, E_k (usually similar/same) such that
• it is easy to calculate P(E_i).
Then you may bound P(E) using the following inequality:

P(E) = P(E_1 ∪ … ∪ E_k) ≤ P(E_1) + … + P(E_k)

Page 32: Lecture 6-cs648 Randomized Algorithms

APPLICATIONS OF THE UNION THEOREM


Page 33: Lecture 6-cs648 Randomized Algorithms

Balls into Bins

Ball-bin Experiment: There are m balls and n bins. Each ball selects its bin randomly uniformly and independent of other balls and falls into it.

Used in:
• Hashing
• Load balancing in distributed environment

[Figure: balls 1, 2, …, m falling into bins 1, 2, …, i, …, n]

Page 34: Lecture 6-cs648 Randomized Algorithms

Balls into Bins

Ball-bin Experiment: There are m balls and n bins. Each ball selects its bin randomly uniformly and independent of other balls and falls into it.

Theorem: For the case when m = n, with very high probability, every bin has O(log n) balls.

(The proof requires Union theorem and elementary probability. We shall discuss it in the next class. Spend some time to prove it on your own.)
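
The experiment is easy to simulate, which can help build intuition before attempting the proof. A minimal sketch, assuming m balls and n bins; the function name is illustrative.

```python
import random
from collections import Counter

def max_load(m, n, rng=random):
    """Throw m balls into n bins uniformly and independently; return the fullest bin's load."""
    loads = Counter(rng.randrange(n) for _ in range(m))
    return max(loads.values()) if loads else 0
```

Running `max_load(n, n)` for growing n and comparing the result with log n gives an empirical feel for the theorem.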

[Figure: balls 1, 2, …, m falling into bins 1, 2, …, i, …, n]

Page 35: Lecture 6-cs648 Randomized Algorithms

Randomized Quick Sort

Theorem: The probability that Randomized Quick Sort performs more than c n log n comparisons, for a suitable constant c, is less than an inverse polynomial in n.

Tools needed:
1. Union theorem
2. A bound on the probability of getting too few HEADS during a sequence of tosses of a fair coin.

(The proof requires Union theorem and elementary probability. We shall discuss it in the next class. Spend some time to prove it on your own.)