Randomized Algorithms CS648, Lecture 6: Reviewing the last 3 lectures; Application of Fingerprinting Techniques (1-dimensional Pattern matching); Preparation for the next lecture.

Upload: brice-mccormick

Post on 18-Dec-2015

Page 1

Randomized Algorithms
CS648

Lecture 6
• Reviewing the last 3 lectures
• Application of Fingerprinting Techniques
• 1-dimensional Pattern matching
• Preparation for the next lecture

Page 2

Randomized Algorithms discussed till now

• Randomized algorithm for Approximate Median: randomly select a sample
• Randomized Quick Sort: randomly permute the array
• Freivalds' algorithm for Matrix Product Verification: randomly select a vector
• Randomized algorithm for Equality of two files: randomly select a prime number

Page 3

Randomized Algorithms

How does one go about designing a randomized algorithm?

Page 4

Randomized Algorithms

Some random idea is required to design a randomized algorithm.

Page 5

Randomized Algorithms

An idea based on insight into the problem
→ Difficult/impossible to exploit the idea deterministically
→ Randomization to materialize the idea
→ A randomized algorithm

Page 6

RANDOMIZED QUICK SORT

Page 7

Randomized Quick Sort

[Figure: the elements of A arranged in increasing order of values, positions 1 … n, with positions n/4 and 3n/4 and the pivot marked.]

Page 8

Randomized Quick Sort

Observation: There are many elements in A that are good pivots (elements whose rank lies between n/4 and 3n/4).
Question: Is it possible to select one good pivot efficiently? (Not possible deterministically.)

We select the pivot element uniformly at random.

A randomly selected element is a good pivot with probability 1/2.
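The random pivot choice can be sketched as follows. This is a minimal illustration, not the lecture's own code: it assumes a list-based (not in-place) quicksort, and the names `randomized_quicksort` and `good_pivot_fraction` are hypothetical. The second function simply counts the ranks between n/4 and 3n/4, which is about half of them.

```python
import random


def randomized_quicksort(items, rng=random):
    """Return a sorted copy of items, picking every pivot uniformly at random."""
    if len(items) <= 1:
        return list(items)
    pivot = rng.choice(items)  # the only random step
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    larger = [x for x in items if x > pivot]
    return randomized_quicksort(smaller, rng) + equal + randomized_quicksort(larger, rng)


def good_pivot_fraction(n):
    """Fraction of ranks 1..n that lie between n/4 and 3n/4 (assumed meaning of 'good')."""
    good = sum(1 for rank in range(1, n + 1) if n / 4 <= rank <= 3 * n / 4)
    return good / n


if __name__ == "__main__":
    print(randomized_quicksort([5, 3, 8, 1, 3, 9, 2]))
    print(good_pivot_fraction(1000))
```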

Page 9

RANDOMIZED ALGORITHM FOR APPROXIMATE MEDIAN

Page 10

Randomized Algorithm for Approximate median

A sample captures the essence of the original population.

Page 11

Randomized Algorithm for Approximate median

Idea: Is it possible to select a small subset of elements whose median approximates the median? (Not possible deterministically.)

The median of a uniformly random sample will be an approximate median.

A random sample captures the essence of the original population.
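The sampling idea can be sketched in a few lines. This is an illustration under assumptions: the sample is drawn with replacement, the sample size is a free parameter, and the names `approximate_median` and `rank_fraction` are made up for this example.

```python
import random


def approximate_median(values, sample_size, rng=random):
    """Return the median of a uniformly random sample (drawn with replacement)."""
    sample = sorted(rng.choice(values) for _ in range(sample_size))
    return sample[len(sample) // 2]


def rank_fraction(values, x):
    """Fraction of the values that are <= x; about 0.5 for a true median."""
    return sum(1 for v in values if v <= x) / len(values)


if __name__ == "__main__":
    data = list(range(10001))
    guess = approximate_median(data, 1001, random.Random(0))
    print(guess, rank_fraction(data, guess))
```

The sample is far smaller than the input, yet the rank of the returned element tends to concentrate around the middle as the sample grows.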

Page 12

FREIVALDS' TECHNIQUE
APPLICATION

MATRIX PRODUCT VERIFICATION

Page 13

Freivalds' Algorithm

[Figure: testing whether A ⨯ B ≟ C using a vector x; the vectors in the figure are labelled x, y and z.]

Page 14

Freivalds' Algorithm
The key idea

Fact: An equation $a x + b = 0$ with $a \neq 0$ has a unique solution, depending upon $a$ and $b$ only.

Problem: Suppose you do not know the values of $a$ and $b$. Your aim is to select a value for $x$ which does not satisfy the corresponding equation.

Idea: Consider any two different values $\{v_1, v_2\}$. Surely the equation is not satisfied for at least one of $\{v_1, v_2\}$. Can we select that value deterministically?

Randomization used to exploit the idea: select a value for $x$ uniformly at random out of $\{v_1, v_2\}$. The equation is then violated with probability at least 1/2.

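Page 15

Freivalds' Algorithm
(Analyzing error probability)

Let $D = (A \cdot B - C)$ and $x = (x_1, x_2, \ldots, x_n)$. The algorithm errs only if $D \neq 0$ but $D x = 0$.

If $D \neq 0$, some row $i$ of $D$ has a nonzero entry; relabel the columns so that $d_{i1} \neq 0$. Row $i$ of $D x = 0$ reads

$d_{i1} x_1 + d_{i2} x_2 + \ldots + d_{in} x_n = 0$

Fixing the values of $x_2, \ldots, x_n$ arbitrarily, this becomes an equation in $x_1$ alone:

$d_{i1} x_1 + (d_{i2} x_2 + \ldots + d_{in} x_n) = 0$

It has a unique solution, so if $x_1$ is chosen uniformly at random from two values, the equation holds with probability at most 1/2. Hence the error probability is at most 1/2.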
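A minimal sketch of the resulting test, assuming square integer matrices stored as lists of rows and a random 0/1 vector x. The function names and the `trials` parameter are illustrative choices, not part of the slides. Each trial costs three matrix-vector products instead of one matrix-matrix product, and repeating the test shrinks the error probability.

```python
import random


def mat_vec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]


def freivalds_check(A, B, C, trials=20, rng=random):
    """Return False if A*B != C is detected; True means 'probably equal'.

    A wrong product survives one trial with probability at most 1/2,
    hence all trials with probability at most 2**-trials.
    """
    n = len(C[0])
    for _ in range(trials):
        x = [rng.randint(0, 1) for _ in range(n)]  # random 0/1 vector
        if mat_vec(A, mat_vec(B, x)) != mat_vec(C, x):
            return False  # a witness: A*B definitely differs from C
    return True


if __name__ == "__main__":
    A = [[1, 2], [3, 4]]
    B = [[5, 6], [7, 8]]
    print(freivalds_check(A, B, [[19, 22], [43, 50]]))
```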

Page 16

FINGERPRINTING
APPLICATION

CRYPTOGRAPHY

Page 17

Aim: To determine whether File A is identical to File B by communicating the fewest bits.

[Figure: File A and File B]

Page 18

How many primes are less than $n$?

n          Primes less than n
100        25
1000       168
10000      1229
100000     9592
1000000    78498
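The counts in this table can be reproduced with a sieve of Eratosthenes. The sketch below is only a check of the numbers; the function name `count_primes_below` is an illustrative choice.

```python
def count_primes_below(limit):
    """Count the primes strictly less than limit with a sieve of Eratosthenes."""
    if limit < 2:
        return 0
    is_prime = bytearray([1]) * limit
    is_prime[0] = is_prime[1] = 0
    for i in range(2, int(limit ** 0.5) + 1):
        if is_prime[i]:
            # cross out i*i, i*i + i, ... below limit
            is_prime[i * i::i] = bytearray(len(range(i * i, limit, i)))
    return sum(is_prime)


if __name__ == "__main__":
    for limit in (100, 1000, 10000, 100000, 1000000):
        print(limit, count_primes_below(limit))
```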

Page 19

Key idea from primes

• A number $d$ with $1 \le d < 2^n$ has fewer than $n$ prime factors.
• There are $\Theta(n^2)$ prime numbers in $[1, 4n^2 \log n]$.

Page 20

Visualize a file as a binary number

File A = $a_1 a_2 \ldots a_n$, File B = $b_1 b_2 \ldots b_n$

$a = \sum_{i=1}^{n} a_i 2^{n-i}$, $b = \sum_{i=1}^{n} b_i 2^{n-i}$

Overview of Protocol:
• Let $p$ be a prime number selected uniformly at random from $[1, 4n^2 \log n]$.
• If $a \bmod p = b \bmod p$ then conclude A = B, else conclude A ≠ B.
• Error occurs if "$p$ is one of the prime factors of $(a - b)$".
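The protocol can be sketched as below. Several details are assumptions made for illustration: files are given as equal-length strings of '0'/'1' characters, the prime is drawn from [2, 4n² log₂ n] by rejection sampling with trial division, and both ends are simulated inside one function, whereas in the real setting only the prime and one residue would be communicated. All function names are hypothetical.

```python
import math
import random


def is_prime(q):
    """Trial-division primality test; adequate for the small ranges used here."""
    if q < 2:
        return False
    i = 2
    while i * i <= q:
        if q % i == 0:
            return False
        i += 1
    return True


def random_prime(upper, rng=random):
    """Draw a prime uniformly at random from [2, upper] by rejection sampling."""
    while True:
        q = rng.randint(2, upper)
        if is_prime(q):
            return q


def fingerprint(bits, p):
    """Value of the bit string, read as a binary number, modulo p."""
    value = 0
    for bit in bits:
        value = (2 * value + int(bit)) % p
    return value


def probably_equal(bits_a, bits_b, rng=random):
    """Compare two n-bit files through one random fingerprint each."""
    n = max(len(bits_a), 2)
    upper = 4 * n * n * max(1, math.ceil(math.log2(n)))  # assumed range for p
    p = random_prime(upper, rng)
    return fingerprint(bits_a, p) == fingerprint(bits_b, p)
```

Identical files are never reported as different; an error can only occur for distinct files, when the chosen prime happens to divide the difference of the two numbers.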

Page 21

FINGERPRINTING
APPLICATION 3

PATTERN MATCHING

Page 22

Text $T[1..n]$: 100101100110001101111010101110101010111010000101
Pattern $P[1..m]$: 011110101011101

Pattern $P$ is said to appear in Text $T$ at location $k$ if $T[k+i-1] = P[i]$ for all $1 \le i \le m$. In the example above, $P$ appears in $T$ at location 17.

Problem: Given a Text $T$ and a pattern $P$, does $P$ appear anywhere in $T$?

Deterministic Algorithm
• Trivial algorithm: O($mn$) time
• Knuth-Morris-Pratt algorithm: O($m+n$) time
Randomized Monte Carlo Algorithm
• O($m+n$) time, and error probability < $1/n$
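The trivial deterministic algorithm can be sketched as follows, using the slide's example strings and locations counted from 1. The function names are illustrative. Each of the candidate locations is checked with a direct comparison against the pattern.

```python
def appears_at(text, pattern, k):
    """True if pattern appears in text at location k (locations counted from 1)."""
    return text[k - 1:k - 1 + len(pattern)] == pattern


def trivial_match(text, pattern):
    """Check every location by direct comparison."""
    n, m = len(text), len(pattern)
    return [k for k in range(1, n - m + 2) if appears_at(text, pattern, k)]


TEXT = "100101100110001101111010101110101010111010000101"
PATTERN = "011110101011101"

if __name__ == "__main__":
    print(trivial_match(TEXT, PATTERN))  # [17]
```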

Page 23

Motivation
• Simplicity, real time implementation, streaming environment
• Extension to 2-dimensions
• Converting Monte Carlo to Las Vegas algorithm

[Figure: 2-dimensional pattern matching, an m⨯m pattern in an n⨯n text, illustrated with the binary matrix rows 1110, 1101, 1011, 1111; O($n^2$) time algorithm.]

Page 24

RANDOMIZED ALGORITHM FOR FINGERPRINTING

Page 25

Checking if $P$ appears in Text $T$ at location $k$

Text $T$: 100101100110001101111010101010101010111010000101
Pattern $P$: 0111101110110101

Observation: An O($m$) time algorithm is obvious.

Question: How to do this task in O(1) time?
Answer: Have a fingerprint.

Question: What properties should the fingerprint possess?
• Small size
• Efficiently computable

Page 26

Checking if $P$ appears in Text $T$ at location $k$

Text $T$: 100101100110001101111010101010101010111010000101
Pattern $P$: 0111101110110101

$x = \sum_{i=1}^{m} P[i]\, 2^{m-i}$ (the pattern read as a binary number)
$y_k = \sum_{i=1}^{m} T[k+i-1]\, 2^{m-i}$ (the length-$m$ window of the text at location $k$)

Let $p$ be a prime number selected uniformly at random from $[2, t]$.
$F(P) = x \bmod p$.
$F_k(T) = y_k \bmod p$.

If $F(P) = F_k(T)$ then conclude that $P$ appears at $k$.
Error occurs if "$p$ is one of the prime factors of $(x - y_k)$".
Error probability at location $k$ ≤ $m/\pi(t)$, where $\pi(t)$ is the number of primes in $[2, t]$.
Fingerprint has size = O($\log t$) bits.

Small size, but not efficiently computable.

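Page 27

Checking if $P$ appears in Text $T$ at location $k$

Text $T$: 100101100110001101111010101010101010111010000101
Pattern $P$: 0111101110110101

$x = \sum_{i=1}^{m} P[i]\, 2^{m-i}$, $y_k = \sum_{i=1}^{m} T[k+i-1]\, 2^{m-i}$, and $F_k(T) = y_k \bmod p$ for a prime $p$.

Question: Any relation between $y_k$ and $y_{k+1}$?
$y_{k+1} = 2\,(y_k - 2^{m-1}\, T[k]) + T[k+m]$

Question: Any relation between $F_k(T)$ and $F_{k+1}(T)$?
$F_{k+1}(T) = y_{k+1} \bmod p$
$= (2\,(y_k - 2^{m-1}\, T[k]) + T[k+m]) \bmod p$
$= (2\, y_k - 2^{m}\, T[k] + T[k+m]) \bmod p$
$= (2\, F_k(T) - (2^{m} \bmod p)\, T[k] + T[k+m]) \bmod p$

Every quantity in the last line is less than $2p$ in absolute value, so once $2^{m} \bmod p$ is precomputed the update needs only O(1) operations on small numbers.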
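The question on this slide is how the fingerprint of the window at location k+1 relates to the one at location k. A minimal sketch of the sliding update follows, assuming the window is read as an m-bit binary number with its first bit most significant, locations counted from 1, and a fixed prime p. The names `window_value` and `next_fingerprint` are made up for this example.

```python
def window_value(text, k, m):
    """Binary value of the length-m window of text starting at location k (from 1)."""
    return int(text[k - 1:k - 1 + m], 2)


def next_fingerprint(f_k, outgoing_bit, incoming_bit, high, p):
    """Fingerprint of window k+1 from the fingerprint f_k of window k.

    high must equal 2**m mod p. Dropping the leading bit and appending the
    new one gives y_{k+1} = 2*y_k - 2**m * outgoing_bit + incoming_bit.
    """
    return (2 * f_k - high * outgoing_bit + incoming_bit) % p


if __name__ == "__main__":
    text = "100101100110001101111010101010101010111010000101"
    m, p = 16, 101
    high = pow(2, m, p)
    f = window_value(text, 1, m) % p
    for k in range(1, len(text) - m + 1):
        f = next_fingerprint(f, int(text[k - 1]), int(text[k - 1 + m]), high, p)
        assert f == window_value(text, k + 1, m) % p
    print("sliding update agrees with direct computation")
```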

Page 28

Fingerprint function: how good is it?

Text $T$: 100101100110001101111010101010101010111010000101
Pattern $P$: 0111101110110101

$F(P) = x \bmod p$, $F_k(T) = y_k \bmod p$, where $x$ and $y_k$ are the binary values of $P$ and of the length-$m$ window of $T$ at location $k$, and $p$ is a random prime from $[2, t]$.

Lemma: The fingerprint function
• Occupies at most $\lceil \log_2 t \rceil$ bits.
• Computing $F_{k+1}(T)$ from $F_k(T)$ takes O($\log t$) bit operations.
• Error probability for any particular location is at most $m/\pi(t)$, where $\pi(t)$ is the number of primes in $[2, t]$.

Question: What is the error probability of the algorithm?

Page 29

Bounding the error probability of the algorithm

$E$: event that the algorithm fails
$E_k$: event that the fingerprint shows a false match at a fixed location $k$

Can you see some relation between $E$ and the $E_k$'s? $E = \bigcup_k E_k$, so

P($E$) ≤ $\sum_k$ P($E_k$)
≤ $n \cdot$ P($E_k$), since the bound on P($E_k$) is the same for each $k$
< $n \cdot m/\pi(t)$ = $mn/\pi(t)$, where $\pi(t)$ is the number of primes in $[2, t]$.

Question: How large should $t$ be to ensure P($E$) < $1/n$?
Answer: $t = \Theta(m n^2 \log n)$ suffices, because $\pi(t) \approx t/\ln t$.
Fingerprint size: O($\log n$).

Page 30

Final result

Theorem: There is a Monte Carlo randomized algorithm for detecting any match of P[1..m] in T[1..n] that:
• Fails with error probability < $1/n$.
• Performs O($m+n$) operations involving O($\log n$) bit numbers.

It takes O(1) time on the word-RAM model of computation for an operation involving O($\log n$) bit numbers. So the time complexity of the algorithm is O($m+n$).

Homework: It is possible to convert the above algorithm to Las Vegas. Spend some time thinking over it (we shall discuss it in some class).
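Putting the pieces together, a Monte Carlo matcher might look like the sketch below. It rests on assumptions that are not taken from the slides: text and pattern are '0'/'1' strings, locations are counted from 1, the prime is drawn from [2, t] with t = m·n²·⌈log₂ n⌉ as one plausible choice of range, and primes are found by rejection sampling with trial division. All names are hypothetical.

```python
import math
import random


def is_prime(q):
    """Trial-division primality test; adequate for the small ranges used here."""
    if q < 2:
        return False
    i = 2
    while i * i <= q:
        if q % i == 0:
            return False
        i += 1
    return True


def random_prime(upper, rng=random):
    """Draw a prime uniformly at random from [2, upper] by rejection sampling."""
    while True:
        q = rng.randint(2, upper)
        if is_prime(q):
            return q


def monte_carlo_match(text, pattern, rng=random):
    """Return the locations (from 1) where the fingerprints of pattern and window agree.

    True occurrences are always reported; a reported location may, with small
    probability, be a false match.
    """
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    t = max(4, m * n * n * max(1, math.ceil(math.log2(n))))  # assumed range for p
    p = random_prime(t, rng)
    high = pow(2, m, p)                # 2^m mod p, used when sliding the window
    f_pattern = int(pattern, 2) % p
    f_window = int(text[:m], 2) % p    # fingerprint of the window at location 1
    matches = []
    for k in range(1, n - m + 2):
        if f_window == f_pattern:
            matches.append(k)
        if k + m <= n:                 # slide: drop T[k], append T[k+m]
            f_window = (2 * f_window - high * int(text[k - 1]) + int(text[k + m - 1])) % p
    return matches


if __name__ == "__main__":
    text = "100101100110001101111010101110101010111010000101"
    print(monte_carlo_match(text, "011110101011101"))
```

Verifying each reported location by direct comparison removes false matches at the cost of extra work, which is one natural starting point for thinking about the Las Vegas conversion.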

Page 31

Probability tool (union theorem)

Suppose there is an event $E$ defined over a probability space ($\Omega$, P). Aim: to get an upper bound on P($E$).

If it is difficult to calculate P($E$), try to express $E$ as a union of events $E_1, E_2, \ldots$ (usually similar/same) such that
• it is easy to calculate P($E_i$).
Then you may bound P($E$) using the following inequality:

P($E$) = P($\bigcup_i E_i$) ≤ $\sum_i$ P($E_i$)

Page 32

APPLICATIONS OF THE UNION THEOREM

Page 33

Balls into Bins

Ball-bin Experiment: There are $m$ balls and $n$ bins. Each ball selects its bin uniformly at random, independently of the other balls, and falls into it.
Used in:
• Hashing
• Load balancing in distributed environment

[Figure: balls 1 2 3 4 5 … m-1 m falling into bins 1 2 3 … i … n]

Page 34

Balls into Bins

Ball-bin Experiment: There are $m$ balls and $n$ bins. Each ball selects its bin uniformly at random, independently of the other balls, and falls into it.
Theorem: For the case when $m = n$, prove that with very high probability, every bin has O(log $n$) balls.

(The proof requires the Union theorem and elementary probability. We shall discuss it in the next class. Spend some time to prove it on your own.)

[Figure: balls 1 2 3 4 5 … m-1 m falling into bins 1 2 3 … i … n]
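The experiment is easy to simulate. The sketch below is an empirical illustration, not a proof; it assumes the number of balls equals the number of bins, and the function name `max_load` is made up for this example. It throws the balls and reports the fullest bin next to log₂ of the number of bins for comparison.

```python
import math
import random
from collections import Counter


def max_load(num_balls, num_bins, rng=random):
    """Throw num_balls balls into num_bins bins uniformly and independently;
    return the number of balls in the fullest bin."""
    loads = Counter(rng.randrange(num_bins) for _ in range(num_balls))
    return max(loads.values())


if __name__ == "__main__":
    n = 1024
    print(max_load(n, n, random.Random(0)), math.log2(n))
```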

Page 35

Randomized Quick sort

Theorem: The probability that Randomized Quick sort performs more than $c\, n \log n$ comparisons, for a suitable constant $c$, is less than an inverse polynomial in $n$.

Tools needed:
1. Union theorem
2. A bound on the probability that we get fewer than a given number of HEADS during a sequence of tosses of a fair coin.

(The proof requires the Union theorem and elementary probability. We shall discuss it in the next class. Spend some time to prove it on your own.)