lecture 12-cs648-2013 randomized algorithms


Upload: anshul-yadav

Post on 10-Jun-2015


Page 1: Lecture 12-cs648-2013 Randomized Algorithms

Randomized Algorithms CS648

Lecture 12: Hashing - II

Page 2: Lecture 12-cs648-2013 Randomized Algorithms

RECAP OF LAST LECTURE

Page 3: Lecture 12-cs648-2013 Randomized Algorithms

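Problem Definition
• $U = \{0, 1, \ldots, m-1\}$, called the universe
• $S \subseteq U$ and $s = |S|$
• Examples: [values missing from the transcript]

Aim: Given a set $S \subseteq U$, build a data structure storing $S$ such that we can answer in O(1) time:

"Does $i \in S$?" for any given $i \in U$.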

Page 4: Lecture 12-cs648-2013 Randomized Algorithms

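Hashing
• Hash table $T$: an array of size $n$.
• Hash function $h : U \rightarrow \{0, 1, \ldots, n-1\}$.

Answering a query "Does $i \in S$?":
1. $k \leftarrow h(i)$;
2. Search the list stored at $T[k]$.

Properties of $h$:
• $h(i)$ is computable in O(1) time.
• Space required by $h$: O(1). (How many bits are needed to encode $h$?)

[Figure: hash table $T$ with cells $0, 1, \ldots, n-1$; each cell holds a list of elements of $S$.]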

Page 5: Lecture 12-cs648-2013 Randomized Algorithms

Collision
Definition: Two elements $i, j \in U$ are said to collide under hash function $h$ if $h(i) = h(j)$.

Worst-case time complexity of searching for an item $i$: the number of elements in $S$ colliding with $i$.

Page 6: Lecture 12-cs648-2013 Randomized Algorithms

Universal Hash Family

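Definition: A collection $H$ of hash functions is said to be universal if there exists a constant $c$ such that for any $i, j \in U$ with $i \neq j$,
$$P_{h \in_r H}[h(i) = h(j)] \le \frac{c}{n},$$
where $h \in_r H$ means that $h$ is picked randomly uniformly from $H$.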

This definition appears strange in the beginning! But we shall soon see that there is a very natural way to arrive at this definition.

Page 7: Lecture 12-cs648-2013 Randomized Algorithms

Perfect hashing using O($s^2$) space

Let $H$ be a universal hash family. Let $X$ be the number of collisions for $S$ when $h$ is picked randomly uniformly from $H$.
Question: What is $\mathbf{E}[X]$?

Page 8: Lecture 12-cs648-2013 Randomized Algorithms

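Perfect hashing using O($s^2$) space

Let $H$ be a universal hash family. Let $X$ be the number of collisions for $S$ when $h$ is picked randomly uniformly from $H$.
Lemma 1: $\mathbf{E}[X] \le \frac{c}{n} \cdot \frac{s(s-1)}{2}$.
Lemma 2: For $n = c s^2$, there will be no collision with probability at least $1/2$.

Algorithm 1: Perfect hashing for $S$
Fix $n \leftarrow c s^2$;
Repeat
1. Pick $h$ randomly uniformly from $H$;
2. $t \leftarrow$ the number of collisions for $S$ under $h$.
Until $t = 0$.
Build the hash table.

Theorem: A perfect hash function can be computed for $S$ in expected O($s^2$) time.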
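The repeat-until loop of Algorithm 1 can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: it uses the family $h_x(i) = (ix \bmod p) \bmod n$ from later in this lecture with $c = 2$, the prime `P` is an assumed value larger than every key, and the function names are hypothetical.

```python
import random

P = 2_147_483_647  # assumed prime (2^31 - 1), must exceed every key
C = 2              # universality constant assumed for this family


def pick_hash(n, rng):
    """Pick h_x(i) = ((i * x) mod P) mod n with x uniform in {1, ..., P-1}."""
    x = rng.randrange(1, P)
    return lambda i: ((i * x) % P) % n


def count_collisions(keys, h):
    """Number of unordered pairs of keys that land in the same cell."""
    counts = {}
    for k in keys:
        cell = h(k)
        counts[cell] = counts.get(cell, 0) + 1
    return sum(c * (c - 1) // 2 for c in counts.values())


def build_perfect_table(keys, rng=None):
    """Retry random hash functions until the distinct keys do not collide."""
    rng = rng or random.Random()
    s = len(keys)
    n = max(1, C * s * s)            # table of size c * s^2
    while True:
        h = pick_hash(n, rng)
        if count_collisions(keys, h) == 0:
            break
    table = [None] * n
    for k in keys:
        table[h(k)] = k
    return h, table


def contains(h, table, i):
    """Worst-case O(1) membership query: one hash, one comparison."""
    return table[h(i)] == i


keys = [3, 17, 42, 1000, 99991]
h, table = build_perfect_table(keys, random.Random(0))
print(all(contains(h, table, k) for k in keys))
```

Each round succeeds with probability at least 1/2 by Lemma 2, so the expected number of rounds is at most 2.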

Page 9: Lecture 12-cs648-2013 Randomized Algorithms

HASHING WITH OPTIMAL SPACE AND WORST CASE O(1) SEARCH TIME

Page 10: Lecture 12-cs648-2013 Randomized Algorithms

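Optimal space hashing with worst case O(1) search time

Let $H$ be a universal hash family. $X$: number of collisions for $S$ when $h$ is picked randomly uniformly from $H$.
Lemma 1: $\mathbf{E}[X] \le \frac{c}{n} \cdot \frac{s(s-1)}{2}$.

Question: What is $\mathbf{E}[X]$ when $n = cs$?

Answer: $\mathbf{E}[X] \le \frac{s-1}{2}$.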

Page 11: Lecture 12-cs648-2013 Randomized Algorithms

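Optimal space hashing with worst case O(1) search time

Let $H$ be a universal hash family. $X$: number of collisions for $S$ when $h$ is picked randomly uniformly from $H$.
Lemma 1: $\mathbf{E}[X] \le \frac{s-1}{2}$ when $n = cs$.

Algorithm:
Fix $n \leftarrow cs$;
Repeat
1. Pick $h$ randomly uniformly from $H$;
2. $X \leftarrow$ no. of collisions for $S$ under $h$;
Until $X < s$;
Build the hash table $T$; // primary hash table

For each $0 \le i < n$:
If the size of list $T[i]$ > 1
1. Build a perfect hash table for list $T[i]$;
2. Make $T[i]$ point to this hash table;

[Figure: primary hash table $T$ whose cells point to secondary perfect hash tables.]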

Page 15: Lecture 12-cs648-2013 Randomized Algorithms

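Let $H$ be a universal hash family. $X$: number of collisions for $S$ when $h$ is picked randomly uniformly from $H$.
Lemma 1: $\mathbf{E}[X] \le \frac{s-1}{2}$ when $n = cs$.

Let $s_i$: number of elements in $T[i]$.
Extra space required: $\sum_i c\, s_i^2 = c \sum_i s_i^2 = c \sum_i s_i(s_i - 1) + c \sum_i s_i$

[Figure: primary table $T$ with cells 0, 1, 2, ..., each pointing to a secondary table with cells 0, 1, 2, ...]

Is there any relation between $X$ and the $s_i$'s? Each cell holding $s_i$ elements contributes $s_i(s_i-1)/2$ colliding pairs, so $X = \sum_i s_i(s_i-1)/2$. Since $\sum_i s_i = s$, the extra space is $2cX + cs$, which is O($s$) whenever $X$ = O($s$).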

Page 16: Lecture 12-cs648-2013 Randomized Algorithms

Theorem: A given set $S$ of $s$ elements can be preprocessed in expected O($s$) time to build a data structure (2-level hash table) of O($s$) size such that any search query can be answered in worst case O(1) time.
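A compact Python sketch of such a 2-level hash table is given below. It is not the lecture's code: it assumes the family $h_x(i) = (ix \bmod p) \bmod n$ with $c = 2$, an assumed prime `P` larger than every key, and the stopping rule "fewer than $s$ collisions" for the primary table. For simplicity it builds a secondary table for every non-empty list, not only for lists of size greater than 1. All names are hypothetical.

```python
import random

P = 2_147_483_647  # assumed prime (2^31 - 1), must exceed every key
C = 2              # universality constant assumed for this family


def pick_hash(n, rng):
    """h_x(i) = ((i * x) mod P) mod n with x uniform in {1, ..., P-1}."""
    x = rng.randrange(1, P)
    return lambda i: ((i * x) % P) % n


def bucketize(keys, h, n):
    """Distribute keys into n lists according to h."""
    buckets = [[] for _ in range(n)]
    for k in keys:
        buckets[h(k)].append(k)
    return buckets


def collisions(buckets):
    """Number of colliding pairs: sum over cells of s_i * (s_i - 1) / 2."""
    return sum(len(b) * (len(b) - 1) // 2 for b in buckets)


def build_two_level(keys, rng=None):
    rng = rng or random.Random()
    s = len(keys)
    n = max(1, C * s)
    # Primary table: retry until the number of collisions is below s.
    while True:
        h = pick_hash(n, rng)
        buckets = bucketize(keys, h, n)
        if s == 0 or collisions(buckets) < s:
            break
    # Secondary tables: a collision-free table of size C * s_i^2 per list.
    secondary = []
    for b in buckets:
        if not b:
            secondary.append(None)
            continue
        size = C * len(b) ** 2
        while True:
            g = pick_hash(size, rng)
            sub = bucketize(b, g, size)
            if collisions(sub) == 0:
                break
        cells = [cell[0] if cell else None for cell in sub]
        secondary.append((g, cells))
    return h, secondary


def lookup(structure, i):
    """Two hash evaluations and one comparison: worst-case O(1)."""
    h, secondary = structure
    entry = secondary[h(i)]
    if entry is None:
        return False
    g, cells = entry
    return cells[g(i)] == i


keys = list(range(0, 1000, 7))
structure = build_two_level(keys, random.Random(1))
total_cells = sum(len(e[1]) for e in structure[1] if e is not None)
print(all(lookup(structure, k) for k in keys), lookup(structure, 1))
```

Because the primary table has fewer than $s$ collisions, the secondary tables together use $c \sum_i s_i^2 = c(2X + s)$ cells, which stays below $3cs$.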

Page 17: Lecture 12-cs648-2013 Randomized Algorithms

WHY SUCH A DEFINITION FOR UNIVERSAL HASH FAMILY ?

Page 18: Lecture 12-cs648-2013 Randomized Algorithms

Why does hashing work so well in practice?

A simple hash function: $h(i) = i \bmod n$.
• $h$ works so well in practice because the set $S$ is usually a uniformly random subset of $U$. As a result, the elements of $S$ tend to spread evenly over the cells of $T$.
• It is easy to fool this hash function so that it takes O($s$) search time.
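To see how the simple hash function $h(i) = i \bmod n$ can be fooled, the small Python sketch below picks a set whose elements are all multiples of $n$. The table size and the set are made-up values for illustration.

```python
def simple_hash(i, n):
    """The simple hash function h(i) = i mod n."""
    return i % n


n = 8
adversarial = [k * n for k in range(10)]        # 0, 8, 16, ..., 72
cells = {simple_hash(i, n) for i in adversarial}
print(cells)  # {0}: every element lands in cell 0
```

All $s$ elements share one list, so a search has to scan a list of length $s$.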

This makes us think: "Can we achieve expected O(1) search time for any given set $S$?"

We faced a similar question with Quick Sort, and the answer there was Randomized Quick Sort.

Page 19: Lecture 12-cs648-2013 Randomized Algorithms

Universal Hash Family

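A simple hash function: $h(i) = i \bmod n$.

Definition: A collection $H$ of hash functions is said to be universal if there exists a constant $c$ such that for any $i, j \in U$ with $i \neq j$,
$$P_{h \in_r H}[h(i) = h(j)] \le \frac{c}{n},$$
where $h \in_r H$ means that $h$ is picked randomly uniformly from $H$.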

Page 20: Lecture 12-cs648-2013 Randomized Algorithms

A SIMPLE AND COMPACT UNIVERSAL HASH FAMILY

Page 21: Lecture 12-cs648-2013 Randomized Algorithms

The starting point

The simple hash function: $h(i) = i \bmod n$.

Problem: Two elements $i, j$ in $U$ are bound to collide if $n$ divides $|i - j|$.

Is there some operation which, when applied to any $i, j$, distributes $|i - j|$ randomly uniformly over the whole range of possible values?

Page 22: Lecture 12-cs648-2013 Randomized Algorithms

mod operation
$a$: a non-negative integer
$n$: a positive integer
$a \bmod n \in \{0, 1, \ldots, n-1\}$.

Question: How is $|a \bmod n - b \bmod n|$ related to $|a - b| \bmod n$?
Consider some examples:
• $|55 \bmod 31 - 43 \bmod 31| = 12$ and $|55 - 43| \bmod 31 = 12$
• $|91 \bmod 31 - 102 \bmod 31| = 20$ and $|91 - 102| \bmod 31 = 11$

Answer: Let $\Delta = |a - b| \bmod n$. Then $|a \bmod n - b \bmod n| \in \{\Delta, n - \Delta\}$.
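The claim that $|a \bmod n - b \bmod n|$ always equals either $\Delta$ or $n - \Delta$, where $\Delta = |a - b| \bmod n$, can be checked numerically. The Python sketch below reproduces the two examples with modulus 31; the function names are hypothetical.

```python
def mod_gap(a, b, n):
    """|a mod n - b mod n|."""
    return abs(a % n - b % n)


def candidates(a, b, n):
    """The two possible values {delta, n - delta}, delta = |a - b| mod n."""
    delta = abs(a - b) % n
    return {delta, n - delta}


for a, b in [(55, 43), (91, 102)]:
    print(mod_gap(a, b, 31), abs(a - b) % 31)
# prints "12 12" and then "20 11"
```

In the second example the gap is 20, which is $31 - 11$, so the second alternative applies.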

Page 24: Lecture 12-cs648-2013 Randomized Algorithms

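mod operation
$p$: a prime number
$[p-1]$: $\{1, 2, \ldots, p-1\}$
Consider any $x \in [p-1]$.
Question: What can we say about the set $\{ix \bmod p \mid i \in [p-1]\}$?
Example: $p = 7$, $x = 3$ (and $x = 4$).

i:        1 2 3 4 5 6
3i mod 7: 3 6 2 5 1 4
4i mod 7: 4 1 5 2 6 3

Fact: $\{ix \bmod p \mid i \in [p-1]\} = [p-1]$ for all $x \in [p-1]$.
Proof: Suppose $ix \bmod p = jx \bmod p$ for some $i \neq j$ in $[p-1]$. Then $p$ divides $(ix - jx)$ $\Rightarrow$ $p$ divides $(i - j)x$ $\Rightarrow$ $p$ divides $(i - j)$ or $p$ divides $x$. Not possible.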
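The fact that multiplying by $x$ modulo a prime $p$ permutes $\{1, \ldots, p-1\}$ can be checked directly. The Python sketch below reproduces the two rows for $p = 7$ and shows that the property can fail for a composite modulus; the function name is hypothetical.

```python
def row(x, p):
    """The values i * x mod p for i = 1, ..., p-1."""
    return [(i * x) % p for i in range(1, p)]


print(row(3, 7))  # [3, 6, 2, 5, 1, 4]
print(row(4, 7))  # [4, 1, 5, 2, 6, 3]
print(row(2, 6))  # [2, 4, 0, 2, 4]: not a permutation, 6 is not prime
```

The last line illustrates why the proof needs $p$ to be prime: 6 divides $3 \cdot 2$ without dividing either factor.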

Page 25: Lecture 12-cs648-2013 Randomized Algorithms

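mod operation
$p$: a prime number
$[p-1]$: $\{1, 2, \ldots, p-1\}$
Consider any $x \in [p-1]$ and the set $\{ix \bmod p \mid i \in [p-1]\}$.
Fact: $\{ix \bmod p \mid i \in [p-1]\} = [p-1]$ for all $x \in [p-1]$.

Question: If $x$ is picked randomly uniformly from $[p-1]$, then what can we say about $ix \bmod p$?
Answer: $ix \bmod p$ is distributed randomly uniformly over $[p-1]$.

Can you now see that the above answer plays the key role in formulating the hash function?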

Page 26: Lecture 12-cs648-2013 Randomized Algorithms

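Good fact: An element $i$ is mapped to a random element in $\{1, \ldots, p-1\}$.

Slightly bad fact: Once element $i$ is mapped to a location, the mapping of $j$ is no longer random.

So it is not clear whether $|ix \bmod p - jx \bmod p|$ is distributed uniformly randomly over $\{0, \ldots, p-1\}$.
So let us look at $(ix \bmod p - jx \bmod p)$ a bit more closely.

[Figure: the locations $1, 2, \ldots$ with element $i$ mapped to $ix \bmod p$, alongside element $i + \Delta$.]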

Page 27: Lecture 12-cs648-2013 Randomized Algorithms

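Probability of collision between $i$ and $j$

Let $h_x(i) = (ix \bmod p) \bmod n$.

$i$ and $j$ will collide under $h_x$ if $|ix \bmod p - jx \bmod p|$ is divisible by $n$.

Question: What is the relation between $|ix \bmod p - jx \bmod p|$ and $|i - j|x \bmod p$?

Answer: $|ix \bmod p - jx \bmod p|$ is either $|i - j|x \bmod p$ or $p - (|i - j|x \bmod p)$.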

Page 28: Lecture 12-cs648-2013 Randomized Algorithms

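Probability of collision between $i$ and $j$

Let $h_x(i) = (ix \bmod p) \bmod n$.
Lemma: If $i$ and $j$ collide under $h_x$, then either $|i - j|x \bmod p$ is divisible by $n$ or $p - (|i - j|x \bmod p)$ is divisible by $n$.

$\{|i - j|x \bmod p \mid x \in [p-1]\} = \{1, \ldots, p-1\}$

Let $x$ be picked randomly uniformly from $[p-1]$.
Probability of collision between $i$ and $j$
$\le$ P($|i - j|x \bmod p$ is divisible by $n$, or $p - (|i - j|x \bmod p)$ is divisible by $n$)
$\le$ 2 P($|i - j|x \bmod p$ is divisible by $n$)
$= 2 \lfloor (p-1)/n \rfloor / (p-1) \le 2/n$.

Note: Students must realize that the Lemma gives a necessary condition and not a sufficient condition for collision. To get an idea, study the example given at the end of this lecture.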

Page 29: Lecture 12-cs648-2013 Randomized Algorithms

Theorem: Let $h_x(i) = (ix \bmod p) \bmod n$, then $H = \{h_x \mid x \in [p-1]\}$ is universal.
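For small parameters the collision bound can be checked exhaustively. The Python sketch below assumes the family $h_x(i) = (ix \bmod p) \bmod n$ with $x$ ranging over $\{1, \ldots, p-1\}$ and the bound $2/n$; the values $p = 31$ and $n = 5$ are arbitrary choices for illustration, and the function name is hypothetical.

```python
from itertools import combinations


def collision_fraction(i, j, p, n):
    """Fraction of x in {1, ..., p-1} for which h_x(i) = h_x(j)."""
    hits = sum(
        1 for x in range(1, p)
        if ((i * x) % p) % n == ((j * x) % p) % n
    )
    return hits / (p - 1)


p, n = 31, 5
worst = max(
    collision_fraction(i, j, p, n)
    for i, j in combinations(range(p), 2)
)
print(worst <= 2 / n)  # True
```

This is only a sanity check on one pair of parameters, not a substitute for the proof.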

Page 30: Lecture 12-cs648-2013 Randomized Algorithms

Example

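$p = 7$. [The values of $n$, $i$, $j$ and several expressions on this slide are missing from the transcript; the recoverable text follows.]
Question: How many collisions are there between $i$ and $j$?
Answer: two (for $x = 3, 4$).

For a second pair of elements:
Answer: No collisions! (although the necessary condition for a collision holds for some $x$ here.)

Table storing $ix \bmod 7$ (rows: $x$; columns: $i$)

x \ i   1 2 3 4 5 6
1       1 2 3 4 5 6
2       2 4 6 1 3 5
3       3 6 2 5 1 4
4       4 1 5 2 6 3
5       5 3 1 6 4 2
6       6 5 4 3 2 1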
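The table for $p = 7$ and the gap between the necessary condition and actual collisions can be reproduced with a short Python sketch. The table size $n = 3$ and the pairs $(1, 2)$ and $(2, 3)$ are assumptions chosen for illustration; they are not recovered from the slide, and the function names are hypothetical.

```python
p = 7
n = 3  # assumed table size, not taken from the slide

# Row x, column i holds i * x mod p.
table = [[(i * x) % p for i in range(1, p)] for x in range(1, p)]
for x, values in enumerate(table, start=1):
    print(x, values)


def colliding_x(i, j):
    """All x for which h_x(i) = h_x(j), with h_x(i) = (ix mod p) mod n."""
    return [x for x in range(1, p)
            if ((i * x) % p) % n == ((j * x) % p) % n]


def necessary_x(i, j):
    """All x satisfying the Lemma's necessary condition for a collision."""
    out = []
    for x in range(1, p):
        d = (abs(i - j) * x) % p
        if d % n == 0 or (p - d) % n == 0:
            out.append(x)
    return out


print(colliding_x(1, 2))  # [3, 4]
print(colliding_x(2, 3))  # []
print(necessary_x(2, 3))  # [1, 3, 4, 6]
```

Under these assumed values the pair $(2, 3)$ satisfies the necessary condition for four values of $x$ yet never collides, which is the sense in which the condition is not sufficient.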

Page 31: Lecture 12-cs648-2013 Randomized Algorithms

Homework:

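Let $h_{x,y}(i) = ((ix + y) \bmod p) \bmod n$. Then prove that $H = \{h_{x,y} \mid x \in [p-1],\ y \in \{0, \ldots, p-1\}\}$ is universal. In particular, show that for any $i \neq j$,
$$P_{x,y}[h_{x,y}(i) = h_{x,y}(j)] \le \frac{1}{n}.$$

Hence it is slightly better than the hash family discussed just now.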
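Before attempting the proof, the claimed bound can be checked numerically for small parameters. The Python sketch below assumes the family $h_{x,y}(i) = ((ix + y) \bmod p) \bmod n$ with $x \in \{1, \ldots, p-1\}$ and $y \in \{0, \ldots, p-1\}$; the values $p = 13$ and $n = 4$ are arbitrary, and the function names are hypothetical. It is an experiment, not a proof.

```python
from itertools import combinations


def h(i, x, y, p, n):
    """h_{x,y}(i) = ((i * x + y) mod p) mod n."""
    return ((i * x + y) % p) % n


def worst_collision_fraction(p, n):
    """Largest collision probability over all pairs i != j in {0, ..., p-1}."""
    total = (p - 1) * p          # number of (x, y) choices
    worst = 0.0
    for i, j in combinations(range(p), 2):
        hits = sum(
            1
            for x in range(1, p)
            for y in range(p)
            if h(i, x, y, p, n) == h(j, x, y, p, n)
        )
        worst = max(worst, hits / total)
    return worst


p, n = 13, 4
print(worst_collision_fraction(p, n) <= 1 / n)  # True
```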