CS 221 Guest Lecture: Cuckoo Hashing
Shannon Larson
March 11, 2011
Learning Goals
• Describe the cuckoo hashing principle
• Analyze the space and time complexity of cuckoo hashing
• Apply the insert and lookup algorithms in a cuckoo hash table
• Construct the graph for a cuckoo table
Remember Graphs?
• A set of nodes
• A set of edges
• Here: [Figure: example graph]
Graph Cycles
• A graph cycle is a path of edges such that the first and last vertices are the same
$v_1, v_2, v_5, v_3, v_4, v_1$
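The cycle definition can be checked mechanically. Below is a minimal Python sketch; the edge set is hypothetical, chosen only so that the vertex sequence from the slide forms a cycle (the original figure is not available), and edges are treated as undirected.

```python
def is_cycle(path, edges):
    """Return True if `path` starts and ends at the same vertex and every
    consecutive pair of vertices is joined by an edge."""
    if len(path) < 2 or path[0] != path[-1]:
        return False
    # Treat each edge as undirected for this check.
    undirected = {frozenset(e) for e in edges}
    return all(frozenset((a, b)) in undirected
               for a, b in zip(path, path[1:]))


# Hypothetical edges (an assumption, not taken from the slide's figure).
edges = [(1, 2), (2, 5), (5, 3), (3, 4), (4, 1)]

print(is_cycle([1, 2, 5, 3, 4, 1], edges))  # the slide's sequence
print(is_cycle([1, 2, 5, 3], edges))        # does not return to the start
```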
Recall Hashing
• A hash function
  – Takes the target x
  – Hashes x to a bucket
• Perfect hashing is ideal:
  – O(1) lookup
  – O(1) insert
• Perfect hashing is not realistic!
Cuckoo Hashing: the idea
• Remember the cuckoo bird?
  – Shares a nest with other species…
  – …then kicks the other species out!
• Same idea with cuckoo hashing
  – When we insert x, we “kick out” whatever occupies the nest, y
  – Then y finds a new, alternate home
Why is this cool?
• Perfect hashing guarantees
  – O(1) lookup, O(1) insert
• Cuckoo hashing guarantees
  – O(1) lookup
  – O(1) insert**
• Other hashing strategies can’t guarantee this!
• Also, it’s an option for your final project
** There’s a caveat here, but we’ll see it later
Cuckoo Hashing: Two Nests
• Suppose we have TWO hash tables
  – they each have a hash function: h1 and h2
  – we prefer h1(x), but if we have to move we’ll go to h2(x)
  – if we’re in h2(x) and have to move, we’ll go back to h1(x)
• This is our collision strategy for cuckoo hashing
  – Different from linear probing/open addressing
  – Different from trees
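The two-nest idea can be sketched in a few lines of Python. Everything concrete here is an assumption for illustration: integer keys, 11 slots per table, the hash functions `x % SIZE` and `(x // SIZE) % SIZE`, and the `max_kicks` cap. If the cap is reached the sketch simply gives up; a fuller version would rebuild the tables instead.

```python
SIZE = 11  # slots per table (arbitrary choice for this sketch)


def h1(x):
    return x % SIZE


def h2(x):
    return (x // SIZE) % SIZE


def insert(tables, x, max_kicks=10):
    """Try the preferred table first; on a conflict, kick the occupant out
    and send it to its slot in the other table. Returns False on giving up."""
    hashes = (h1, h2)
    side = 0  # 0 = preferred table, 1 = alternate table
    for _ in range(max_kicks):
        pos = hashes[side](x)
        if tables[side][pos] is None:
            tables[side][pos] = x
            return True
        # Kick out the occupant: x takes the nest, the occupant must move.
        tables[side][pos], x = x, tables[side][pos]
        side = 1 - side
    return False


tables = [[None] * SIZE, [None] * SIZE]
for key in (20, 50, 53):   # 20 and 53 both prefer slot 9 of the first table
    insert(tables, key)
print(tables)
```

Here 53 takes slot 9 of the first table and 20 moves to its alternate home in the second table.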
Cuckoo Hashing: Example
• We want to insert x
• There are no conflicts anywhere
[Figure: x with its two candidate slots, h1(x) and h2(x)]
Cuckoo Hashing: Example
• Now we want to insert y
• There are no conflicts anywhere
[Figure: the tables holding x and y]
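Cuckoo Hashing: Example
• To insert z, we first have to displace the item already in its slot
• Move the displaced item to its alternate slot
[Figure: z arriving at an occupied slot (“oh no!”), with x and y already placed]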
Cuckoo Hashing: Example
• Now we insert z into the freed slot
[Figure: x, y, and z all placed (“NOW we’re fine!”)]
Cuckoo Hashing: Example
• The final table after inserting x, y, z in order
[Figure: final tables holding x, y, and z]
Why two tables?
• Two tables, one for each hash function
• Simple to visualize, simple to implement
• But, why two?
• One table works just as well!
• Just as simple to implement (all one table)
One Table Example
• Let’s insert x again, with h1(x) and h2(x)
• Again, h1(x) is preferred
[Figure: one table with slots h1(x) and h2(x) marked for x]
One Table Example
• Now insert y
• No conflicts, no problem
[Figure: x and y in the table, with slots h1(y) and h2(y) marked]
One Table Example
• Now insert z
• But, another conflict with x:
[Figure: z colliding with x (“oh no!”), with slots h1(z) and h2(z) marked]
One Table Example
• First, move x to h2(x)
[Figure: x moving to h2(x), which frees h1(z) for z]
One Table Example
• Now we move z to h1(z)
[Figure: x, y, and z in the table]
One Table Example
• Final table after inserting x, y, z in order
[Figure: final table holding x, y, and z]
Graph Representation
• How can we represent our table?
• Why not a graph?
  – Nodes are every possible table entry
  – Edges are inserted entries
• This is a directed graph
• Direction from current location TO alternate location
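As a rough illustration of this representation, the sketch below builds the edge list for a one-table layout. The table contents, the table size of 4, and the two hash functions are assumptions made up for the example; each stored item contributes one directed edge from the slot it occupies to its alternate slot.

```python
def h1(x, n):
    return x % n


def h2(x, n):
    return (x // n) % n


def cuckoo_graph(table):
    """Return directed edges (current_slot, alternate_slot, item),
    one edge per stored item. Nodes are the slot indices 0..n-1."""
    n = len(table)
    edges = []
    for slot, item in enumerate(table):
        if item is None:
            continue
        a, b = h1(item, n), h2(item, n)
        alternate = b if slot == a else a
        edges.append((slot, alternate, item))
    return edges


# Hypothetical table: 13 sits at h1(13) = 1, 6 sits at h1(6) = 2.
table = [None, 13, 6, None]
print(cuckoo_graph(table))
```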
Graph Example
• Remember our one-table example?
[Figure: the one-table example (x, y, z in a table with slots 1 to 4) and its graph on nodes 1, 2, 3, 4]
Infinite Insert
• Suppose we insert something, and we end up in an infinite loop
  – Or, “too many” displacements
  – Some pre-defined maximum based on table size
Example: Loops
• Remember our one-table example?
[Figure: the one-table example (x, y, z in a table with slots 1 to 4) and its graph on nodes 1, 2, 3, 4]
Example: Loops
• Let’s insert w: no conflicts still
[Figure: table with x, y, z, and w; graph on nodes 1, 2, 3, 4 with a new edge for w]
Example: Loops
• Now let’s insert a: displace x
[Figure: table with x, y, z; graph on nodes 1, 2, 3, 4 with edges labelled a and w]
Example: Loops
• Now x is placed, and z is displaced (put in 4)
[Figure: table with a, y, x; graph on nodes 1, 2, 3, 4 with edges labelled z and w]
Example: Loops
• Now z is placed, and w is displaced (put in 3)
[Figure: table with a, y, x; graph on nodes 1, 2, 3, 4 with edges labelled w and z]
Example: Loops
• Notice what happens to the graph
• We keep going and going and going…
[Figure: graph on nodes 1, 2, 3, 4]
Analysis: Loops
• Remember infinite loops in a new insert?
• In the graph, this is a closed loop
  – We might forever re-do the same displacements
• The probability of getting a loop increases dramatically once we’ve inserted N/2 elements
  – N is the number of buckets (size of table)
  – This is from the research on cuckoo hashing
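One way to see a closed loop is to walk the graph from a slot and watch for a repeated node. The sketch below does that on a static table; the hash functions and table contents are assumptions for illustration. A repeated slot only signals that the displacement walk can come back on itself; the insert routine in these slides does not search for loops and instead caps the number of displacements at M.

```python
N = 4  # number of slots (assumed for this sketch)


def h1(x):
    return x % N


def h2(x):
    return (x // N) % N


def has_loop(table, start):
    """Follow 'current slot -> occupant's alternate slot' edges from
    `start`. Return True if some slot is visited twice."""
    seen = set()
    slot = start
    while table[slot] is not None:
        if slot in seen:
            return True
        seen.add(slot)
        item = table[slot]
        a, b = h1(item), h2(item)
        slot = b if slot == a else a
    return False  # the walk reached an empty slot


looping = [None, 9, 6, None]    # 9: slots 1 and 2; 6: slots 2 and 1
fine = [None, 13, 6, None]      # 13: slots 1 and 3; 6: slots 2 and 1

print(has_loop(looping, 1))
print(has_loop(fine, 2))
```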
Analysis: Loops
• What can we do once we get a loop?
  – Rebuild, same size (ok solution)
  – Double table size (better solution)
• We’ll need new hash functions for both
Analysis
• Lookup has O(1) time
  – At MOST two places to look, ever
  – One location per hash function
• Insert has amortized O(1) time
  – Think of this as “in the long run”
  – In practice we see O(1) time insert
  – You’ll see amortized analysis in CPSC 320
• Remember the “grass and trees” analysis?
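To make “in the long run” concrete, here is a simplified worked bound. It is an illustration of amortization under stated assumptions, not the actual analysis of cuckoo hashing: assume each ordinary insert costs a constant c, rebuilding a table holding k items costs ck, rebuilds happen only when the table doubles (at sizes 1, 2, 4, …, n), and n is a power of two. Rebuilds triggered by loops are ignored.

\[
\underbrace{c\,n}_{\text{ordinary inserts}}
\;+\;
\underbrace{c\,(1 + 2 + 4 + \dots + n)}_{\text{rebuilds}}
\;=\; c\,n + c\,(2n - 1)
\;<\; 3\,c\,n
\]

Dividing by n gives an average cost below 3c per insert, which is O(1), even though an individual insert that triggers a rebuild costs O(n).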
Lookup: The Code
Check both candidate positions of x (either h1(x) or h2(x)); return true if x is found, otherwise return false
lookup(x)
    return T[h1(x)] = x or T[h2(x)] = x
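The lookup pseudocode translates almost directly into Python. The table size and the two hash functions below are assumptions chosen so the example is deterministic; only the two-probe structure comes from the slide.

```python
SIZE = 11  # table size (assumed)


def h1(x):
    return x % SIZE


def h2(x):
    return (x // SIZE) % SIZE


def lookup(T, x):
    """True if x sits at either of its two candidate slots."""
    return T[h1(x)] == x or T[h2(x)] == x


T = [None] * SIZE
T[h1(20)] = 20   # 20 lives at its preferred slot (index 9)
T[h2(53)] = 53   # 53 was displaced to its alternate slot (index 4)

print(lookup(T, 20), lookup(T, 53), lookup(T, 31))
```

At most two slots are examined regardless of how many items the table holds, which is where the O(1) lookup bound comes from.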
Insert: The Code
Given a table (array) T and item x to insert:
insert(x)
    if lookup(x) return;        // if it’s already here, done
    pos <- h1(x);               // store h1(x)
    for i <- 1 to M             // loop at most M times
        if T[pos] empty
            T[pos] <- x
            return;             // if T[pos] empty, done
        swap x and T[pos];      // put x in T[pos]
        if pos = h1(x)          // now we’re displacing
            pos <- h2(x)
        else
            pos <- h1(x)
    rehash();                   // if we couldn’t stop, rehash
    insert(x);                  // then insert currently displaced
end
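A runnable Python sketch of the same routine follows. Several pieces are assumptions not given in the slides: the class name `CuckooTable`, the salted use of Python’s built-in `hash` as a stand-in hash family, choosing M equal to the table size, and rehashing by doubling the table and drawing new salts. Keys must not be `None`, since `None` marks an empty slot.

```python
import random


class CuckooTable:
    def __init__(self, size=8, seed=0):
        self.rng = random.Random(seed)
        self.size = size
        self.T = [None] * size
        self._new_hashes()

    def _new_hashes(self):
        # Hypothetical hash family: salt each key with a random integer.
        self.salts = (self.rng.getrandbits(32), self.rng.getrandbits(32))

    def h(self, i, x):
        return hash((self.salts[i], x)) % self.size

    def lookup(self, x):
        return self.T[self.h(0, x)] == x or self.T[self.h(1, x)] == x

    def insert(self, x):
        if self.lookup(x):
            return                            # already here, done
        pos = self.h(0, x)
        for _ in range(self.size):            # M = table size (assumed)
            if self.T[pos] is None:
                self.T[pos] = x
                return
            x, self.T[pos] = self.T[pos], x   # x is now the displaced item
            pos = self.h(1, x) if pos == self.h(0, x) else self.h(0, x)
        self.rehash()                         # too many displacements
        self.insert(x)                        # re-insert the displaced item

    def rehash(self):
        old = [item for item in self.T if item is not None]
        self.size *= 2                        # double the table size
        self.T = [None] * self.size
        self._new_hashes()                    # new hash functions
        for item in old:
            self.insert(item)


t = CuckooTable(size=4)
for key in range(20):
    t.insert(key)
print(all(t.lookup(key) for key in range(20)), t.size)
```

Starting from four slots forces several rebuilds, so the example exercises both the displacement loop and the rehash path.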
Analysis: Load Factor
• What is load?
  – The average fill factor (% full) of the table
• What about cuckoo hash tables?
  – For two hash functions, load factor is about 50%
    • Remember loops?
  – For three hash functions, we get about 91%
    • That’s pretty great, actually!
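For reference, the load of a table is just filled slots divided by total slots. A minimal sketch, assuming a one-table layout where `None` marks an empty slot:

```python
def load_factor(table):
    """Fraction of slots that currently hold an item."""
    filled = sum(1 for slot in table if slot is not None)
    return filled / len(table)


print(load_factor(["x", None, "y", None]))
```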
More hash functions
• What would this look like?
• We would have three tables (simple case)
  – One hash function per table
• Or, we would have two alternates (one table)
More hash functions
• What would this look like?
• Each entry has TWO alternates, not one
[Figure: table entries x, y, z, each with two alternate slots]
More hash functions
• When something comes in new (insert)
  – Put it in h1(x)
• If it’s displaced, check h2(x)
  – If that’s full, go to h3(x)
• To lookup, we just look in h1(x), h2(x), or h3(x)
  – Still constant time!
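With three hash functions, lookup is the same idea with one more probe. The sketch below assumes a single table of 7 slots and three simple hash functions made up for the example.

```python
N = 7  # table size (assumed)

# Three hypothetical hash functions for integer keys.
hashes = (
    lambda x: x % N,
    lambda x: (x // N) % N,
    lambda x: (x // (N * N)) % N,
)


def lookup3(T, x):
    """True if x sits at any of its three candidate slots."""
    return any(T[h(x)] == x for h in hashes)


T = [None] * N
T[hashes[2](100)] = 100   # 100 ended up at its third choice (slot 2)

print(lookup3(T, 100), lookup3(T, 3))
```

The number of probes is fixed at three, so lookup stays constant time.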
Even better load?
• Currently we’ve only put one item per bucket
• What if we had two cells per bucket?
[Figure: buckets holding up to two items each: (x, w), (y, a), (z)]
• What about collision strategies?
  – Round-robin (cells take turns swapping out)
  – FIFO (oldest resident gets kicked out)
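A small sketch of two cells per bucket with FIFO eviction is shown below. The bucket count, the hash functions, and the `max_kicks` cap are assumptions for illustration; each bucket is a `collections.deque`, so the oldest resident is always at the left end.

```python
from collections import deque

SIZE = 4    # number of buckets (assumed)
CELLS = 2   # cells per bucket


def h1(x):
    return x % SIZE


def h2(x):
    return (x // SIZE) % SIZE


def insert(buckets, x, max_kicks=8):
    """Insert x; when a bucket is full, evict its oldest resident (FIFO)
    and move that item to its alternate bucket."""
    pos = h1(x)
    for _ in range(max_kicks):
        bucket = buckets[pos]
        if len(bucket) < CELLS:
            bucket.append(x)
            return True
        bucket.append(x)          # newcomer moves in
        x = bucket.popleft()      # oldest resident gets kicked out
        pos = h2(x) if pos == h1(x) else h1(x)
    return False                  # gave up; a fuller version would rehash


buckets = [deque() for _ in range(SIZE)]
for key in (1, 5, 9):             # all three prefer bucket 1
    insert(buckets, key)
print([list(b) for b in buckets])
```

The first two keys share bucket 1; the third arrival evicts the oldest resident, which moves to its alternate bucket.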
Links & Resources
• http://en.wikipedia.org/wiki/Cuckoo_hashing
• http://www.ru.is/faculty/ulfar/CuckooHash.pdf
• http://www.it-c.dk/people/pagh/papers/cuckoo-undergrad.pdf
• No neat animations on the internet…yet!
  – Possible personal project?
  – Brownie points?
  – Pre-coop project?