TRANSCRIPT
7/27/2019 Reconfigurable Accelerator for The
Reconfigurable Accelerator for the
Word-Matching Stage of BLASTN
Abstract
BLAST is one of the most popular sequence analysis tools used by molecular
biologists. It is designed to efficiently find similar regions between two sequences that
have biological significance. However, because the size of genomic databases is growing
rapidly, the computation time of BLAST, when performing a complete genomic database
search, is continuously increasing. Thus, there is a clear need to accelerate this process.
In this paper, we present a new approach for genomic sequence database scanning
utilizing reconfigurable field programmable gate array (FPGA)-based hardware. In order
to derive an efficient structure for BLASTN, we propose a reconfigurable architecture to
accelerate the computation of the word-matching stage. The experimental results show
that the FPGA implementation achieves a speedup of around one order of magnitude
compared to the NCBI BLASTN software running on a general-purpose computer.
INTRODUCTION
Scanning genomic sequence databases is a common and often repeated task in molecular
biology. The need for speeding up these searches comes from the rapid growth of these
gene banks: every year their size is scaled by a factor of 1.5 to 2. The aim of a scan
operation is to find similarities between the query sequence and a particular genome
sequence, which might indicate similar functionality from a biological point of view.
Dynamic programming-based alignment algorithms can guarantee to find all important
similarities. However, as the search space is the product of the two sequences, which
could be several billion bases in size, it is generally not feasible to use a direct
implementation. One frequently used approach to speed up this time-consuming
operation is to use heuristics in the search algorithm. One of the most widely used
sequence analysis tools to use heuristics is the basic local alignment search tool (BLAST)
[2]. Although BLAST's algorithms are highly optimized for similarity search, the ever-
growing databases outpace the speed improvements that BLAST can provide on a general
purpose PC. BLASTN, a version of BLAST specifically designed for DNA sequence
searches, consists of a three-stage pipeline.
Stage 1: Word-Matching detects seeds (short exact matches of a certain length between
the query sequence and the subject sequence). The inputs to this stage are strings of DNA
bases, which typically use the alphabet {A, C, G, T}.
Stage 2: Ungapped Extension extends each seed in both directions allowing substitutions
only and outputs the resulting high-scoring segment pairs (HSPs). An HSP [3] indicates
two sequence fragments of equal length whose alignment score meets or exceeds an
empirically set threshold (or cutoff score).
Stage 3: Gapped Extension uses the Smith-Waterman dynamic programming algorithm
to extend the HSPs allowing insertions and deletions.
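As a rough software illustration of Stage 1 (not the FPGA design proposed in this paper), the word-matching step can be sketched in Python. The sequences and the function name are illustrative; w = 11 is BLASTN's customary default word length.

```python
# Illustrative software sketch of BLASTN stage 1 (word matching), not the
# paper's hardware design. Finds all length-w exact matches between query
# and subject.

def find_seeds(query, subject, w=11):
    """Return (query_pos, subject_pos) pairs where a length-w word matches."""
    # Index every w-mer of the query in a hash table.
    words = {}
    for i in range(len(query) - w + 1):
        words.setdefault(query[i:i + w], []).append(i)
    # Slide a w-wide window over the subject and look each word up.
    seeds = []
    for j in range(len(subject) - w + 1):
        for i in words.get(subject[j:j + w], ()):
            seeds.append((i, j))
    return seeds

print(find_seeds("ACGTACGTACGT", "TTACGTACGTAA", w=8))
```

Each seed found here would be handed to Stage 2 for ungapped extension.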
The basic idea underlying a BLASTN search is filtration. Although each stage in
the BLASTN pipeline is becoming more sophisticated, the exponential increase in the
volume of data makes it important that measures are taken to reduce the amount of data
that needs to be processed. Filtration discards irrelevant fractions as early as possible,
thus reducing the overall computation time. Analysis of the various stages of the
BLASTN pipeline (see Table I) reveals that the word-matching stage is the most time-
consuming part. Therefore, accelerating the computation of this stage will have the
greatest effect on the overall performance.
EXISTING SYSTEM
BASIC LOCAL ALIGNMENT SEARCH TOOL
A new approach to rapid sequence comparison, basic local alignment search tool
(BLAST), directly approximates alignments that optimize a measure of local similarity,
the maximal segment pair (MSP) score. Recent mathematical results on the stochastic
properties of MSP scores allow an analysis of the performance of this method as well as
the statistical significance of alignments it generates. The basic algorithm is simple and
robust; it can be implemented in a number of ways and applied in a variety of contexts
including straightforward DNA and protein sequence database searches, motif searches,
gene identification searches, and in the analysis of multiple regions of similarity in long
DNA sequences. In addition to its flexibility and tractability to mathematical analysis,
BLAST is an order of magnitude faster than existing sequence comparison tools of
comparable sensitivity.
A RECONFIGURABLE BLOOM FILTER ARCHITECTURE FOR BLASTN
Efficient seed-based filtration methods exist for scanning genomic sequence
databases. However, current solutions require a significant scan time on traditional
computer architectures. These scan time requirements are likely to become even more
severe due to the rapid growth in the size of databases. In this paper, we present a new
approach to genomic sequence database scanning using reconfigurable field-
programmable gate array (FPGA)-based hardware. To derive an efficient mapping onto
this type of architecture, we propose a reconfigurable Bloom filter architecture. Our
experimental results show that the FPGA implementation achieves an order of magnitude
speedup compared to the NCBI BLASTN software running on a general purpose
computer.
EFFICIENT HARDWARE HASHING FUNCTIONS FOR HIGH
PERFORMANCE COMPUTERS
Hashing is critical for high performance computer architecture. Hashing is used
extensively in hardware applications, such as page tables, for address translation. Bit
extraction and exclusive ORing hashing methods are two commonly used hashing
functions for hardware applications. There has been no study of the performance of these
functions, and no comparison of their practical performance with the theoretical
performance predictions of hashing schemes. In this
paper, we show that, by choosing hashing functions at random from a particular class,
called H3, of hashing functions, the analytical performance of hashing can be achieved in
practice on real-life data. Our results about the expected worst case performance of
hashing are of special significance, as they provide evidence for earlier theoretical
predictions.
AN APPROACH FOR MINIMAL PERFECT HASH FUNCTIONS FOR VERY LARGE DATABASES
We propose a novel external memory based algorithm for constructing minimal
perfect hash functions h for huge sets of keys. For a set of n keys, our algorithm outputs h
in time O(n). The algorithm needs a small vector of one byte entries in main memory to
construct h. The evaluation of h(x) requires three memory accesses for any key x. The
description of h takes a constant number of up to 9 bits per key, which is asymptotically
optimal and close to the theoretical lower bound of around 2 bits per key. In our experiments,
we used a collection of 1 billion URLs collected from the web, each URL 64 characters
long on average. For this collection, our algorithm (i) finds a minimal perfect hash
function in approximately 3 hours using a commodity PC, (ii) needs just 5.45 megabytes
of internal memory to generate h and (iii) takes 8.1 bits per key for the description of h.
MERCURY BLAST DICTIONARIES: ANALYSIS AND PERFORMANCE
MEASUREMENT
This report describes a hashing scheme for a dictionary of short bit strings. The
scheme, which we call near-perfect hashing, was designed as part of the construction of
Mercury BLAST, an FPGA-based accelerator for the BLAST family of biosequence
comparison algorithms.
Near-perfect hashing is a heuristic variant of the well-known displacement
hashing approach to building perfect hash functions. It uses a family of hash functions
composed from linear transformations on bit vectors and lookups in small precomputed
tables, both of which are especially appropriate for implementation in hardware logic. We
show empirically that for inputs derived from genomic DNA sequences, our scheme
obtains a good tradeoff between the size of the hash table and the time required to compute
it from a set of input strings, while generating few or no collisions between keys in the
table.
One of the building blocks of our scheme is the H_3 family of hash functions,
which are linear transformations on bit vectors. We show that the uniformity of hashing
performed with randomly chosen linear transformations depends critically on their rank,
and that randomly chosen transformations have a high probability of having the
maximum possible uniformity. A simple test is sufficient to ensure that a randomly
chosen H_3 hash function will not cause an unexpectedly large number of collisions.
Moreover, if two such functions are chosen independently at random, the second function
is unlikely to hash together two keys that were hashed together by the first.
Hashing schemes based on H_3 hash functions therefore tend to distribute their
inputs more uniformly than would be expected under a simple uniform hashing model,
and schemes using pairs of these functions are more uniform than would be assumed for
a pair of independent hash functions.
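The H_3 construction described above, a random linear transformation on bit vectors over GF(2), can be sketched in software. The parameter names (key_bits, index_bits) and the example sizes below are ours, not the report's.

```python
import random

# Sketch of an H_3 hash function: a random linear transformation over GF(2).
# Each key bit selects a random row; the hash is the XOR of the selected rows.

def make_h3(key_bits, index_bits, seed=42):
    rng = random.Random(seed)
    rows = [rng.getrandbits(index_bits) for _ in range(key_bits)]
    def h(key):
        out = 0
        for i in range(key_bits):
            if (key >> i) & 1:
                out ^= rows[i]
        return out
    return h

# e.g., an 11-base DNA word encoded in 2 bits per base = 22 key bits
h = make_h3(key_bits=22, index_bits=12)
print(h(0b1010110), h(0b1010111))
```

Because the transformation is linear, h(a XOR b) = h(a) XOR h(b), which is what makes the rank and uniformity analysis mentioned above tractable.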
PROPOSED SYSTEM
In this paper, we propose a computationally efficient architecture to accelerate the
data processing of the word-matching stage based on field programmable gate arrays
(FPGAs). FPGAs are suitable candidate platforms for high-performance computation due
to their fine-grained parallelism and pipelining capabilities.
BLOOM FILTERS
Introduction
Bloom filters [2] are compact data structures for probabilistic representation of a set in
order to support membership queries (i.e., queries that ask: Is element X in set Y?). This
compact representation is the payoff for allowing a small rate of false positives in
membership queries; that is, queries might incorrectly recognize an element as a member
of the set.
We succinctly present Bloom filter use to date in the next section. In Section 3 we
describe Bloom filters in detail, and in Section 4 we give a hopefully precise picture of
space/computing time/error rate tradeoffs.
Usage
Since their introduction in [2], Bloom filters have seen various uses:
Web cache sharing ([3]). Collaborating Web caches use Bloom filters (dubbed cache
summaries) as compact representations for the local set of cached files. Each cache
periodically broadcasts its summary to all other members of the distributed cache.
Using all summaries received, a cache node has a (partially outdated, partially wrong)
global image about the set of files stored in the aggregated cache. The Squid Web
Proxy Cache [1] uses Cache Digests based on a similar idea.
Query filtering and routing ([4, 6, 7]). The Secure wide-area Discovery Service [6], a
subsystem of the Ninja project [5], organizes service providers in a hierarchy. Bloom
filters are used as summaries for the set of services offered by a node. Summaries are
sent upwards in the hierarchy and aggregated. A query is a description for a specific
service, also represented as a Bloom filter. Thus, when a member node of the hierarchy
generates/receives a query, it has enough information at hand to decide where to forward
the query: downward, to one of its descendants (if a solution to the query is present in the
filter for the corresponding node), or upward, toward its parent (otherwise).
The OceanStore [7] replica location service uses a two-tiered approach: it first initiates an
inexpensive, probabilistic search (based on Bloom filters, similar to Ninja) to try to find
a replica. If this fails, the search falls back on an (expensive) deterministic algorithm (based
on Plaxton's replica location algorithm). Alas, their description of the probabilistic search
algorithm is laconic. (An unpublished text [11] from members of the same group gives
some more details, but this approach does not seem to work well when resources are dynamic.)
Compact representation of a differential file ([9]). A differential file contains a
batch of database records to be updated. For performance reasons the database is
updated only periodically (e.g., at midnight) or when the differential file grows above a
certain threshold. However, in order to preserve integrity, each reference/query to the
database has to access the differential file to see if a particular record is scheduled to be
updated. To speed up this process, with little memory and computational overhead, the
differential file is represented as a Bloom filter.
Free text searching ([10]). Basically, the set of words that appear in a text is
succinctly represented using a Bloom filter.
Constructing Bloom Filters
Consider a set A = {a1, a2, ..., an} of n elements. Bloom filters describe membership
information of A using a bit vector V of length m. For this, k hash functions h1, h2, ..., hk,
with hi: X -> {1..m}, are used as described below.
The following procedure builds an m-bit Bloom filter, corresponding to a set A and
using the k hash functions h1, h2, ..., hk:
Procedure BloomFilter(set A, hash_functions, integer m)
returns filter
filter = allocate m bits initialized to 0
foreach ai in A:
    foreach hash function hj:
        filter[hj(ai)] = 1
    end foreach
end foreach
return filter
Therefore, if ai is a member of a set A, in the resulting Bloom filter V all bits
corresponding to the hashed values of ai are set to 1. Testing for membership of an
element elm is equivalent to testing that all corresponding bits of V are set:
Procedure MembershipTest (elm, filter, hash_functions)
returns yes/no
foreach hash function hj:
    if filter[hj(elm)] != 1 return No
end foreach
return Yes
Nice features: filters can be built incrementally: as new elements are added to a set, the
corresponding positions are computed through the hash functions and bits are set in the
filter. Moreover, the filter expressing the union of two sets is simply computed as the
bit-wise OR applied over the two corresponding Bloom filters.
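A minimal software sketch of the two procedures above might look as follows. Deriving the k indices from a single MD5 digest via double hashing is our simplification for the sketch, not part of the original description, and one byte per bit is used for clarity rather than compactness.

```python
import hashlib

# Minimal Bloom filter following the BloomFilter/MembershipTest procedures.
# The MD5 double-hashing trick for the k indices is our choice for the sketch.

class BloomFilter:
    def __init__(self, m, k):
        self.m, self.k = m, k
        self.bits = bytearray(m)          # one byte per bit, for clarity

    def _indices(self, elm):
        d = hashlib.md5(elm.encode()).digest()
        h1 = int.from_bytes(d[:8], "big")
        h2 = int.from_bytes(d[8:], "big")
        return [(h1 + j * h2) % self.m for j in range(self.k)]

    def add(self, elm):                   # filters can be built incrementally
        for i in self._indices(elm):
            self.bits[i] = 1

    def __contains__(self, elm):          # MembershipTest: no false negatives
        return all(self.bits[i] for i in self._indices(elm))

    def union(self, other):               # union of two sets = bit-wise OR
        out = BloomFilter(self.m, self.k)
        out.bits = bytearray(a | b for a, b in zip(self.bits, other.bits))
        return out

f = BloomFilter(m=1024, k=4)
for word in ("ACGT", "TTGA"):
    f.add(word)
print("ACGT" in f, "GGGG" in f)
```

Note that "GGGG" may still test positive with small probability; only negative answers are definitive.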
Bloom Filters: the Math (this follows the description in [3])
One prominent feature of Bloom filters is that there is a clear tradeoff between the size of
the filter and the rate of false positives. Observe that after inserting n keys into a filter of
size m using k hash functions, the probability that a particular bit is still 0 is:

p0 = (1 - 1/m)^(kn) ≈ e^(-kn/m). (1)
(Note that we assume perfect hash functions that spread the elements of A evenly
throughout the space {1..m}. In practice, good results have been achieved using MD5
and other hash functions [10].)
Hence, the probability of a false positive (the probability that all k bits have been
previously set) is:

perr = (1 - p0)^k = (1 - (1 - 1/m)^(kn))^k ≈ (1 - e^(-kn/m))^k. (2)
In (2), perr is minimized for k = (m/n) ln 2 hash functions. In practice, however, only a small
number of hash functions are used. The reason is that the computational overhead of
each additional hash function is constant, while the incremental benefit of adding a new
hash function decreases after a certain threshold (see Figure 1).
Figure 1: False positive rate as a function
of the number of hash functions used. The
size of the Bloom filter is 32 bits per entry
(m/n=32). In this case using 22 hash
functions minimizes the false positive rate.
Note however that adding a hash function
does not significantly decrease the error
rate when more than 10 hashes are already
used.
Figure 2: Size of Bloom filter (bits/entry)
as a function of the error rate desired.
Different lines represent different numbers
of hash keys used. Note that, for the error
rates considered, using 32 keys does not
bring significant benefits over using only 8
keys.
Equation (2) is the base formula for engineering Bloom filters. It allows, for example, computing minimal
memory requirements (filter size) and number of hash functions given the maximum acceptable
false positive rate and the number of elements in the set (as we detail in Figure 2).
m/n = -k / ln(1 - e^(ln(perr)/k)) (bits per entry) (3)
To summarize: Bloom filters are compact data structures for probabilistic representation of a set
in order to support membership queries. The main design tradeoffs are the number of hash
functions used (driving the computational overhead), the size of the filter and the error (collision)
rate. Formula (2) is the main formula to tune parameters according to application requirements.
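As a quick illustration of tuning parameters with formulas (2) and (3), the following sketch computes a filter size and hash count from n and a target error rate. The function name and the rounding choices are ours.

```python
import math

# Sizing a Bloom filter from the formulas above: with an optimal k = (m/n) ln 2,
# the required space is m/n = -ln(perr) / (ln 2)^2 bits per entry.

def bloom_parameters(n, p_err):
    bits_per_entry = -math.log(p_err) / (math.log(2) ** 2)
    m = math.ceil(n * bits_per_entry)          # total filter size in bits
    k = max(1, round((m / n) * math.log(2)))   # number of hash functions
    return m, k

m, k = bloom_parameters(n=1_000_000, p_err=0.01)
print(m, k)   # about 9.6 bits per entry, k = 7
```

For one million elements and a 1% false positive rate this yields roughly 9.6 megabits and 7 hash functions, matching the rule of thumb of fewer than 10 bits per element quoted later in this report.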
Compressed Bloom filters
Some applications that use Bloom filters need to communicate these filters across the network.
In this case, besides the three performance metrics we have seen so far: (1) the computational
overhead to lookup a value (related to the number of hash functions used), (2) the size of the
filter in memory, and (3) the error rate, a fourth metric can be used: the size of the filter
transmitted across the network. M. Mitzenmacher shows in [8] that compressing Bloom filters
might lead to significant bandwidth savings at the cost of higher memory requirements (larger
uncompressed filters) and some additional computation time to compress the filter that is sent
across the network. We do not detail here all theoretical and practical issues analyzed in [8].
A Bloom filter, conceived by Burton Howard Bloom in 1970, is a space-
efficient probabilistic data structure that is used to test whether an element is a member of
a set. False positive matches are possible, but false negatives are not; i.e., a query returns either
"possibly in set (may be wrong)" or "definitely not in set". Elements can be added to the set, but not
removed (though this can be addressed with a "counting" filter). The more elements that are
added to the set, the larger the probability of false positives.
Bloom proposed the technique for applications where the amount of source data would
require an impracticably large hash area in memory if "conventional" error-free hashing
techniques were applied. He gave the example of a hyphenation algorithm for a dictionary of
500,000 words, of which 90% could be hyphenated by following simple rules but all the
remaining 50,000 words required expensive disk access to retrieve their specific patterns. With
unlimited core memory, an error-free hash could be used to eliminate all the unnecessary disk
access. But if core memory was insufficient, a smaller hash area could be used to eliminate most
of the unnecessary access. For example, a hash area only 15% of the error-free size would still
eliminate 85% of the disk accesses (Bloom (1970)).
More generally, fewer than 10 bits per element are required for a 1% false positive probability,
independent of the size or number of elements in the set (Bonomi et al. (2006)).
Algorithm description
An example of a Bloom filter, representing the set {x,y,z}. The colored arrows show the
positions in the bit array that each set element is mapped to. The element w is not in the set {x, y,
z}, because it hashes to one bit-array position containing 0. For this figure, m=18 and k=3.
An empty Bloom filter is a bit array of m bits, all set to 0. There must also be k different hash
functions defined, each of which maps or hashes some set element to one of the m array positions
with a uniform random distribution.
To add an element, feed it to each of the k hash functions to get k array positions. Set the bits at
all these positions to 1.
To query for an element (test whether it is in the set), feed it to each of the k hash functions to
get k array positions. If any of the bits at these positions are 0, the element is definitely not in the
set; if it were, then all the bits would have been set to 1 when it was inserted. If all are 1, then
either the element is in the set, or the bits have by chance been set to 1 during the insertion of
other elements, resulting in a false positive. In a simple Bloom filter, there is no way to
distinguish between the two cases, but more advanced techniques can address this problem.
The requirement of designing k different independent hash functions can be prohibitive for
large k. For a good hash function with a wide output, there should be little if any correlation
between different bit-fields of such a hash, so this type of hash can be used to generate multiple
"different" hash functions by slicing its output into multiple bit fields. Alternatively, one can
pass k different initial values (such as 0, 1, ..., k - 1) to a hash function that takes an initial value,
or add (or append) these values to the key. For larger m and/or k, independence among the hash
functions can be relaxed with negligible increase in false positive rate (Dillinger & Manolios
(2004a), Kirsch & Mitzenmacher (2006)). Specifically, Dillinger & Manolios (2004b) show the
effectiveness of deriving the k indices using enhanced double hashing or triple hashing, variants
of double hashing that are effectively simple random number generators seeded with the two or
three hash values.
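The double-hashing idea above, deriving all k indices from just two base hashes via g_j(x) = h1(x) + j*h2(x) mod m, can be sketched as follows. The use of SHA-256 and the odd-stride tweak are our illustrative choices.

```python
import hashlib

# Sketch of deriving k filter indices from two base hashes (double hashing),
# instead of k independent hash functions.

def k_indices(key, k, m):
    d = hashlib.sha256(key.encode()).digest()
    h1 = int.from_bytes(d[:16], "big")
    h2 = int.from_bytes(d[16:], "big") | 1   # keep the stride odd (our tweak)
    return [(h1 + j * h2) % m for j in range(k)]

print(k_indices("ACGT", k=4, m=1024))
```

The indices form an arithmetic progression modulo m, which is cheap to compute in either software or hardware.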
Removing an element from this simple Bloom filter is impossible because false negatives are not
permitted. An element maps to k bits, and although setting any one of those k bits to zero suffices
to remove the element, it also results in removing any other elements that happen to map onto
that bit. Since there is no way of determining whether any other elements have been added that
affect the bits for an element to be removed, clearing any of the bits would introduce the
possibility of false negatives.
One-time removal of an element from a Bloom filter can be simulated by having a second Bloom
filter that contains items that have been removed. However, false positives in the second filter
become false negatives in the composite filter, which may be undesirable. In this approach re-
adding a previously removed item is not possible, as one would have to remove it from the
"removed" filter.
It is often the case that all the keys are available but are expensive to enumerate (for example,
requiring many disk reads). When the false positive rate gets too high, the filter can be
regenerated; this should be a relatively rare event.
Space and time advantages
A Bloom filter used to speed up answers in a key-value storage system. Values are stored on a disk
which has slow access times. Bloom filter decisions are much faster. However, some unnecessary
disk accesses are made when the filter reports a positive (in order to weed out the false
positives). Overall answer speed is better with the Bloom filter than without the Bloom filter.
Use of a Bloom filter for this purpose, however, does increase memory usage.
While risking false positives, Bloom filters have a strong space advantage over other data
structures for representing sets, such as self-balancing binary search trees, tries, hash tables, or
simple arrays or linked lists of the entries. Most of these require storing at least the data items
themselves, which can require anywhere from a small number of bits, for small integers, to an
arbitrary number of bits, such as for strings (tries are an exception, since they can share storage
between elements with equal prefixes). Linked structures incur an additional linear space
overhead for pointers. A Bloom filter with 1% error and an optimal value of k, in contrast,
requires only about 9.6 bits per element, regardless of the size of the elements. This advantage
comes partly from its compactness, inherited from arrays, and partly from its probabilistic nature.
The 1% false-positive rate can be reduced by a factor of ten by adding only about 4.8 bits per
element.
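Both figures follow from the bits-per-element formula m/n = -ln(p) / (ln 2)^2 derived later in this section; a quick check:

```python
import math

# Verify the quoted space figures for an optimally configured Bloom filter.

def bits_per_element(p):
    # m/n with the optimal number of hash functions k = (m/n) ln 2
    return -math.log(p) / math.log(2) ** 2

print(round(bits_per_element(0.01), 1))                            # 9.6
print(round(bits_per_element(0.001) - bits_per_element(0.01), 1))  # 4.8
```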
However, if the number of potential values is small and many of them can be in the set, the
Bloom filter is easily surpassed by the deterministic bit array, which requires only one bit for
each potential element. Note also that hash tables gain a space and time advantage if they begin
ignoring collisions and store only whether each bucket contains an entry; in this case, they have
effectively become Bloom filters with k = 1 [1].
Bloom filters also have the unusual property that the time needed either to add items or to check
whether an item is in the set is a fixed constant, O(k), completely independent of the number of
items already in the set. No other constant-space set data structure has this property, but the
average access time of sparse hash tables can make them faster in practice than some Bloom
filters. In a hardware implementation, however, the Bloom filter shines because its k lookups are
independent and can be parallelized.
To understand its space efficiency, it is instructive to compare the general Bloom filter with its
special case when k = 1. If k = 1, then in order to keep the false positive rate sufficiently low, a
small fraction of bits should be set, which means the array must be very large and contain long
runs of zeros. The information content of the array relative to its size is low. The generalized
Bloom filter (k greater than 1) allows many more bits to be set while still maintaining a low false
positive rate; if the parameters (k and m) are chosen well, about half of the bits will be set, and
these will be apparently random, minimizing redundancy and maximizing information content.
Probability of false positives
The false positive probability p as a function of the number of elements n in the filter and the
filter size m. An optimal number of hash functions k = (m/n) ln 2 has been assumed.
Assume that a hash function selects each array position with equal probability. If m is the
number of bits in the array, and k is the number of hash functions, then the probability that a
certain bit is not set to 1 by a certain hash function during the insertion of an element is then

1 - 1/m.

The probability that it is not set to 1 by any of the hash functions is

(1 - 1/m)^k.

If we have inserted n elements, the probability that a certain bit is still 0 is

(1 - 1/m)^(kn);

the probability that it is 1 is therefore

1 - (1 - 1/m)^(kn).

Now test membership of an element that is not in the set. Each of the k array positions computed
by the hash functions is 1 with a probability as above. The probability of all of them being 1,
which would cause the algorithm to erroneously claim that the element is in the set, is often
given as

(1 - (1 - 1/m)^(kn))^k ≈ (1 - e^(-kn/m))^k.

This is not strictly correct, as it assumes independence for the probabilities of each bit being set.
However, assuming it is a close approximation, we have that the probability of false positives
decreases as m (the number of bits in the array) increases, and increases as n (the number of
inserted elements) increases. For a given m and n, the value of k (the number of hash functions)
that minimizes the probability is

k = (m/n) ln 2,
which gives the false positive probability

p = 2^(-k) ≈ 0.6185^(m/n).
The required number of bits m, given n (the number of inserted elements) and a desired false
positive probability p (and assuming the optimal value of k is used) can be computed by
substituting the optimal value of k in the probability expression above:

p = (1 - e^(-((m/n) ln 2)(n/m)))^((m/n) ln 2),

which can be simplified to:

ln p = -(m/n) (ln 2)^2.

This results in:

m = -n ln p / (ln 2)^2.

This means that for a given false positive probability p, the length of a Bloom filter m is
proportionate to the number of elements being filtered n [2].
While the above formula is asymptotic (i.e. applicable as m, n → ∞), the agreement with finite
values of m, n is also quite good; the false positive probability for a finite Bloom filter with
m bits, n elements, and k hash functions is at most

(1 - e^(-k(n + 0.5)/(m - 1)))^k.

So we can use the asymptotic formula if we pay a penalty for at most half an extra element and at
most one fewer bit.[3]
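These closed-form expressions are easy to evaluate numerically. The sketch below (function names are our own, not from any library) computes the optimal k, the required m, and the resulting false positive rate:

```python
import math

def optimal_k(m, n):
    # k that minimizes the false positive probability: k = (m/n) ln 2
    return (m / n) * math.log(2)

def required_bits(n, p):
    # m = -n ln p / (ln 2)^2, assuming the optimal k is used
    return -n * math.log(p) / (math.log(2) ** 2)

def false_positive_rate(m, n, k):
    # p ~ (1 - e^(-kn/m))^k
    return (1.0 - math.exp(-k * n / m)) ** k

m = required_bits(1000, 0.01)        # about 9585 bits for 1000 elements at p = 1%
k = round(optimal_k(m, 1000))        # about 7 hash functions
p = false_positive_rate(m, 1000, k)  # back close to the requested 1%
```

For example, holding a 1% false positive rate, the filter grows linearly with n, exactly as the proportionality result above states.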
Approximating the number of items in a Bloom filter
Swamidass & Baldi (2007) showed that the number of items in a Bloom filter can be
approximated with the following formula:

n* = -(N/k) ln(1 - X/N),
where n* is an estimate of the number of items in the filter, N is the length of the filter, k is the
number of hash functions per item, and X is the number of bits set to one.
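As a sketch (the function name is ours), the estimate is a one-liner:

```python
import math

def estimate_count(N, k, X):
    """Swamidass & Baldi (2007): n* = -(N/k) * ln(1 - X/N)."""
    return -(N / k) * math.log(1.0 - X / N)

# e.g. a filter of 1000 bits with 4 hash functions and 300 bits set
# suggests roughly 89 inserted items
n_star = estimate_count(1000, 4, 300)
```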
The union and intersection of sets
Bloom filters are a way of compactly representing a set of items. It is common to try to
compute the size of the intersection or union of two sets, and Bloom filters can be used to
approximate both. Swamidass & Baldi (2007) showed that for two Bloom filters of length N, the
counts of their underlying sets can be estimated as

n(A*) = -(N/k) ln(1 - |A*|/N)  and  n(B*) = -(N/k) ln(1 - |B*|/N),

where |A*| and |B*| denote the numbers of bits set to one in each filter. The size of their union
can be estimated as

n(A* ∪ B*) = -(N/k) ln(1 - |A* ∪ B*|/N),

where |A* ∪ B*| is the number of bits set to one in either of the two Bloom filters. The
intersection can then be estimated as

n(A* ∩ B*) = n(A*) + n(B*) - n(A* ∪ B*),

using the three formulas together.
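A sketch of the three estimates together, representing each filter as a Python int used as a bit array (all function names are ours):

```python
import math

def estimate_count(N, k, X):
    # items suggested by X set bits in a filter of length N with k hash functions
    return -(N / k) * math.log(1.0 - X / N)

def estimate_union(N, k, filter_a, filter_b):
    # the bits set in the OR of the two filters estimate |A union B|
    return estimate_count(N, k, bin(filter_a | filter_b).count("1"))

def estimate_intersection(N, k, filter_a, filter_b):
    # inclusion-exclusion over the three count estimates
    n_a = estimate_count(N, k, bin(filter_a).count("1"))
    n_b = estimate_count(N, k, bin(filter_b).count("1"))
    return n_a + n_b - estimate_union(N, k, filter_a, filter_b)
```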
Interesting properties
Unlike a standard hash table, a Bloom filter of a fixed size can represent a set with an arbitrarily
large number of elements; adding an element never fails due to the data structure "filling up."
However, the false positive rate increases steadily as elements are added, until all bits in the filter
are set to 1, at which point all queries yield a positive result.
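A minimal sketch makes this behavior concrete (the class is ours; deriving the k hash functions from salted SHA-256 is an arbitrary choice):

```python
import hashlib

class BloomFilter:
    def __init__(self, m, k):
        self.m, self.k = m, k
        self.bits = bytearray((m + 7) // 8)   # m-bit array, initially all 0

    def _positions(self, item):
        # derive k bucket indices by salting a SHA-256 hash with 0..k-1
        for salt in range(self.k):
            digest = hashlib.sha256(f"{salt}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item):
        # never fails, no matter how many items have been added
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item):
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(item))

bf = BloomFilter(1024, 4)
for word in ("alpha", "beta", "gamma"):
    bf.add(word)
assert all(w in bf for w in ("alpha", "beta", "gamma"))   # no false negatives

saturated = BloomFilter(8, 2)
saturated.bits = bytearray([0xFF])   # every bit set to 1
assert "anything" in saturated       # all queries now answer "present"
```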
Union and intersection of Bloom filters with the same size and set of hash functions can be
implemented with bitwise OR and AND operations, respectively. The union operation on Bloom
filters is lossless in the sense that the resulting Bloom filter is the same as the Bloom filter
created from scratch using the union of the two sets. The intersection operation satisfies a weaker
property: the false positive probability in the resulting Bloom filter is at most the false positive
probability in one of the constituent Bloom filters, but may be larger than the false positive
probability in the Bloom filter created from scratch using the intersection of the two sets. There
are also more accurate estimates of intersection and union that are not biased in this way.
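These two properties can be checked directly with toy filters of the same size and hash functions (a sketch; the names and the SHA-256-based hashing are ours):

```python
import hashlib

M, K = 512, 3   # filter size and number of hash functions

def positions(item):
    for salt in range(K):
        digest = hashlib.sha256(f"{salt}:{item}".encode()).digest()
        yield int.from_bytes(digest[:8], "big") % M

def make_filter(items):
    bits = 0   # a Python int serves as the bit array
    for item in items:
        for p in positions(item):
            bits |= 1 << p
    return bits

a = make_filter(["x", "y"])
b = make_filter(["y", "z"])

# union is lossless: OR equals the filter built from the union of the sets
assert a | b == make_filter(["x", "y", "z"])

# intersection via AND only over-approximates: it contains every bit of the
# true intersection filter, but may contain extra bits
true_inter = make_filter(["y"])
assert (a & b) & true_inter == true_inter
```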
Some kinds of superimposed code can be seen as a Bloom filter implemented with
physical edge-notched cards.
Examples
Google BigTable and Apache Cassandra use Bloom filters to reduce disk lookups for
non-existent rows or columns. Avoiding costly disk lookups considerably increases the
performance of a database query operation.[4]
The Google Chrome web browser uses a Bloom filter to identify malicious URLs. Any URL is
first checked against a local Bloom filter, and only upon a hit is a full check of the URL
performed.[5]
The Squid Web Proxy Cache uses Bloom filters for cache digests.[6]
Bitcoin uses Bloom filters to verify payments without running a full network node.[7][8]
The Venti archival storage system uses Bloom filters to detect previously stored data.[9]
The SPIN model checker uses Bloom filters to track the reachable state space for large
verification problems.[10]
The Cascading analytics framework uses Bloom filters to speed up asymmetric joins, where one
of the joined data sets is significantly larger than the other (often called Bloom join[11] in the
database literature).[12]
Alternatives
Classic Bloom filters use 1.44 log2(1/ε) bits of space per inserted key, where ε is the false
positive rate of the Bloom filter. However, the space that is strictly necessary for any data
structure playing the same role as a Bloom filter is only log2(1/ε) per key (Pagh, Pagh & Rao
2005). Hence Bloom filters use 44% more space than a hypothetical equivalent optimal data
structure. The number of hash functions used to achieve a given false positive rate ε is
proportional to log2(1/ε), which is not optimal, as it has been proved that an optimal data structure
would need only a constant number of hash functions independent of the false positive rate.
Stern & Dill (1996) describe a probabilistic structure based on hash tables, hash compaction,
which Dillinger & Manolios (2004b) identify as significantly more accurate than a Bloom filter
when each is configured optimally. Dillinger and Manolios, however, point out that the
reasonable accuracy of any given Bloom filter over a wide range of numbers of additions makes
it attractive for probabilistic enumeration of state spaces of unknown size. Hash compaction is,
therefore, attractive when the number of additions can be predicted accurately; however, despite
being very fast in software, hash compaction is poorly suited for hardware because of worst-case
linear access time.
Putze, Sanders & Singler (2007) have studied some variants of Bloom filters that are either faster
or use less space than classic Bloom filters. The basic idea of the fast variant is to locate the k
hash values associated with each key in one or two blocks having the same size as the processor's
memory cache blocks (usually 64 bytes). This will presumably improve performance by
reducing the number of potential memory cache misses. The proposed variants have, however, the
drawback of using about 32% more space than classic Bloom filters.
The space-efficient variant relies on using a single hash function that generates, for each key, a
value in the range [0, n/ε], where ε is the requested false positive rate. The sequence of values
is then sorted and compressed using Golomb coding (or some other compression technique) to
occupy a space close to n log2(1/ε) bits. To query the Bloom filter for a given key, it
suffices to check whether its corresponding value is stored in the Bloom filter. Decompressing the
whole Bloom filter for each query would make this variant totally unusable. To overcome this
problem, the sequence of values is divided into small blocks of equal size that are compressed
separately. At query time, only half a block needs to be decompressed on average. Because of
decompression overhead, this variant may be slower than classic Bloom filters, but this may be
compensated by the fact that only a single hash function needs to be computed.
Another alternative to the classic Bloom filter is one based on space-efficient variants of cuckoo
hashing. In this case, once the hash table is constructed, the keys stored in the hash table are
replaced with short signatures of the keys. Those signatures are strings of bits computed using a
hash function applied to the keys.
Extensions and applications
Counting filters
Counting filters provide a way to implement a delete operation on a Bloom filter without
recreating the filter afresh. In a counting filter the array positions (buckets) are extended from
being a single bit to being an n-bit counter. In fact, regular Bloom filters can be considered as
counting filters with a bucket size of one bit. Counting filters were introduced by Fan et al.
(1998).
The insert operation is extended to increment the value of the buckets, and the lookup operation
checks that each of the required buckets is non-zero. The delete operation then consists of
decrementing the value of each of the respective buckets.
Arithmetic overflow of the buckets is a problem and the buckets should be sufficiently large to
make this case rare. If it does occur then the increment and decrement operations must leave the
bucket set to the maximum possible value in order to retain the properties of a Bloom filter.
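A sketch of a counting filter with saturating 4-bit buckets (the class is ours; the hash functions are passed in by the caller):

```python
class CountingBloomFilter:
    MAX = 15   # 4-bit bucket ceiling

    def __init__(self, m, hash_fns):
        self.counters = [0] * m
        self.hash_fns = hash_fns   # functions mapping an item to a bucket index

    def add(self, item):
        for h in self.hash_fns:
            i = h(item)
            if self.counters[i] < self.MAX:
                self.counters[i] += 1   # saturate instead of overflowing

    def remove(self, item):
        for h in self.hash_fns:
            i = h(item)
            if 0 < self.counters[i] < self.MAX:
                self.counters[i] -= 1   # a saturated bucket must stay at MAX

    def __contains__(self, item):
        return all(self.counters[h(item)] > 0 for h in self.hash_fns)

hf = [lambda x, s=s: hash((s, x)) % 64 for s in range(3)]
cbf = CountingBloomFilter(64, hf)
cbf.add("apple")
assert "apple" in cbf
```

Leaving a saturated bucket at MAX on both increment and decrement preserves the Bloom filter property (no false negatives) at the cost of such buckets never being freed.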
The size of counters is usually 3 or 4 bits. Hence counting Bloom filters use 3 to 4 times more
space than static Bloom filters. In theory, an optimal data structure equivalent to a counting
Bloom filter should not use more space than a static Bloom filter.
Another issue with counting filters is limited scalability. Because the counting Bloom filter table
cannot be expanded, the maximal number of keys to be stored simultaneously in the filter must
be known in advance. Once the designed capacity of the table is exceeded, the false positive rate
will grow rapidly as more keys are inserted.
Bonomi et al. (2006) introduced a data structure based on d-left hashing that is functionally
equivalent but uses approximately half as much space as counting Bloom filters. The scalability
issue does not occur in this data structure. Once the designed capacity is exceeded, the keys
could be reinserted in a new hash table of double size.
The space efficient variant by Putze, Sanders & Singler (2007) could also be used to implement
counting filters by supporting insertions and deletions.
Data synchronization
Bloom filters can be used for approximate data synchronization as in Byers et al. (2004).
Counting Bloom filters can be used to approximate the number of differences between two sets
and this approach is described in Agarwal & Trachtenberg (2006).
Bloomier filters
Chazelle et al. (2004) designed a generalization of Bloom filters that could associate a value with
each element that had been inserted, implementing an associative array. Like Bloom filters, these
structures achieve a small space overhead by accepting a small probability of false positives. In
the case of "Bloomier filters", a false positive is defined as returning a result when the key is not
in the map. The map will never return the wrong value for a key that is in the map.
Compact approximators
Boldi & Vigna (2005) proposed a lattice-based generalization of Bloom filters. A compact
approximator associates to each key an element of a lattice (the standard Bloom filters being
the case of the Boolean two-element lattice). Instead of a bit array, they have an array of lattice
elements. When adding a new association between a key and an element of the lattice, they
compute the maximum of the current contents of the k array locations associated to the key with
the lattice element. When reading the value associated to a key, they compute the minimum of
the values found in the k locations associated to the key. The resulting value approximates the
original value from above.
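A sketch for the lattice of non-negative integers under max (join) and min (meet); the class name and parameters are ours:

```python
class CompactApproximator:
    """Write takes the lattice join (max); read takes the meet (min) over
    the k cells, so the result approximates the stored value from above."""

    def __init__(self, m, hash_fns, bottom=0):
        self.cells = [bottom] * m      # array of lattice elements
        self.hash_fns = hash_fns

    def add(self, key, value):
        for h in self.hash_fns:
            i = h(key)
            self.cells[i] = max(self.cells[i], value)

    def get(self, key):
        return min(self.cells[h(key)] for h in self.hash_fns)

hf = [lambda x, s=s: hash((s, x)) % 128 for s in range(4)]
ca = CompactApproximator(128, hf)
ca.add("a", 5)
assert ca.get("a") >= 5   # never under-approximates
```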
Stable Bloom filters
Deng & Rafiei (2006) proposed Stable Bloom filters as a variant of Bloom filters for streaming
data. The idea is that since there is no way to store the entire history of a stream (which can be
infinite), Stable Bloom filters continuously evict stale information to make room for more recent
elements. Since stale information is evicted, the Stable Bloom filter introduces false negatives,
which do not appear in traditional Bloom filters. The authors show that a tight upper bound of
false positive rates is guaranteed, and the method is superior to standard Bloom filters in terms of
false positive rates and time efficiency when a small space and an acceptable false positive rate
are given.
Scalable Bloom filters
Almeida et al. (2007) proposed a variant of Bloom filters that can adapt dynamically to the
number of elements stored, while assuring a minimum false positive probability. The technique
is based on sequences of standard Bloom filters with increasing capacity and tighter false positive
probabilities, so as to ensure that a maximum false positive probability can be set beforehand,
regardless of the number of elements to be inserted.
Attenuated Bloom filters
An attenuated Bloom filter of depth D can be viewed as an array of D normal Bloom filters. In the
context of service discovery in a network, each node stores regular and attenuated Bloom filters
locally. The regular or local Bloom filter indicates which services are offered by the node itself.
The attenuated filter of level i indicates which services can be found on nodes that are i hops
away from the current node. The i-th value is constructed by taking the union of the local Bloom
filters of nodes i hops away from the node.
Let's take the small network shown in the graph below as an example. Say we are searching for a
service A whose id hashes to bits 0, 1, and 3 (pattern 11010). Let node n1 be the starting point.
First, we check whether service A is offered by n1 by checking its local filter. Since the patterns
don't match, we check the attenuated Bloom filter in order to determine which node should be the
next hop. We see that n2 doesn't offer service A but lies on the path to nodes that do. Hence, we
move to n2 and repeat the same procedure. We quickly find that n3 offers the service, and hence
the destination is located.
By using attenuated Bloom filters consisting of multiple layers, services at more than one hop
distance can be discovered while avoiding saturation of the Bloom filter by attenuating (shifting
out) bits set by sources further away.
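The structure used in the walk above can be sketched as a depth-D stack of plain filters (all names are ours; Python ints stand in for the per-level bit arrays):

```python
class AttenuatedBloomFilter:
    """Level 0 holds the node's own services; level i holds the services
    reachable i hops away."""

    def __init__(self, depth, m, hash_fns):
        self.levels = [0] * depth
        self.m, self.hash_fns = m, hash_fns

    def _mask(self, service):
        bits = 0
        for h in self.hash_fns:
            bits |= 1 << (h(service) % self.m)
        return bits

    def add(self, level, service):
        self.levels[level] |= self._mask(service)

    def nearest_hop(self, service):
        # smallest hop count whose filter matches the service's pattern
        pattern = self._mask(service)
        for hop, level_bits in enumerate(self.levels):
            if level_bits & pattern == pattern:
                return hop
        return None

hf = [lambda x, s=s: hash((s, x)) for s in range(3)]
abf = AttenuatedBloomFilter(3, 64, hf)
abf.add(0, "printing")
assert abf.nearest_hop("printing") == 0
```

A node performing discovery would forward the request toward the neighbor whose attenuated filter reports the smallest matching hop count, as in the n1/n2/n3 example.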
HASH TABLE
A small phone book as a hash table
In computing, a hash table (also hash map) is a data structure used to implement an associative
array, a structure that can map keys to values. A hash table uses a hash function to compute
an index into an array of buckets or slots, from which the correct value can be found.

Ideally, the hash function should assign each possible key to a unique bucket, but this ideal
situation is rarely achievable in practice (unless the hash keys are fixed; i.e. new entries are never
added to the table after it is created). Instead, most hash table designs assume that hash
collisions (different keys that are assigned by the hash function to the same bucket) will occur
and must be accommodated in some way.
In a well-dimensioned hash table, the average cost (number of instructions) for each lookup is
independent of the number of elements stored in the table. Many hash table designs also allow
arbitrary insertions and deletions of key-value pairs, at (amortized[2]) constant average cost per
operation.[3][4]
In many situations, hash tables turn out to be more efficient than search trees or any
other table lookup structure. For this reason, they are widely used in many kinds of
computer software, particularly for associative arrays, database indexing, caches, and sets.
Hashing
Main article: Hash function
The idea of hashing is to distribute the entries (key/value pairs) across an array of buckets. Given
a key, the algorithm computes an index that suggests where the entry can be found:
index = f(key, array_size)
Often this is done in two steps:
hash = hashfunc(key)
index = hash % array_size
In this method, the hash is independent of the array size, and it is then reduced to an index (a
number between 0 and array_size - 1) using the modulus operator (%).
In the case that the array size is a power of two, the remainder operation is reduced to masking,
which improves speed, but can increase problems with a poor hash function.
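The two reductions can be compared directly; for a power-of-two table the modulus and the mask always agree (a sketch, function names ours):

```python
def index_mod(key, array_size):
    # general case: reduce the hash with the modulus operator
    return hash(key) % array_size

def index_mask(key, array_size):
    # power-of-two case: a bitwise AND replaces the slower division
    assert array_size & (array_size - 1) == 0, "array_size must be a power of two"
    return hash(key) & (array_size - 1)

for key in ("alpha", "beta", 42):
    assert index_mod(key, 1024) == index_mask(key, 1024)
```

Masking simply keeps the low-order bits of the hash, which is why a hash function with poorly distributed low bits performs badly with power-of-two table sizes.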
Choosing a good hash function
A good hash function and implementation algorithm are essential for good hash table
performance, but may be difficult to achieve.
A basic requirement is that the function should provide a uniform distribution of hash values. A
non-uniform distribution increases the number of collisions and the cost of resolving them.
Uniformity is sometimes difficult to ensure by design, but may be evaluated empirically using
statistical tests, e.g. a Pearson's chi-squared test for discrete uniform distributions.[5][6]
The distribution needs to be uniform only for table sizes s that occur in the application. In
particular, if one uses dynamic resizing with exact doubling and halving of s, the hash function
needs to be uniform only when s is a power of two. On the other hand, some hashing algorithms
provide uniform hashes only when s is a prime number.[7]
For open addressing schemes, the hash function should also avoid clustering, the mapping of two
or more keys to consecutive slots. Such clustering may cause the lookup cost to skyrocket, even
if the load factor is low and collisions are infrequent. The popular multiplicative hash[3]
is
claimed to have particularly poor clustering behavior.[7]
Cryptographic hash functions are believed to provide good hash functions for any table sizes,
either by modulo reduction or by bit masking. They may also be appropriate if there is a risk of
malicious users trying to sabotage a network service by submitting requests designed to generate
a large number of collisions in the server's hash tables. However, the risk of sabotage can also be
avoided by cheaper methods (such as applying a secret salt to the data, or using a universal hash
function).
Some authors claim that good hash functions should have the avalanche effect; that is, a single-
bit change in the input key should affect, on average, half the bits in the output. Some popular
hash functions do not have this property.
Perfect hash function
If all keys are known ahead of time, a perfect hash function can be used to create a perfect hash
table that has no collisions. If minimal perfect hashing is used, every location in the hash table
can be used as well.

Perfect hashing allows for constant-time lookups in the worst case. This is in contrast to most
chaining and open addressing methods, where the time for lookup is low on average, but may be
very large (proportional to the number of entries) for some sets of keys.
Key statistics
A critical statistic for a hash table is called the load factor. This is simply the number of entries
divided by the number of buckets, that is, n/k, where n is the number of entries and k is the
number of buckets.
If the load factor is kept reasonable, the hash table should perform well, provided the hashing is
good. If the load factor grows too large, the hash table will become slow, or it may fail to work
(depending on the method used). The expected constant-time property of a hash table assumes
that the load factor is kept below some bound. For a fixed number of buckets, the time for a
lookup grows with the number of entries and so does not achieve the desired constant time.
Second to that, one can examine the variance of the number of entries per bucket. For example,
suppose two tables both have 1000 entries and 1000 buckets; one has exactly one entry in each
bucket, while the other has all entries in the same bucket. Clearly the hashing is not working in
the second one.
A low load factor is not especially beneficial. As the load factor approaches 0, the proportion of
unused areas in the hash table increases, but there is not necessarily any reduction in search cost.
This results in wasted memory.
Collision resolution
Hash collisions are practically unavoidable when hashing a random subset of a large set of
possible keys. For example, if 2,500 keys are hashed into a million buckets, even with a perfectly
uniform random distribution, according to the birthday problem there is a 95% chance of at least
two of the keys being hashed to the same slot.
Therefore, most hash table implementations have some collision resolution strategy to handle
such events. Some common strategies are described below. All these methods require that the
keys (or pointers to them) be stored in the table, together with the associated values.
Separate chaining
Hash collision resolved by separate chaining.
In the method known as separate chaining, each bucket is independent, and has some sort of list of entries with the same index. The time for hash table operations is the time to find the bucket (which is constant) plus the time for the list operation. (The technique is also called open hashing or closed addressing.)
In a good hash table, each bucket has zero or one entries, and sometimes two or three, but rarely
more than that. Therefore, structures that are efficient in time and space for these cases are
preferred. Structures that are efficient for a fairly large number of entries are not needed or
desirable. If these cases happen often, the hashing is not working well, and this needs to be fixed.
Separate chaining with linked lists
Chained hash tables with linked lists are popular because they require only basic data structures
with simple algorithms, and can use simple hash functions that are unsuitable for other methods.
The cost of a table operation is that of scanning the entries of the selected bucket for the desired
key. If the distribution of keys is sufficiently uniform, the average cost of a lookup depends only
on the average number of keys per bucket, that is, on the load factor.
Chained hash tables remain effective even when the number of table entries n is much higher
than the number of slots. Their performance degrades more gracefully (linearly) with the load
factor. For example, a chained hash table with 1000 slots and 10,000 stored keys (load factor 10)
is five to ten times slower than a 10,000-slot table (load factor 1); but still 1000 times faster than
a plain sequential list, and possibly even faster than a balanced search tree.
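A minimal separate-chaining table makes the cost model above concrete. In this Python sketch (the class and method names are ours, and Python's built-in hash stands in for the table's hash function), each bucket is a plain list of key/value pairs:

```python
class ChainedHashTable:
    """Minimal separate-chaining hash table; illustrative only."""

    def __init__(self, slots=8):
        self.buckets = [[] for _ in range(slots)]

    def _bucket(self, key):
        # Constant-time bucket selection
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:                 # key already present: overwrite
                bucket[i] = (key, value)
                return
        bucket.append((key, value))      # collision or empty: extend chain

    def get(self, key, default=None):
        for k, v in self._bucket(key):   # scan only the selected chain
            if k == key:
                return v
        return default
```

Note that the table keeps working at any load factor; only the chain scans get longer.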
For separate-chaining, the worst-case scenario is when all entries are inserted into the same
bucket, in which case the hash table is ineffective and the cost is that of searching the bucket data
structure. If the latter is a linear list, the lookup procedure may have to scan all its entries, so the
worst-case cost is proportional to the number n of entries in the table.
The bucket chains are often implemented as ordered lists, sorted by the key field; this choice approximately halves the average cost of unsuccessful lookups, compared to an unordered list. However, if some keys are much more likely to come up than others, an
unordered list with move-to-front heuristic may be more effective. More sophisticated data
structures, such as balanced search trees, are worth considering only if the load factor is large
(about 10 or more), or if the hash distribution is likely to be very non-uniform, or if one must
guarantee good performance even in a worst-case scenario. However, using a larger table and/or
a better hash function may be even more effective in those cases.
Chained hash tables also inherit the disadvantages of linked lists. When storing small keys and values, the space overhead of the next pointer in each entry record can be significant. An additional disadvantage is that traversing a linked list has poor cache performance, making the processor cache ineffective.
Separate chaining with list heads
Hash collision by separate chaining with head records in the bucket array.
Some chaining implementations store the first record of each chain in the slot array itself.[4] The
number of pointer traversals is decreased by one for most cases. The purpose is to increase cache
efficiency of hash table access.
The disadvantage is that an empty bucket takes the same space as a bucket with one entry. To
save memory space, such hash tables often have about as many slots as stored entries, meaning
that many slots have two or more entries.
Separate chaining with other structures
Instead of a list, one can use any other data structure that supports the required operations. For
example, by using a self-balancing tree, the theoretical worst-case time of common hash table
operations (insertion, deletion, lookup) can be brought down to O(log n) rather than O(n).
However, this approach is only worth the trouble and extra memory cost if long delays must be
avoided at all costs (e.g. in a real-time application), or if one must guard against many entries
hashed to the same slot (e.g. if one expects extremely non-uniform distributions, or in the case of
web sites or other publicly accessible services, which are vulnerable to malicious key
distributions in requests).
The variant called array hash table uses a dynamic array to store all the entries that hash to the
same slot. Each newly inserted entry gets appended to the end of the dynamic array that is
assigned to the slot. The dynamic array is resized in an exact-fit manner, meaning it is grown only by as many bytes as needed. Alternative techniques such as growing the array by block sizes or pages were found to improve insertion performance, but at a cost in space. This variation makes more efficient use of CPU caching and the translation lookaside buffer (TLB), because
slot entries are stored in sequential memory positions. It also dispenses with the next pointers
that are required by linked lists, which saves space. Despite frequent array resizing, space
overheads incurred by the operating system, such as memory fragmentation, were found to be small.
An elaboration on this approach is the so-called dynamic perfect hashing,[11] where a bucket that contains k entries is organized as a perfect hash table with k² slots. While it uses more memory (n² slots for n entries in the worst case, and n·k slots in the average case), this variant has guaranteed constant worst-case lookup time, and low amortized time for insertion.
Open addressing
Hash collision resolved by open addressing with linear probing (interval=1). Note that "Ted
Baker" has a unique hash, but nevertheless collided with "Sandra Dee", that had previously
collided with "John Smith".
In another strategy, called open addressing, all entry records are stored in the bucket array itself.
When a new entry has to be inserted, the buckets are examined, starting with the hashed-to slot
and proceeding in some probe sequence, until an unoccupied slot is found. When searching for
an entry, the buckets are scanned in the same sequence, until either the target record is found, or
an unused array slot is found, which indicates that there is no such key in the table.[12]
The name
"open addressing" refers to the fact that the location ("address") of the item is not determined by
its hash value. (This method is also called closed hashing; it should not be confused with "open
hashing" or "closed addressing" that usually mean separate chaining.)
Well-known probe sequences include:
Linear probing, in which the interval between probes is fixed (usually 1)
Quadratic probing, in which the interval between probes is increased by adding the successive outputs of a quadratic polynomial to the starting value given by the original hash computation
Double hashing, in which the interval between probes is computed by another hash function
A drawback of all these open addressing schemes is that the number of stored entries cannot
exceed the number of slots in the bucket array. In fact, even with good hash functions, their
performance dramatically degrades when the load factor grows beyond 0.7 or so. Thus a more
aggressive resize scheme is needed. Separate chaining works correctly with any load factor,
although performance is likely to be reasonable if it is kept below 2 or so. For many applications,
these restrictions mandate the use of dynamic resizing, with its attendant costs.
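The insert-and-search procedure described above can be sketched for linear probing (interval 1). These Python helpers are illustrative, not a full implementation; deletion and resizing are omitted, and the function names are ours:

```python
def linear_probe_insert(table, key, value):
    """Insert into an open-addressing table (a fixed-size list of slots,
    None meaning empty) by probing forward from the hashed-to slot.
    Returns the slot used; raises if the table is full."""
    n = len(table)
    idx = hash(key) % n
    for step in range(n):
        slot = (idx + step) % n
        if table[slot] is None or table[slot][0] == key:
            table[slot] = (key, value)
            return slot
    raise RuntimeError("table is full; a resize would be needed")

def linear_probe_lookup(table, key):
    """Scan the same probe sequence; an empty slot proves absence."""
    n = len(table)
    idx = hash(key) % n
    for step in range(n):
        slot = (idx + step) % n
        if table[slot] is None:
            return None
        if table[slot][0] == key:
            return table[slot][1]
    return None
```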
Open addressing schemes also put more stringent requirements on the hash function: besides
distributing the keys more uniformly over the buckets, the function must also minimize the
clustering of hash values that are consecutive in the probe order. Using separate chaining, the
only concern is that too many objects map to the same hash value; whether they are adjacent or
nearby is completely irrelevant.
Open addressing only saves memory if the entries are small (less than four times the size of a
pointer) and the load factor is not too small. If the load factor is close to zero (that is, there are
far more buckets than stored entries), open addressing is wasteful even if each entry is just two
words.
This graph compares the average number of cache misses required to look up elements in tables
with chaining and linear probing. As the table passes the 80%-full mark, linear probing's
performance drastically degrades.
Open addressing avoids the time overhead of allocating each new entry record, and can be
implemented even in the absence of a memory allocator. It also avoids the extra indirection
required to access the first entry of each bucket (that is, usually the only one). It also has
better locality of reference, particularly with linear probing. With small record sizes, these
factors can yield better performance than chaining, particularly for lookups.
Hash tables with open addressing are also easier to serialize, because they do not use pointers.
On the other hand, normal open addressing is a poor choice for large elements, because these
elements fill entire CPU cache lines (negating the cache advantage), and a large amount of space
is wasted on large empty table slots. If the open addressing table only stores references to
elements (external storage), it uses space comparable to chaining even for large records but loses
its speed advantage.
Generally speaking, open addressing is better used for hash tables with small records that can be
stored within the table (internal storage) and fit in a cache line. They are particularly suitable for
elements of one word or less. If the table is expected to have a high load factor, the records are
large, or the data is variable-sized, chained hash tables often perform as well or better.
Ultimately, used sensibly, any kind of hash table algorithm is usually fast enough; and the
percentage of a calculation spent in hash table code is low. Memory usage is rarely considered
excessive. Therefore, in most cases the differences between these algorithms are marginal, and
other considerations typically come into play.
Coalesced hashing
A hybrid of chaining and open addressing, coalesced hashing links together chains of nodes
within the table itself.[12]
Like open addressing, it achieves space usage and (somewhat
diminished) cache advantages over chaining. Like chaining, it does not exhibit clustering effects;
in fact, the table can be efficiently filled to a high density. Unlike chaining, it cannot have more
elements than table slots.
Cuckoo hashing
Another alternative open-addressing solution is cuckoo hashing, which ensures constant lookup
time in the worst case, and constant amortized time for insertions and deletions. It uses two or
more hash functions, which means any key/value pair could be in two or more locations. For
lookup, the first hash function is used; if the key/value is not found, then the second hash
function is used, and so on. If a collision happens during insertion, then the key is re-hashed with
the second hash function to map it to another bucket. If all hash functions are used and there is
still a collision, then the key it collided with is removed to make space for the new key, and the
old key is re-hashed with one of the other hash functions, which maps it to another bucket. If that
location also results in a collision, then the process repeats until there is no collision or the
process traverses all the buckets, at which point the table is resized. By combining multiple hash
functions with multiple cells per bucket, very high space utilisation can be achieved.
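The eviction loop described above can be sketched for the classic two-table, two-function case. In this Python sketch, the function names and the max_kicks cutoff are ours; a production version would resize and rehash on failure rather than raise:

```python
def cuckoo_insert(t1, t2, h1, h2, key, value, max_kicks=50):
    """Insert into two-table cuckoo hashing. t1/t2 are lists of slots
    (None = empty); h1/h2 are the two hash functions."""
    entry = (key, value)
    for _ in range(max_kicks):
        slot = h1(entry[0]) % len(t1)
        entry, t1[slot] = t1[slot], entry   # place in table 1, evict resident
        if entry is None:
            return
        slot = h2(entry[0]) % len(t2)
        entry, t2[slot] = t2[slot], entry   # place evicted key in table 2
        if entry is None:
            return
    raise RuntimeError("cycle detected; table needs resizing/rehashing")

def cuckoo_lookup(t1, t2, h1, h2, key):
    """A key can only live in one of its two candidate slots."""
    e1 = t1[h1(key) % len(t1)]
    if e1 is not None and e1[0] == key:
        return e1[1]
    e2 = t2[h2(key) % len(t2)]
    if e2 is not None and e2[0] == key:
        return e2[1]
    return None
```

Lookup inspects at most two slots, which is what gives cuckoo hashing its constant worst-case lookup time.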
Robin Hood hashing
One interesting variation on double-hashing collision resolution is Robin Hood hashing.[13]
The
idea is that a new key may displace a key already inserted, if its probe count is larger than that of
the key at the current position. The net effect of this is that it reduces worst case search times in
the table. This is similar to Knuth's ordered hash tables except that the criterion for bumping a
key does not depend on a direct relationship between the keys. Since both the worst case and the
variation in the number of probes is reduced dramatically, an interesting variation is to probe the
table starting at the expected successful probe value and then expand from that position in both
directions.[14]
External Robin Hood hashing is an extension of this algorithm, where the table is stored in an external file and each table position corresponds to a fixed-sized page or bucket with B records.[15]
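The displacement rule above (a new key steals the slot of any resident with a shorter probe distance) can be sketched over a linear-probing table. The slot layout and function name are ours; lookup and deletion are omitted:

```python
def robin_hood_insert(table, key, value):
    """Robin Hood insertion over linear probing. Slots are None or
    (key, value, distance), where distance is the probe count from
    the key's home slot. Illustrative sketch only."""
    n = len(table)
    entry = (key, value, 0)
    idx = hash(key) % n
    for _ in range(n):
        resident = table[idx]
        if resident is None:
            table[idx] = entry
            return
        if resident[2] < entry[2]:       # resident is "richer": displace it
            table[idx], entry = entry, resident
        # whoever is still homeless probes one slot further
        entry = (entry[0], entry[1], entry[2] + 1)
        idx = (idx + 1) % n
    raise RuntimeError("table full")
```

The effect is that probe distances are evened out across keys, which is what reduces the worst-case search length.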
2-choice hashing
2-choice hashing employs 2 different hash functions, h1(x) and h2(x), for the hash table. Both
hash functions are used to compute two table locations. When an object is inserted in the table,
then it is placed in the table location that contains fewer objects (with the default being the h1(x)
table location if there is equality in bucket size). 2-choice hashing employs the principle of
the power of two choices.
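The rule above amounts to one extra hash and one length comparison per insert. A sketch over chained buckets (the function name is ours, for illustration):

```python
def two_choice_insert(buckets, h1, h2, key, value):
    """2-choice hashing: compute both candidate buckets and append the
    entry to the less-loaded one, defaulting to the h1(x) bucket on a
    tie. Buckets are lists, as in separate chaining."""
    b1 = buckets[h1(key) % len(buckets)]
    b2 = buckets[h2(key) % len(buckets)]
    target = b1 if len(b1) <= len(b2) else b2
    target.append((key, value))
```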
Hopscotch hashing
Another alternative open-addressing solution is hopscotch hashing,[16]
which combines the
approaches of cuckoo hashing and linear probing, yet seems in general to avoid their limitations.
In particular it works well even when the load factor grows beyond 0.9. The algorithm is well
suited for implementing a resizable concurrent hash table.
The hopscotch hashing algorithm works by defining a neighborhood of buckets near the original
hashed bucket, where a given entry is always found. Thus, search is limited to the number of
entries in this neighborhood, which is logarithmic in the worst case, constant on average, and
with proper alignment of the neighborhood typically requires one cache miss. When inserting an
entry, one first attempts to add it to a bucket in the neighborhood. However, if all buckets in this
neighborhood are occupied, the algorithm traverses buckets in sequence until an open slot (an
unoccupied bucket) is found (as in linear probing). At that point, since the empty bucket is
outside the neighborhood, items are repeatedly displaced in a sequence of hops. (This is similar
to cuckoo hashing, but with the difference that in this case the empty slot is being moved into the
neighborhood, instead of items being moved out with the hope of eventually finding an empty
slot.) Each hop brings the open slot closer to the original neighborhood, without invalidating the