Benchmark of Major Hash Maps Implementations
Tessil
This benchmark compares different C++ implementations of hash maps. The main contestants are
tsl::hopscotch_map (hopscotch hashing, v1.4), tsl::robin_map (linear robin hood probing, v0.1),
tsl::sparse_map (sparse quadratic probing, v0.1), std::unordered_map (chaining, libstdc++
implementation, v3.4), google::dense_hash_map (quadratic probing, v2.0) and QHash (chaining, v4.8).
We will see how they perform across a wide range of operations, in terms of both speed and memory usage.
If you just want to know which hash map you should choose, you can skip to the last section, which offers
some recommendations depending on your use case.
For the benchmark we will use the http://incise.org/hash-table-benchmarks.html benchmark, but with a few
modifications to fix some of its shortcomings.
The glib, Python and Ruby hash maps were removed and other C++ hash maps were added.
We now use std::string as key instead of const char * for the strings tests.
Multiple tests were added (reads misses, reads after deletes, iteration, …).
We use std::hash<T> as hash function for all hash maps for a fair comparison.
Compiled with the -O3 -march=native -DNDEBUG flags ( -march=native includes the -mpopcnt
flag on the CPU used for the benchmark, which is important for some hash map implementations).
Even though they are not on this page, to avoid cluttering the charts, other hash maps were
tested along with different max load factors (which is important to take into account when comparing two
hash maps): ska::flat_hash_map (linear robin hood probing), spp::sparse_hash_map (sparse
quadratic probing), tsl::ordered_map (linear robin hood probing with keys-values outside the bucket
array, v0.4), boost::unordered_map (chaining, v1.62), google::sparse_hash_map (sparse quadratic
probing, v2.0), emilib::HashMap (linear probing) and tsl::array_map (array hash table, specialized
for strings, v0.3).
You can find all these additional tests here (warning, the page is a bit heavy).
Note that even though the benchmark uses C++ implementations, it is also useful for comparing
different collision resolution strategies in hash maps (though there may be some variations due to the
quality of the implementations).
The code of the benchmark can be found on GitHub and the raw results of the charts can be found here.
The benchmark was compiled with Clang 5.0 and run on Linux 4.11 x64 with an Intel i5-5200u and 8 GB of
RAM. The best of five runs was taken.
Benchmark
Integers
For the integers tests, we use hash maps with int64_t as key and int64_t as value. The
std::hash<int64_t> of Clang with libstdc++ used by the benchmark is an identity function (the hash of
the integer 42 is 42).
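Whether the hash is an identity function depends on the standard library in use, since the C++ standard does not require it. A minimal sketch of how one could check it on a given toolchain follows; the helper name hashes_to_itself is hypothetical and not part of the benchmark.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>

// Returns true when std::hash<std::int64_t> maps `value` to itself.
// The article reports this behaviour for Clang with libstdc++; other
// standard libraries are free to hash integers differently.
bool hashes_to_itself(std::int64_t value) {
    return std::hash<std::int64_t>{}(value) == static_cast<std::size_t>(value);
}
```

Calling hashes_to_itself(42) on the benchmark setup should return true according to the description above; on another standard library the result may differ.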
Random shuffle inserts: before the test, we generate a vector with the values [0, nb_entries) and shuffle
this vector. Then for each value k in the vector, we insert the key-value pair (k, 1) in the hash map.
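The setup described above can be sketched as follows. This is not the benchmark's actual code: the function names, the seed parameter and the choice of std::mt19937_64 are assumptions made for illustration, and timing code is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <random>
#include <unordered_map>
#include <vector>

// Builds the keys 0, 1, ..., nb_entries - 1 and shuffles them.
// The seed parameter is an assumption made to keep runs reproducible.
std::vector<std::int64_t> make_shuffled_keys(std::size_t nb_entries, std::uint64_t seed) {
    std::vector<std::int64_t> keys(nb_entries);
    std::iota(keys.begin(), keys.end(), std::int64_t{0});
    std::mt19937_64 rng(seed);
    std::shuffle(keys.begin(), keys.end(), rng);
    return keys;
}

// Inserts the pair (k, 1) for every key k, in vector order.
// Map is any hash map with an std::unordered_map-like insert;
// the map is expected to be empty and the keys distinct.
template <class Map>
void insert_all(Map& map, const std::vector<std::int64_t>& keys) {
    for (const std::int64_t k : keys) {
        map.insert({k, std::int64_t{1}});
    }
    // Distinct keys in an empty map: every insert must have added an entry.
    assert(map.size() == keys.size());
}
```

With std::unordered_map<std::int64_t, std::int64_t> as Map, the timed part of the test would be the call to insert_all; the other benchmarked maps could be substituted as long as they offer a compatible insert.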
[Chart: time in seconds versus number of entries in hash table (500k to 3.00M), for std::unordered_map, google::dense_hash_map, QHash, tsl::sparse_map, tsl::hopscotch_map and tsl::robin_map.]
Random full inserts: before the test, we generate a vector of nb_entries values, each drawn from a
uniform random number generator over all the positive values an int64_t can hold. Then for each
value k in the vector, we insert the key-value pair (k, 1) in the hash map.
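A possible way to generate such keys is sketched below. The function name, the seed parameter and the generator are assumptions; whether 0 counts as a positive value is not stated, so the sketch starts the range at 1.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <random>
#include <vector>

// Draws nb_entries keys uniformly from [1, INT64_MAX].
// Duplicates are possible in principle, although very unlikely
// for a few million draws over such a large range.
std::vector<std::int64_t> make_random_keys(std::size_t nb_entries, std::uint64_t seed) {
    std::mt19937_64 rng(seed);
    std::uniform_int_distribution<std::int64_t> dist(
        1, std::numeric_limits<std::int64_t>::max());
    std::vector<std::int64_t> keys;
    keys.reserve(nb_entries);
    for (std::size_t i = 0; i < nb_entries; ++i) {
        keys.push_back(dist(rng));
    }
    return keys;
}
```

Unlike the shuffled range, these keys are spread over the whole int64_t range, so an identity hash no longer yields consecutive hash values.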
[Chart: time in seconds versus number of entries in hash table (500k to 3.00M), for std::unordered_map, google::dense_hash_map, QHash, tsl::sparse_map, tsl::hopscotch_map and tsl::robin_map.]
Same as the random full inserts test, but the reserve method of the hash map is called beforehand to avoid
any rehash during the insertion. It provides a fair comparison even if the growth factor of each hash map is
different.
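The effect of calling reserve can be observed on std::unordered_map with a small sketch like the one below. The helper name is hypothetical, and bucket_count() is the std::unordered_map interface; the other benchmarked maps may expose this information differently.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Reserves room for all keys, inserts them, and reports whether the
// bucket count stayed the same, i.e. whether no rehash took place.
template <class Map>
bool insert_without_rehash(Map& map, const std::vector<std::int64_t>& keys) {
    map.reserve(keys.size());
    const std::size_t buckets_before = map.bucket_count();
    for (const std::int64_t k : keys) {
        map.insert({k, std::int64_t{1}});
    }
    return map.bucket_count() == buckets_before;
}
```

Since no rehash happens during the loop, the measured time reflects the cost of the insertions themselves rather than the growth policy of each map.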
[Chart: time in seconds versus number of entries in hash table (500k to 3.00M), for std::unordered_map, google::dense_hash_map, QHash, tsl::sparse_map, tsl::hopscotch_map and tsl::robin_map.]
Before the test, we insert nb_entries elements in the same way as in the random full inserts test. We then
delete each key one by one, in a random order different from the insertion order.
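The deletion step could look like the sketch below. The function name and the seed parameter are assumptions; the keys are taken by value so that the caller's insertion order is left untouched.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <random>
#include <unordered_map>
#include <vector>

// Erases every key from the map in a freshly shuffled order and
// returns how many entries were actually removed.
template <class Map>
std::size_t erase_in_random_order(Map& map, std::vector<std::int64_t> keys,
                                  std::uint64_t seed) {
    std::mt19937_64 rng(seed);
    std::shuffle(keys.begin(), keys.end(), rng);  // works on a copy of the keys
    std::size_t erased = 0;
    for (const std::int64_t k : keys) {
        erased += map.erase(k);
    }
    return erased;
}
```

Erasing in a shuffled order prevents the deletions from following the memory layout produced by the insertions.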
[Chart: time in seconds versus number of entries in hash table (500k to 3.00M), for std::unordered_map, google::dense_hash_map, QHash, tsl::sparse_map, tsl::hopscotch_map and tsl::robin_map.]
Before the test, we insert nb_entries elements in the same way as in the random shuffle inserts test. We
then read each key-value pair in a random order different from the insertion order.
[Chart: time in seconds versus number of entries in hash table (500k to 3.00M), for std::unordered_map, google::dense_hash_map, QHash, tsl::sparse_map, tsl::hopscotch_map and tsl::robin_map.]
Before the test, we insert nb_entries elements in the same way as in the random full inserts test. We then
read each key-value pair in a random order different from the insertion order.
[Chart: time in seconds versus number of entries in hash table (500k to 3.00M), for std::unordered_map, google::dense_hash_map, QHash, tsl::sparse_map, tsl::hopscotch_map and tsl::robin_map.]
Before the test, we insert nb_entries elements in the same way as in the random full inserts test. We then
generate another vector of nb_entries random elements different from the inserted elements and we try to
search for these unknown elements in the hash map.
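The article does not say how the unknown elements are produced. One simple option, shown here as an assumption, is rejection sampling: draw random keys and keep only those absent from the map. All names below are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <random>
#include <unordered_map>
#include <vector>

// Generates nb_entries random keys, none of which is present in `map`.
template <class Map>
std::vector<std::int64_t> make_missing_keys(const Map& map, std::size_t nb_entries,
                                            std::uint64_t seed) {
    std::mt19937_64 rng(seed);
    std::uniform_int_distribution<std::int64_t> dist(
        1, std::numeric_limits<std::int64_t>::max());
    std::vector<std::int64_t> missing;
    missing.reserve(nb_entries);
    while (missing.size() < nb_entries) {
        const std::int64_t k = dist(rng);
        if (map.find(k) == map.end()) {  // reject keys that are already stored
            missing.push_back(k);
        }
    }
    return missing;
}

// Looks up every key and returns the number of hits.
template <class Map>
std::size_t count_hits(const Map& map, const std::vector<std::int64_t>& keys) {
    std::size_t hits = 0;
    for (const std::int64_t k : keys) {
        if (map.find(k) != map.end()) {
            ++hits;
        }
    }
    return hits;
}
```

In the misses test, the timed part would be count_hits over the missing keys, which is expected to return 0.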
[Chart: time in seconds versus number of entries in hash table (500k to 3.00M), for std::unordered_map, google::dense_hash_map, QHash, tsl::sparse_map, tsl::hopscotch_map and tsl::robin_map.]
Before the test, we insert nb_entries elements in the same way as in the random full inserts test, then
delete half of these values randomly. We then try to read all the original values in a different order, which
will lead to 50% hits and 50% misses.
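A minimal sketch of this scenario, assuming distinct keys and an std::unordered_map-like interface, could be written as follows. The function name and the seed parameter are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <random>
#include <unordered_map>
#include <vector>

// Erases a random half of the keys, then looks up every original key
// in another random order. Returns the number of successful lookups.
template <class Map>
std::size_t read_after_deleting_half(Map& map, std::vector<std::int64_t> keys,
                                     std::uint64_t seed) {
    std::mt19937_64 rng(seed);
    std::shuffle(keys.begin(), keys.end(), rng);
    const std::size_t half = keys.size() / 2;
    for (std::size_t i = 0; i < half; ++i) {
        map.erase(keys[i]);
    }
    std::shuffle(keys.begin(), keys.end(), rng);  // new order for the reads
    std::size_t hits = 0;
    for (const std::int64_t k : keys) {
        if (map.find(k) != map.end()) {
            ++hits;
        }
    }
    return hits;
}
```

With an even number of distinct keys, exactly half of the lookups hit. For maps that mark deleted slots with tombstones, the misses may also have to probe past those slots.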
[Chart: time in seconds versus number of entries in hash table (500k to 3.00M), for std::unordered_map, google::dense_hash_map, QHash, tsl::sparse_map, tsl::hopscotch_map and tsl::robin_map.]
Before the test, we insert nb_entries elements in the same way as in the random full inserts test. We then
use the hash map iterators to read all the key-value pairs.
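The iteration step can be sketched as below for maps whose iterators expose first and second like std::unordered_map; that interface and the function name are assumptions, and some maps (QHash for example) use a different iterator API.

```cpp
#include <cstdint>
#include <unordered_map>

// Walks the whole map through its iterators and sums the values, so
// that the traversal has an observable result.
template <class Map>
std::int64_t sum_values(const Map& map) {
    std::int64_t sum = 0;
    for (auto it = map.begin(); it != map.end(); ++it) {
        sum += it->second;
    }
    return sum;
}
```

Since every value inserted by the benchmark is 1, the sum should equal the number of entries in the map.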
[Chart: time in seconds versus number of entries in hash table (500k to 3.00M), for std::unordered_map, google::dense_hash_map, QHash, tsl::sparse_map, tsl::hopscotch_map and tsl::robin_map.]
Before the random full inserts benchmark finishes, we measure the memory that the hash map is using.
[Chart: memory usage in MiB versus number of entries in hash table (500k to 3.00M), for std::unordered_map, google::dense_hash_map, QHash, tsl::sparse_map, tsl::hopscotch_map and tsl::robin_map.]
Small strings
For the small string tests, we use hash maps with std::string as key and int64_t as value.
Each string is a randomly generated string of 15 alphanumeric characters (+1 for the null terminator). A
generated key may look like “ju1AOoeWT3LdJxL”. The generated string doesn’t need any extra heap
allocation as Clang 5.0 (with libstdc++) will use the small string optimization for any string of 16
characters or fewer. This allows hash maps using open addressing to potentially avoid cache-misses on
string comparisons.
For each entry in the range [0, nb_entries), we generate a string as key and insert it with the value 1.
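Generating such keys could be done along these lines. This is only a sketch: the function name, the alphabet order and the generator are assumptions, not the benchmark's own code.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <string>

// Generates a random string of `length` alphanumeric characters.
std::string make_random_string(std::size_t length, std::mt19937_64& rng) {
    static const char alphabet[] =
        "0123456789"
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        "abcdefghijklmnopqrstuvwxyz";
    // sizeof counts the trailing '\0', hence "- 2" for the last valid index.
    std::uniform_int_distribution<std::size_t> pick(0, sizeof(alphabet) - 2);
    std::string result;
    result.reserve(length);
    for (std::size_t i = 0; i < length; ++i) {
        result.push_back(alphabet[pick(rng)]);
    }
    return result;
}
```

Calling make_random_string(15, rng) gives keys of the small strings size, and the same helper with a length of 50 would produce keys too long for the small string optimization.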
[Chart: time in seconds versus number of entries in hash table (500k to 3.00M), for std::unordered_map, google::dense_hash_map, QHash, tsl::sparse_map, tsl::hopscotch_map and tsl::robin_map, the last two both with and without StoreHash.]
Same as the inserts test, but the reserve method of the hash map is called beforehand to avoid any rehash
during the insertion. It provides a fair comparison even if the growth factor of each hash map is different.
[Chart: time in seconds versus number of entries in hash table (500k to 3.00M), for std::unordered_map, google::dense_hash_map, QHash, tsl::sparse_map, tsl::hopscotch_map and tsl::robin_map, the last two both with and without StoreHash.]
Before the test, we insert nb_entries elements in the hash map as in the inserts test. We then delete each
key one by one, in a random order different from the insertion order.
[Chart: time in seconds versus number of entries in hash table (500k to 3.00M), for std::unordered_map, google::dense_hash_map, QHash, tsl::sparse_map, tsl::hopscotch_map and tsl::robin_map, the last two both with and without StoreHash.]
Before the test, we insert nb_entries elements in the hash map as in the inserts test. We then read each
key-value pair in a random order different from the insertion order.
[Chart: time in seconds versus number of entries in hash table (500k to 3.00M), for std::unordered_map, google::dense_hash_map, QHash, tsl::sparse_map, tsl::hopscotch_map and tsl::robin_map, the last two both with and without StoreHash.]
Before the test, we insert nb_entries elements in the same way as in the inserts test. We then generate
nb_entries strings different from the inserted elements and we try to search for these unknown elements in
the hash map.
[Chart: time in seconds versus number of entries in hash table (500k to 3.00M), for std::unordered_map, google::dense_hash_map, QHash, tsl::sparse_map, tsl::hopscotch_map and tsl::robin_map, the last two both with and without StoreHash.]
Before the test, we insert nb_entries elements in the same way as in the inserts test, then delete half of
these values randomly. We then try to read all the original values in a different order, which will lead to 50%
hits and 50% misses.
[Chart: time in seconds versus number of entries in hash table (500k to 3.00M), for std::unordered_map, google::dense_hash_map, QHash, tsl::sparse_map, tsl::hopscotch_map and tsl::robin_map, the last two both with and without StoreHash.]
Before the inserts benchmark finishes, we measure the memory that the hash map is using.
[Chart: memory usage in MiB versus number of entries in hash table (500k to 3.00M), for std::unordered_map, google::dense_hash_map, QHash, tsl::sparse_map, tsl::hopscotch_map and tsl::robin_map, the last two both with and without StoreHash.]
Strings
For the strings tests, we use hash maps with std::string as key and int64_t as value.
Each string is a randomly generated string of 50 alphanumeric characters (+1 for the null terminator). A
generated key may look like “nv46iTRp7ur6UMbdgEkCHpoq7Qx7UU9Ta0u1ETdAvUb4LG6Xu6”. The
generated string is long enough that Clang can’t use the small string optimization and has to store it in a
heap-allocated area. Each string also has the same length, so that each comparison will go through a trip to
a heap-allocated area (with its potential cache-miss).
The goal of the test is to see how the hash maps behave when comparing keys is slow.
For each entry in the range [0, nb_entries), we generate a string as key and insert it with the value 1.
[Chart: time in seconds versus number of entries in hash table (500k to 3.00M), for std::unordered_map, google::dense_hash_map, QHash, tsl::sparse_map, tsl::hopscotch_map and tsl::robin_map, the last two both with and without StoreHash.]
Same as the inserts test, but the reserve method of the hash map is called beforehand to avoid any rehash
during the insertion. It provides a fair comparison even if the growth factor of each hash map is different.
[Chart: time in seconds versus number of entries in hash table (500k to 3.00M), for std::unordered_map, google::dense_hash_map, QHash, tsl::sparse_map, tsl::hopscotch_map and tsl::robin_map, the last two both with and without StoreHash.]
Before the test, we insert nb_entries elements in the hash map as in the inserts test. We then delete each
key one by one, in a random order different from the insertion order.
[Chart: time in seconds versus number of entries in hash table (500k to 3.00M), for std::unordered_map, google::dense_hash_map, QHash, tsl::sparse_map, tsl::hopscotch_map and tsl::robin_map, the last two both with and without StoreHash.]
Before the test, we insert nb_entries elements in the hash map as in the inserts test. We then read each
key-value pair in a random order different from the insertion order.
[Chart: time in seconds versus number of entries in hash table (500k to 3.00M), for std::unordered_map, google::dense_hash_map, QHash, tsl::sparse_map, tsl::hopscotch_map and tsl::robin_map, the last two both with and without StoreHash.]
Before the test, we insert nb_entries elements in the same way as in the inserts test. We then generate
nb_entries strings different from the inserted elements and we try to search for these unknown elements in
the hash map.
[Chart: time in seconds versus number of entries in hash table (500k to 3.00M), for std::unordered_map, google::dense_hash_map, QHash, tsl::sparse_map, tsl::hopscotch_map and tsl::robin_map, the last two both with and without StoreHash.]
Before the test, we insert nb_entries elements in the same way as in the inserts test, then delete half of
these values randomly. We then try to read all the original values in a different order, which will lead to 50%
hits and 50% misses.
[Chart: time in seconds versus number of entries in hash table (500k to 3.00M), for std::unordered_map, google::dense_hash_map, QHash, tsl::sparse_map, tsl::hopscotch_map and tsl::robin_map, the last two both with and without StoreHash.]
Before the inserts benchmark finishes, we measure the memory that the hash map is using.
[Chart: memory usage in MiB versus number of entries in hash table (500k to 3.00M), for std::unordered_map, google::dense_hash_map, QHash, tsl::sparse_map, tsl::hopscotch_map and tsl::robin_map, the last two both with and without StoreHash.]
Analysis
We can see that the hash maps using open addressing provide an advantageous alternative to chaining
due to their cache-friendliness. On the integers and small strings read tests, most of them are able to find
the key while loading only one or two cache lines, which makes a significant difference. On insert, they can
also avoid a lot of allocations compared to hash maps using chaining, which have to allocate the memory
for a node at each insert (a custom allocator could improve things).
In the strings tests, we can see that storing the hash alongside the values can offer a huge boost on
insertions, as we don’t have to recalculate the hash on rehashes, and on lookups, as we only compare two
strings when the stored hashes are equal, avoiding expensive comparisons. Note that tsl::robin_map
automatically stores the hash and uses it on rehashes (but not on lookups without an explicit StoreHash )
if it can detect that doing so will not take more memory due to alignment. This explains why the strings
inserts test is so much faster even without the StoreHash parameter.
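The lookup side of this idea can be illustrated with a simplified sketch. It is not how tsl::hopscotch_map or tsl::robin_map lay out their buckets: the Entry struct, the function name and the comparison counter are all hypothetical and only show the principle of checking the stored hash before comparing the strings.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// A bucket entry that keeps the hash next to the key-value pair.
struct Entry {
    std::size_t hash;
    std::string key;
    std::int64_t value;
};

// Scans candidate entries for `key`, whose hash is `key_hash`. The full
// string comparison only runs when the stored hash matches; `comparisons`
// counts how many string comparisons were performed.
const Entry* find_entry(const std::vector<Entry>& candidates, std::size_t key_hash,
                        const std::string& key, std::size_t& comparisons) {
    for (const Entry& entry : candidates) {
        if (entry.hash != key_hash) {
            continue;  // cheap integer check, the string data is never read
        }
        ++comparisons;
        if (entry.key == key) {
            return &entry;
        }
    }
    return nullptr;
}
```

When the keys are long strings stored on the heap, each skipped comparison also avoids a potential cache-miss, which is in line with the gains reported for the strings tests.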
Regarding the load factor, most open addressing schemes get bad results when the load factor is higher
than 0.5, even with robin hood probing (see the additional tests). Only tsl::hopscotch_map is able to
cope well with a high load factor like 0.9 without losing too much in lookup speed, offering a really good
compromise between speed and memory usage.
In the benchmark, we are using the Clang implementation of std::hash as hash function in all our tests.
For integers, this hash is just the identity function; some other hash functions may give better
results on some hash map implementations (notably emilib::HashMap and
google::sparse_hash_map, which have terrible results on the random shuffle integers inserts test). A
more robust hash function could be tested. A poor hash function could be tested as well to check how each
hash map copes with a bad hash distribution.
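Such an experiment mostly amounts to swapping the hash template parameter. The sketch below uses a deliberately poor hash with std::unordered_map; the PoorHash functor and the helper are hypothetical and were not part of the benchmark.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// A deliberately poor hash function: only the low 4 bits of the key
// are used, so at most 16 distinct hash values exist.
struct PoorHash {
    std::size_t operator()(std::int64_t key) const noexcept {
        return static_cast<std::size_t>(key) & 0xF;
    }
};

// The same key and value types as in the integers tests, with the hash swapped.
using PoorlyHashedMap = std::unordered_map<std::int64_t, std::int64_t, PoorHash>;

// Counts the buckets that hold at least one element.
std::size_t used_buckets(const PoorlyHashedMap& map) {
    std::size_t used = 0;
    for (std::size_t b = 0; b < map.bucket_count(); ++b) {
        if (map.bucket_size(b) > 0) {
            ++used;
        }
    }
    return used;
}
```

The map still behaves correctly, but all the entries pile up in a handful of buckets, so lookups degrade toward a linear scan. How gracefully each implementation handles this is what such a test would measure.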
The benchmark was exclusively oriented toward hash maps. Better structures like tries could be used to
map strings to values, but std::string is a familiar example for testing keys bigger than int64_t, and it may
incur a cache-miss on comparison if big enough due to its memory indirection.
In conclusion, even though std::unordered_map is a good implementation, it may be worth checking the
alternatives if you need better performance or if your hash map is using too much memory.
Each hash map has its advantages and drawbacks, so it may be difficult to pick the right one. Here
are some general recommendations depending on your use case.
By default. Before choosing a hash map, just try out std::unordered_map . Even though it is not the
fastest hash map out there due to the cache-unfriendliness of chaining, the standard hash map just works
well in most cases. External libraries are an extra maintenance cost and if you are not doing a whole lot of
operations on the hash map, std::unordered_map will do just fine.
For speed efficiency. A hash map using an open addressing scheme should be your choice and I would
recommend either hopscotch hashing with tsl::hopscotch_map or linear robin hood hashing with
tsl::robin_map or ska::flat_hash_map .
Both have quite similar lookup speed at low load factor, but tsl::hopscotch_map has the main
advantage of being able to cope much better with a high load factor (> 0.6), providing a better compromise
between speed and memory usage.
The main drawback of hopscotch hashing is that it can suffer quite a bit from clustering in the neighborhood of
a bucket, which may cause extensive rehashes. When storing the hash with the StoreHash template
parameter, it also needs to reduce the size of the neighborhood, which may deepen the previous problem.
But this should not be a problem with a good hash function.
On the other hand, tsl::robin_map can store the hash at no extra cost in most cases and will
automatically do so when these cases are detected to speed up the rehash process. As the map only needs
a few bytes in the bucket for bookkeeping, it uses the rest of the space left due to memory alignment to
store part of the hash. The tsl::robin_map also offers a faster insertion speed than
tsl::hopscotch_map and is able to cope better with a poor hash function.
Quadratic probing with google::dense_hash_map may also be a good candidate, but it can’t cope well
with a high load factor, thus needing more memory. It also does quite poorly on reads misses. Linear probing
with emilib::HashMap suffers from the same problems.
So in the end I would recommend trying out both tsl::hopscotch_map and tsl::robin_map to see
which one works best for your use case. As a basic guideline, prefer tsl::hopscotch_map if you
don’t want to use too much memory and tsl::robin_map if speed is what mainly matters.
For memory efficiency. If you are storing small objects (< 32 bytes) with a trivial key comparator,
tsl::sparse_map should be your go-to hash map. Even though it is quite slow on insertions, it offers a
good balance between lookup speed and memory usage, even at low load factor. It is also faster than both
google::sparse_hash_map and spp::sparse_hash_map while providing more functionalities.
When dealing with larger objects with a non-trivial key comparator, tsl::sparse_map will do fine too, but
you may also want to try tsl::ordered_map even if you don’t need the order of insertion to be kept. It
can grow the map quite fast as it never needs to move the keys-values outside of deletions, and it provides
good performance on lookups while keeping a low memory usage. For smaller objects with a trivial key
comparator, it is only as good as std::unordered_map for lookups.
For strings as key. If you are using strings as key, the above recommendations still hold true but you may
also want to try tsl::array_map . It offers one of the best lookup speeds on large strings while having the
lowest memory usage. The main drawback is that the rehash process is slow and will need some spare
memory to copy the strings from the old map to the new map (it can’t use std::move as the other hash
maps using std::string as key do). But if you know the number of items beforehand, you can call the
reserve function to avoid the problem.
If you need an even more compact way to store the strings, you may also consider a trie, notably
tsl::htrie_map , even if you don’t need to do any prefix search. The HAT-trie provides a really memory
efficient way to store the strings without losing too much on lookup speed.
For large objects. When dealing with large objects which take time to copy or move around, using open
addressing is not a good idea. On insertion the values may have to be moved around, either because it is
part of the insertion process (hopscotch hashing, robin hood hashing, cuckoo hashing, …) or due to a
rehash. It is best to stick to std::unordered_map , which can just move pointers to nodes around, or
possibly tsl::ordered_map , which only needs to move one element on deletion.
In the end, this is only some basic advice based on a benchmark using some artificial use cases with a
specific compiler. The best approach is still to pick some candidates and test them with your code in your
environment.