Article
Key Concepts, Weakness and Benchmark on Hash Table
Data Structures
Santiago Tapia-Fernández * , Daniel García-García and Pablo García-Hernandez
Escuela Técnica Superior de Ingeniería Informática, Universidad Politécnica de Madrid (UPM), Calle Ramiro de
Maeztu, 7, 28040 Madrid, Spain; [email protected] (D.G.-G.);
[email protected] (P.G.-H.)
* Correspondence: [email protected]
Abstract: Most computer programs or applications need fast data structures. The performance of a data structure is necessarily influenced by the complexity of its common operations; thus, any data structure that exhibits a theoretical complexity of amortized constant time in several of its main operations should draw a lot of attention. Such is the case of the family of data structures called hash tables. However, what is the real efficiency of these hash tables? That is an interesting question with no simple answer, and there are some issues to be considered. Of course, there is not a unique hash table; in fact, there are several sub-groups of hash tables and, even more, not all programming languages use the same variety of hash table in their default hash table implementation, nor do they offer the same interface. Nevertheless, all hash tables do have a common issue: they have to solve hash collisions; that is a potential weakness and it also induces a classification of hash tables according to the strategy used to solve collisions. In this paper, some key concepts about hash tables are exposed and some definitions about those key concepts are reviewed and clarified, especially in order to study the characteristics of the main strategies to implement hash tables and how they deal with hash collisions. Then, some benchmark cases are designed and presented to assess the performance of hash tables. The cases have been designed to be randomized, to be self-tested, to be representative of real use cases, and to expose and analyze the impact of different factors on performance across different hash tables and programming languages. Then, all cases have been programmed using C++, Java and Python and analyzed in terms of interfaces and efficiency (time and memory). The benchmark yields important results about the performance of these structures and its (lack of) relationship with complexity analysis.
Keywords: data structures; hash table; hash tree; algorithm performance; complexity analysis
focus is on the implementations and their details, not on theoretical designs. We will use the theoretical context to explain why the assessment programs do what they do and to try to explain the performance results, but an exhaustive review of the bibliography is not intended. In summary, our methodological approach to the problem follows a black-box analysis strategy.
The hash tables will be used without seeking optimizations. That means that improvements or optimizations in the use of the hash tables will be avoided even if we know about them. For instance, most implementations could improve their building (insertion) step by providing the expected size, but this is avoided, since the objective is to use the data structures as if no relevant information about the program requirements were known in advance. Moreover, since the goal is to measure time and memory use, program profiling is also avoided, both because it could introduce some overhead (even if minimal) and because the objective is not to improve the programs but just to measure their behavior.
Finally, since there are just a few standard implementations of hash tables in the chosen programming languages, some other hash table algorithms have been selected and implemented to provide more data for the analysis. The selection does not follow any specific criterion, nor does it look for any particular feature, beyond the information needed to implement the algorithms being widely available and clear enough to write an implementation easily and quickly. That is, the best algorithms were not actively sought.
complexity is O(1). Of course, such behavior is a highly desirable characteristic in any context, and thus hash tables became the object of our study.
The alternative solutions for solving qualified collisions induce the classification of
hash table data structures. Most implementations follow one of these two alternative
strategies [9]:
• Open Addressing. When a qualified collision occurs, another place is searched following a probing strategy, for instance: linear probing, double hashing, etc. The important feature is that at most one pair ⟨k, v⟩ is stored in each slot of the array (in each bucket).
• Chaining. When a qualified collision occurs, the pair is stored anyway in that array slot. That means that buckets do not store pairs ⟨k, v⟩ directly, but some other data structure capable of storing several pairs, usually linked lists, but also other data structures such as binary trees.
Even if most implementations can be classified either as open addressing or chaining, some of them have their own particularities; for instance, cuckoo hashing follows an open addressing strategy, but it uses two hash tables instead of one as in the other alternatives [10]. A minimal sketch of both basic strategies is shown below.
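The following sketch is only illustrative and is not taken from any of the benchmarked implementations (fixed capacity, no resizing, no deletion handling); it shows how an insertion differs under the two strategies:

#include <cstddef>
#include <functional>
#include <list>
#include <optional>
#include <utility>
#include <vector>

// Chaining: every bucket holds a small container of <key, value> pairs.
struct ChainedMap {
    std::vector<std::list<std::pair<int, int>>> buckets;
    ChainedMap() : buckets(1024) {}
    void insert(int k, int v) {
        auto& b = buckets[std::hash<int>{}(k) % buckets.size()];
        b.emplace_back(k, v);                    // colliding pairs share the bucket
    }
};

// Open addressing (linear probing): at most one pair per slot; on a qualified
// collision the following slots are probed until a free one is found
// (the sketch assumes the table never becomes full).
struct OpenMap {
    std::vector<std::optional<std::pair<int, int>>> slots;
    OpenMap() : slots(1024) {}
    void insert(int k, int v) {
        std::size_t i = std::hash<int>{}(k) % slots.size();
        while (slots[i].has_value() && slots[i]->first != k)
            i = (i + 1) % slots.size();          // probe the next slot
        slots[i] = {k, v};
    }
};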
1.3. Methodology
The work for this paper was planned as a survey or perspective, not as an actual research project; that is, no novelty was primarily intended. However, since the evaluation of empirical performance is an interesting matter, it was decided to focus the survey on the development of benchmark cases and the empirical evaluation of algorithms instead of focusing on a theoretical comparison or review.
The first target of the study was finding the influence of the range-hashing function. As
the default hash in Java and C++ for integer numbers is the identity function (a perfect hash
function in terms of “good uniformity”), using integers as keys leaves an open window
to look for the influence of the range-hashing function and, eventually, the influence of
qualified collisions in the hash tables.
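For instance, in libstdc++ (the C++ standard library used in this benchmark) the identity behavior can be checked directly; note that the C++ standard does not mandate it, so this is an implementation detail:

#include <cassert>
#include <functional>

int main() {
    // In libstdc++, hashing an integral value returns the value itself,
    // i.e., a perfect hash with no collisions before range reduction.
    // Java behaves likewise: Integer.valueOf(42).hashCode() == 42.
    assert(std::hash<int>{}(42) == 42u);
    assert(std::hash<long>{}(1000000L) == 1000000u);
    return 0;
}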
In fact, that influence was found, but the preliminary results also showed something quite unexpected: the measured times were not constant with the data structure size. That is, the empirical measurements do not show the theoretical complexity O(1). That finding introduced a new target in the survey and its methodology because the goal, from that moment, was to confirm that fact by providing enough evidence.
Such evidence is based on:
1. A reliable measure of elapsed times. As any single operation on the standard mapping structures takes a very short time, it was decided to measure the time for millions of such operations. That way, measured times have a magnitude order of seconds. Since the absolute error in the time measurement should not exceed a few milliseconds, the relative error in the total time is very low.
2. Including checking operations in all benchmark cases. That is, benchmarks were designed as if they were unit test cases. That way, it is certain that they do what they are supposed to do. In addition, those checks avoid compiler optimizations (some compilers could avoid the evaluation of an expression in the source code if they detect that there is no further dependency on the resulting value). Benchmark cases are described in the next section; a minimal sketch of such a timed, self-checking loop is shown after this list.
3. Providing a wide variety of benchmark cases. That variety includes: using different programming languages, using standard and non-standard implementations, and using different value types as keys. In addition, of course, the benchmark cases themselves are different. All of them share a similar design: they all build the data structure by inserting (a lot of) elements and they all execute a loop looking for keys. However, they perform these general steps in different ways.
4. Evaluating other data structures that should be upper or lower bounds of performance. For instance, a binary tree could provide a lower bound since it has O(log N) complexity, while an array could be an upper bound since it does not need any hashing.
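As an illustration of points 1 and 2, a timed, self-checking lookup loop could look like the following sketch (the names, sizes and the checksum check are illustrative, not the actual benchmark code):

#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <unordered_map>

int main() {
    const std::size_t size = 1u << 20;          // number of stored elements
    const std::size_t iterations = 10'000'000;  // fixed, independent of size

    // Build step: insert all elements.
    std::unordered_map<std::uint64_t, std::uint64_t> map;
    for (std::uint64_t k = 0; k < size; ++k) map[k] = 2 * k;

    // Lookup step: fixed number of iterations, accumulative computation.
    std::uint64_t checksum = 0;
    auto t0 = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < iterations; ++i)
        checksum += map.find(i % size)->second;
    auto t1 = std::chrono::steady_clock::now();

    // Self check (unit-test style); it also prevents the loop from being
    // optimized out, since the accumulated result is actually used.
    std::uint64_t expected = 0;
    for (std::size_t i = 0; i < iterations; ++i) expected += 2 * (i % size);
    assert(checksum == expected);

    double ms = std::chrono::duration<double, std::milli>(t1 - t0).count();
    std::printf("speed: %.0f iterations/ms\n", iterations / ms);
    return 0;
}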
ing and inserting/removing methods have similar performance. Since strings are often used as keys, Case D was designed to evaluate a very straightforward case of string use. Using not-so-obvious keys is reviewed in Case E; in this case, objects that represent three-dimensional float-number vectors are used as keys. Cases F and G are other cases of using strings as keys; case F avoids using an array of keys to access the map and the random generation of data, and case G represents cases where the pair ⟨k, v⟩ is bigger in terms of memory use.
Most cases are executed several times to modify the values of the parameters, including: problem size, the specific hash table implementation, and other parameters. Typically, each case produces a little over a hundred data rows for C++ and Java and some dozens of data rows for Python (there are fewer implementations in this language). A summary of the benchmark cases is shown in Table 1.
Apart from the benchmark cases, some other well-known data structures were implemented from scratch. The goal of these developments is to complete a (numerous) set of alternative implementations, to have some reference for accepted high or low performance and to have equivalent data structure implementations across languages. They were developed from scratch to avoid external dependencies and to be fair about comparisons between algorithms (since all implementations were developed by the same team). Of course, the benchmark cases also use the standard implementations, but, honestly, it was not expected to exceed their performance. These implementations are:
• The cuckoo hashing algorithm [12], implemented in C++, Java, and Python to compare the same algorithm in different languages. This algorithm was chosen mainly because it is supposed to use two hash functions. In fact, it is not practical to do that and keep using the usual design; for instance, in Java it is possible to program a custom hashCode method, but there is no other standard method to provide the second hash function. This algorithm is therefore implemented using two ranging-hash functions and, thus, the resulting implementation is fully compatible with the standard design.
• Two dummy structures: an array adapter and a sequence-of-pairs adapter have been implemented in C++ modeling the map concept. The first is an array that uses the keys directly as array indexes; of course, it can only be used in cases A and B, since in those cases the keys are consecutive integers and, knowing that, it is possible to skip the hash functions. The array adapter is expected to be the best solution (when applicable). The second is a sequence, in fact a std::vector of ⟨k, v⟩ pairs, where pairs are searched one by one along the sequence when looking for a key. Obviously, this is expected to be the worst solution and, in fact, it is only used for small-sized cases.
• A custom implementation of a Hashed Array-Mapped Trie (HAMT) [13,14]. This data structure combines the properties of both tree-like structures and hash tables. To that end, like other hash table-like structures, it takes a ⟨k, v⟩ pair and hashes its key in order to compute the location of the value within the underlying structure, which is a custom retrieval tree, also known as a trie [6], prefix tree or digital tree. Due to its trie nature, the resulting hash is split into several prefixes which are used to traverse the tree and locate the node corresponding to the key. In our implementation, the prefix size is set to 6 bits in order to force a size of 2^6 = 64 bits on several data structures used in the trie nodes whose memory size depends on the children's cardinality, such as bitmaps. This way, these structures fit perfectly in a 64-bit computer word. A remarkable feature of the HAMT is that its nodes contain no null pointers representing empty subtrees. This allows it to optimize memory usage and only allocate pointers representing existing child nodes (a sketch of this prefix splitting and bitmap indexing is shown after this list). Although the original design suggests several tweaks to optimize performance, those have not been included in our implementation.
• An implementation of a hash map tree in C++, to compare its performance with the solution used in Java; and
• As far as we know, an equivalent of the algorithm used in Python, written in C++ to compare the algorithm under fairer conditions.
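As a rough illustration of the HAMT prefix splitting mentioned above (simplified with respect to the actual implementation), each level of the trie consumes 6 bits of the hash, and a 64-bit bitmap plus a population count gives the index of a child inside a compact array that contains only the existing children:

#include <bitset>
#include <cstddef>
#include <cstdint>

constexpr unsigned kBits = 6;              // bits of the hash consumed per trie level
constexpr unsigned kFanout = 1u << kBits;  // 2^6 = 64 possible children per node

// Extract the 6-bit prefix of the hash used at a given depth.
inline unsigned prefixAt(std::uint64_t hash, unsigned depth) {
    return static_cast<unsigned>((hash >> (depth * kBits)) & (kFanout - 1));
}

// A node keeps a 64-bit bitmap of existing children and a compact child array,
// so no null pointers are stored for empty subtrees. The index of the child
// for prefix p is the number of set bits below position p.
inline std::size_t childIndex(std::uint64_t bitmap, unsigned p) {
    std::uint64_t below = bitmap & ((std::uint64_t{1} << p) - 1);
    return std::bitset<64>(below).count();
}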
All cases have been executed on 3 different computers with similar results. That is, the results are qualitatively equal, but with some offsets due to the computer specifications. The actual results published in this paper and in the software repository were taken using an Intel(R) Core(TM) i5-8400 CPU @ 2.80 GHz with 6 cores, with the following cache memory: L1d: 192 KiB; L1i: 192 KiB; L2: 1.5 MiB; L3: 9 MiB (from the GNU/Linux command lscpu), and 31 GiB of RAM (from the GNU/Linux command free -h), running GNU/Linux Ubuntu. Since the total cache memory is important for the later discussion, note that the total cache is 11,136 KiB (that is, about 11,403 kB); the GNU/Linux command time reports the output memory in kB (it might be that kB is actually used to mean 1024 bytes; unfortunately, there is no way to be sure).
Software versions are: Ubuntu 20.04, gcc 9.3.0, OpenJDK 17.0.1 (64 bits), and Python 3.8.10.
3. Results
To analyze the results, a lot of graphics were generated automatically and reviewed. Before doing so, some comments about the programming of the cases will be made, especially about their interfaces.
well, this is quite a surprise because, in fact, they are the same methods. However, even if the methods are the same, there are a few important differences in programming them:
• There is a curious difference in the hash function results. While C++ returns a size_t value (typically 64 bits), Java returns an int (32 bits), and in Python it seems to vary between 32 and 64 bits (using sys.getsizeof). As already mentioned, C++ and Java return the same integer when hashing an integer, but Python computes something else.
• Of course, Java and Python have memory management and C++ does not. Therefore, programming is slightly easier, but not always! If needed, it is possible to use smart pointers in C++ to provide memory management and similar behaviors; that is, smart pointers are somehow equivalent to Java or Python references to objects. In fact, the implementations of cuckoo hashing and the HAMT use smart pointers. On the contrary, there are some things you can do in C++ that you simply cannot do in any of the others.
• Generic programming is quite different from one language to another. The easiest, the most expressive and the one with the least boilerplate code is Python (by far) since, in fact, any method or function behaves as a template and, in addition, the “magic methods” provide a straightforward solution to overload operators. Maybe C++ is more powerful due to the general features of the language, but it is quite verbose and sometimes very difficult to check or debug. Finally, generic programming in Java is less flexible than in the others.
• Finally, there is another curious difference in the interfaces. While Python and Java return the old value when removing an element from the hash table, C++ does not. So, when the removed item is going to be used after removal, two operations are needed: retrieve the element and then remove it, as shown in the sketch below.
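A small comparison illustrates this last point (the keys and values are hypothetical):

#include <cassert>
#include <string>
#include <unordered_map>

int main() {
    std::unordered_map<std::string, int> m{{"a", 1}, {"b", 2}};

    // C++: erase() does not return the removed value, so using the value after
    // removal takes two operations: retrieve the element, then remove it.
    auto it = m.find("a");
    assert(it != m.end());
    int removed = it->second;   // 1) retrieve
    m.erase(it);                // 2) remove
    assert(removed == 1 && m.count("a") == 0);

    // Java:   Integer old = map.remove("a");   // returns the removed value
    // Python: old = d.pop("a")                 # returns the removed value
    return 0;
}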
[Figure: speed (iterations/ms) vs. λ for the C++ std::unordered_map and the Java HashMap.]
The lowest speeds are for Python. Its curves are similar to Java's, with lower performance at even numbers. This case is also computed using higher values of λ, following powers of 2 beginning at 16. Figure 3 shows one of these samples: the removal speed vs. λ for the Python standard map, the dictionary (dict class). In all programming languages, a clear minimum speed can be observed in these graphics: 64, 128 or 256 are typical values, but they are not the same either among languages or among case steps. For instance, the minimum building speed in C++ is at 64, while the minimum removal speed is at 256.
[Figure 3. Removal speed (iterations/ms) vs. λ for the Python dict.]
[Figure: speed (iterations/ms) vs. size for the C++ std::map and std::unordered_map at λ = 1, 32 and 1024.]
On the other hand, the influence of the λ parameter is far less important than the effect of the size. Thus, it is possible to keep only one of the λ values and display some simpler graphics in order to compare the performance between programming languages and data structures. For instance, Figure 7 displays the performance of several options in C++ and Java, whereas Figure 8 shows some others with lower performance, including the Python dict.
[Figures 7 and 8: speed (iterations/ms) vs. size; the legends include the array adapter at λ = 1, 32 and 1024.]
The interesting fact in these graphics is that C++ is not always the fastest. It is almost the fastest with every data structure, but, for instance, the implementation of cuckoo hashing in Java beats the std::unordered_map until size ≈ 2^16 even though it is not particularly refined. The other interesting aspect is that cuckoo hashing has an extraordinary performance. Both implementations, in Java and C++, reach very high speeds, although the implementation in Java degrades more abruptly than the rest. Even the slowest of them, the Python dict, overtakes the implementation of the std::map at large sizes.
Given those remarkable results, the rest of the cases try to confirm them, generalize them, or find some explanation.
3.2.2. Case C
This case is designed to evaluate the performance of insert/removal operations vs. plain access to an element. All operations take place around a fixed size, but it is allowed to remove up to 10 elements; also, insertions and removals are alternated randomly. There is a parameter, prob, that holds the probability of running an access operation rather than an insertion or removal; it is given the values 0.1, 0.5 and 0.9 to compare speeds when the predominant operations change from one kind to the other. When prob = 0.1, the predominant operations are the insertions/removals. A simplified sketch of such a loop is shown below.
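The following sketch (illustrative names and constants, not the actual benchmark code; it reinserts each removed key immediately, which is a simplification of the case) shows how prob drives the mix of operations:

#include <cstddef>
#include <random>
#include <unordered_map>

int main() {
    const std::size_t size = 1 << 16;        // fixed working size
    const std::size_t iterations = 1'000'000;
    const double prob = 0.5;                 // chance of a plain access; 0.1, 0.5, 0.9 in the cases

    std::unordered_map<int, int> m;
    for (int k = 0; k < static_cast<int>(size); ++k) m[k] = k;

    std::mt19937 rng(12345);
    std::uniform_real_distribution<double> coin(0.0, 1.0);
    std::uniform_int_distribution<int> key(0, static_cast<int>(size) - 1);

    long long checksum = 0;                  // accumulative check
    for (std::size_t i = 0; i < iterations; ++i) {
        int k = key(rng);
        if (coin(rng) < prob) {
            auto it = m.find(k);             // plain access
            if (it != m.end()) checksum += it->second;
        } else {
            m.erase(k);                      // removal followed by re-insertion keeps
            m[k] = k;                        // the map around the fixed size
        }
    }
    (void)checksum;
    return 0;
}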
The tendencies in this case are not all alike: some alternatives, like the C++ std::map, do not show many differences, but hash table-based solutions tend to show a slight decrease in speed when insertions/removals are involved. For instance, the std::unordered_map curves are shown in Figure 9. Since the lowest speed is at prob = 0.1, that means insertions and removals are slightly slower than lookup operations. The results in Java are similar, but in Python (Figure 10) it is possible to observe a clear gap between the curves with prob = 0.5 and prob = 0.1, indicating that the difference between them is more pronounced.
[Figures 9 and 10: speed (iterations/ms) vs. size for the different values of prob.]
In these cases, as in previous ones, the building speeds are irregular, especially when comparing between programming languages. Note that there is no intention to provoke collisions: the strings are generated from numbers and do not have any specific feature intended to produce the same hash or ranging hash. As an example, Figure 11 shows the building speed of the Java standard maps: HashMap and Hashtable. In the cuckoo hashing implementation developed in the context of this study, as, it seems, in the OpenJDK implementation, the speed increases with the size, probably an indication that there are some other unknown factors with influence here. As said, cases D and G behave similarly.
[Figure 11. Building speed (iterations/ms) vs. size for the Java HashMap and Hashtable.]
Regarding the lookup speeds, they also show the same decreasing tendency in every data structure and language, in the standard implementations and in the others. Figure 12 shows such a tendency, not as steep as in previous cases but still clear. In fact, this figure is quite interesting because the curve slopes are similar in all of them. That would be unremarkable, except that the red line is the std::map, whose lookup method is supposed to be O(log N), while the others are hash tables with O(1) complexity. Note that both axes have logarithmic scales to emphasize the details. Without the logarithmic scale on the horizontal axis, the points at low values would be too close together, and without it on the vertical axis the differences between hash table speed values would be minimal.
[Figure residue: lookup speed (iterations/ms) vs. size (log-log axes); memory used (kB) vs. size with the total cache memory marked.]
Cases D and G are very similar: both use string keys, both have random float numbers as values, and both compute the sum of the values. The difference is the size of both keys and values. In case G, keys are twice the length of the keys in case D and the case G values are 8 numbers instead of just one. Figure 14 displays both speed and memory use for the C++ std::unordered_map in cases D and G. The black line is the total cache memory according to the computer specifications. There are two interesting points in this graphic:
1. At low sizes, there is hardly any difference in the speeds, but the number of operations is not the same.
2. Case G uses more memory than case D (obviously); both of them use progressively more memory, eventually both get beyond the cache limit, and in both cases the speeds decrease more abruptly at those points.
Unfortunately, this analysis could not be repeated for Java, because Java uses far more memory (above the limit of the cache memory) even at low values of the size, so it is not possible to find the crossing point. Even though Python uses less memory, it still uses a lot of memory and it is not possible to attribute the memory use clearly to the data or to the Python interpreter.
3.2.4. Case E
Finally, case E introduces two characteristics in the benchmark cases: (a) it avoids random data, (b) the data are sorted to some extent. Since the keys are points on a helix generated by means of its parametric equations, the Z coordinate can be used to sort the keys, so the keys can be considered sorted according to that criterion. In fact, that is the sorting criterion used in the C++ std::map. On the other hand, there are no default implementations of the hash function except in Java. Therefore, a very basic hash function has been implemented in C++: it just computes the hash of each of the 3 float numbers in turn, adds it to the previous result and multiplies by 31. In Python, the hash is produced by transforming the 3 floats into a tuple and then computing the default hash for tuples.
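A minimal sketch of such a basic hash in C++ (the actual benchmark code may differ in details; the names are illustrative) could be:

#include <cstddef>
#include <functional>

struct Vector3 {
    float x, y, z;
    bool operator==(const Vector3& o) const { return x == o.x && y == o.y && z == o.z; }
};

// Very basic hash: hash each float in turn, add it to the running result and
// multiply by 31 (a classic Java-style combination).
struct Vector3Hash {
    std::size_t operator()(const Vector3& v) const {
        std::hash<float> h;
        std::size_t r = 0;
        r = (r + h(v.x)) * 31;
        r = (r + h(v.y)) * 31;
        r = (r + h(v.z)) * 31;
        return r;
    }
};

// Usage (hypothetical): std::unordered_map<Vector3, double, Vector3Hash> m;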
[Figure residue: speed (iterations/ms) and memory (kB) vs. size, with the total cache memory marked.]
There are no surprises in the results here. Of course, the std::map is favored by sorted keys, as can be observed in Figures 15 and 16. The first shows that the std::map, which is a tree, achieves the fastest building speed; the second shows a narrower gap with respect to the rest of the alternatives: it is still the slowest, but the difference is not so large. On the other hand, the hash tables maintain their observed tendencies; all of them have decreasing speeds when the size grows.
Another significant fact can be observed in Figure 17. In this case, the fastest implementation is the Java HashMap, clearly above the C++ implementations. The origin of this good performance might lie in the hash function: since the C++ hash is a custom function, it is probably worse than the standard implementation in Java. However, this is interesting anyway since, in the rest of the benchmarks, C++ was always the fastest.
4. Discussion
The objective of this study was to assess how hash and ranging-hash functions could cause a weakness in the standard implementations of hash tables. In fact, the results of case A show that weakness, although it has less importance than expected.
Meanwhile, some unexpected results were found by varying the size, that is, the number of elements, the N of complexity analysis by its common symbol. As already said, this finding reoriented the survey in order to isolate the influence of the size. One of the main decisions was to design the benchmark cases in two separate steps: building the map, that is, the insertion of all elements, and then running something else that involves lookup operations.
The critical feature of the lookup step is that its loop has a fixed number of iterations. In addition, it makes some kind of accumulative computation to avoid being optimized out by compilers. That way, it is possible to measure the times with high reliability; in fact, since the elapsed times are taken before and after the loop and the number of iterations is above hundreds of thousands, the measure is quite deterministic and the error is very low. Nevertheless, strictly speaking, the measured times include more operations apart from looking up the hash table, but all of them are simple operations and none of them involve the size. Note that the number of iterations is the same for all sizes.
After these adjustments, it was clear from the facts that look-ups were not constant time. In all languages and in all implementations, sooner or later the look-up speed drops steeply at some size value. That is quite unexpected; all programmer manuals state, one way or the other, that (average, amortized, etc.) constant time is guaranteed. None of them mention any limit for that property, but the benchmark provides evidence that such a limit exists.
What is the cause of this behavior? Well, complexity analysis is about operations; it counts operations, and the operations can be observed and counted in the source code, but what about memory use? Theoretically, memory use is evaluated in terms of memory limits; that is, nobody cares about memory use with respect to performance unless it goes above the hardware limit (the RAM memory). Well, the facts establish that memory use matters for algorithm performance. Our last benchmark tries to identify the performance dropping point, and there it was: that point is around the size where the data structure fills the L3 cache memory. That is something to be expected, as cache memory is meant to do exactly that.
So, could hash tables be blamed for dropping their performance? The answer is no. As already said and shown in Figures 6 and 7, the performance of an array also drops at a given size, so the fact is that no present or future hash table could achieve an empirical time complexity of O(1), because in their actual design they use arrays (the buckets) and reading a value from an array is not constant time, that is:
Not so long ago, computer performance was measured in MFLOPS; compared with that, these speeds are really high. In addition, in most benchmark cases the loops not only call the lookup operation on the map, they also do some other operations (maybe simple operations, but extra operations anyway). Considering that each look-up operation has to call the hash function, resolve eventual collisions and so on, the speed is very high. As a consequence, a performance criterion for choosing one option over the rest should only be used when performance is the most critical issue and lots of operations of this kind are expected.
5. Conclusions
Complexity analysis cannot fully estimate computing times in modern computers. The evidence found in the benchmark cases clearly shows that the prediction fails at high values of the data structure size.
Since complexity analysis does not take into account the operations for memory management, it overlooks the impact of data transmission between RAM and cache memory. While a simple operation on any data is extraordinarily fast (if the data are already in cache memory), the time for reading RAM memory is much longer. Therefore, complexity analysis ignores operations that have a high impact on performance and counts other operations that are negligible with respect to computing time. It is like estimating the mass of a group of animals by counting ants and missing elephants.
Indeed, an analysis that is unable to estimate its target result is hardly useful, and that is bad news. Complexity analysis has been a decisive tool to improve algorithm efficiency and, to some extent, it will remain important in the future. However, at this moment, its computer model is no longer valid. It assumes that computer memory is uniform, and this assumption is far from accurate. Cache memory has been a key factor in improving computer performance; thus, while the theoretical model has only one uniform memory, real computers have two types of memory with extremely different features: an almost unlimited, slow RAM memory and a scarce, super fast cache memory. Unfortunately, the usual high-level programming languages have no capability to operate on or manage these types of memory.
The impact of cache memory on software performance is considerable; concepts like cache alignment, memory locality, cache misses, or the importance of the cache size in CPU performance are typical topics these days. The introduction of these concepts is a clear clue to the necessity of reviewing the theory used to estimate algorithm performance. Since memory management has an impact on computing time, new algorithm designs should carefully consider memory layout, memory use, and memory alignment along with the number of operations. The importance of these issues will increase with data size, being a key factor for scalability.
On the other hand, with these results, someone may argue that the predictions are accurate at low sizes, but they are not. Throughout the benchmarks, the C++ std::map, being a tree, has been used as a performance reference; the complexity of its operations is reported as O(log n) instead of O(1), so that would be the reason for it being slower. Well, that argument definitely does not hold. At low sizes, say at size = 64, the value of log n is so low that it can hardly explain any difference in computing time. Therefore, again, the cause of the speed difference should be looked for somewhere else. Most probably, the different use of the L1 and L2 cache memory, due to data locality, could be blamed for that, but finding such a cause could be the objective of another work.
It is clear that memory use is a critical issue, not when compared to the almost unlimited resources of the RAM memory, but when compared to the scarce limit of the (L3) cache memory. That is an important fact, since some programmers think that it is better to use memory instead of doing operations; that will definitely hold for small sizes, but it will not scale well at larger sizes. For example, if it is possible to save one byte per data item (without compromising alignment and other low-level memory restrictions, including those related to the use of the L1 cache memory) and the number of data items will exceed one million elements, do it: it will save about 1 MiB of L3 cache memory and it will be worth it.
Finally, it is our opinion that the documentation and manuals about hash tables should change. In their current wording, they do not mention any limits or restrictions to “constant time”, but we think we have proven the opposite with enough evidence. Arguably, someone might say that the documentation is correct because complexity analysis is independent of the real hardware but, at the very least, that is misleading. No one reading the documentation of a data structure is going to run their program on a theoretical computer and, therefore, anyone using a hash table will expect the same performance at any data size.
Author Contributions: Conceptualization and methodology, S.T.-F.; software, all three authors;
validation, all three authors; writing—original draft preparation, S.T.-F.; writing—review and editing,
all three authors. All authors have read and agreed to the published version of the manuscript.
Funding: This research was partially funded by FUNDACIÓN PARA LA INNOVACIÓN INDUS-
TRIAL (FFII).
Data Availability Statement: The full results of execution times and memory use, on the reported computer, can be found at https://bitbucket.org/b-hashtree/b-hashtree/downloads/ (accessed on 10 February 2022); the filename is CasesTSV.tar. It also contains some PDFs with more figures of the benchmark cases.
Conflicts of Interest: The authors declare no conflict of interest. The funders had no role in the design
of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or
in the decision to publish the results.
References
1. ISO/IEC. ISO International Standard ISO/IEC 14882:2011(E)—Programming Language C++; International Organization for Standardization (ISO): Geneva, Switzerland, 2011.
2. Interface Map<K,V>. Available online: https://docs.oracle.com/javase/8/docs/api/java/util/Map.html (accessed on 10 February 2022).
3. Dictionaries. Available online: https://docs.python.org/3/tutorial/datastructures.html#dictionaries (accessed on 10 February 2022).
4. Cormen, T.H.; Leiserson, C.E.; Rivest, R.L.; Stein, C. Introduction to Algorithms, 3rd ed.; MIT Press: London, UK, 2009; pp. 254–306.
5. Adelson-Velsky, G.; Landis, E. An algorithm for the organization of information. Proc. USSR Acad. Sci. 1962, 146, 1259–1263.
6. Fredkin, E. Trie Memory. Commun. ACM 1960, 9, 490–499. [CrossRef]
7. Available online: https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/include/bits/hashtable.h (accessed on 10 February 2022).
8. Hash-Based Containers. Available online: https://gcc.gnu.org/onlinedocs/libstdc++/ext/pb_ds/hash_based_containers.html (accessed on 10 February 2022).
9. Knuth, D.E. The Art of Computer Programming: Volume 3: Sorting and Searching, 2nd ed.; Addison Wesley: Boston, MA, USA, 1998; pp. 513–559.
10. Cuckoo Hashing. Available online: https://xlinux.nist.gov/dads/HTML/cuckooHashing.html (accessed on 10 February 2022).
11. CMake Reference Documentation. Available online: https://cmake.org/cmake/help/v3.23/ (accessed on 10 February 2022).
12. Pagh, R.; Rodler, F.F. Cuckoo Hashing. In Proceedings of Algorithms—ESA 2001: 9th Annual European Symposium, Århus, Denmark, 28–31 August 2001; Meyer auf der Heide, F., Ed.; Springer: Berlin, Germany, 2001.
13. Bagwell, P. Ideal Hash Trees; École Polytechnique Fédérale de Lausanne: Lausanne, Switzerland, 2001.
14. Bagwell, P. Fast and Space Efficient Trie Searches; École Polytechnique Fédérale de Lausanne: Lausanne, Switzerland, 2000.