Computer Science > Data Structures and Algorithms
[Submitted on 13 May 2016]
Title:Robin Hood Hashing really has constant average search cost and variance in full tables
View PDFAbstract:Thirty years ago, the Robin Hood collision resolution strategy was introduced for open addressing hash tables, and a recurrence equation was found for the distribution of its search cost. Although this recurrence could not be solved analytically, it allowed for numerical computations that, remarkably, suggested that the variance of the search cost approached a value of $1.883$ when the table was full. Furthermore, by using a non-standard mean-centered search algorithm, this would imply that searches could be performed in expected constant time even in a full table.
In spite of the time elapsed since these observations were made, no progress has been made in proving them. In this paper we introduce a technique to work around the intractability of the recurrence equation by solving instead an associated differential equation. While this does not provide an exact solution, it is sufficiently powerful to prove a bound for the variance, and thus obtain a proof that the variance of Robin Hood is bounded by a small constant for load factors arbitrarily close to 1. As a corollary, this proves that the mean-centered search algorithm runs in expected constant time.
We also use this technique to study the performance of Robin Hood hash tables under a long sequence of insertions and deletions, where deletions are implemented by marking elements as deleted. We prove that, in this case, the variance is bounded by $1/(1-\alpha)+O(1)$, where $\alpha$ is the load factor.
To model the behavior of these hash tables, we use a unified approach that can be applied also to study the First-Come-First-Served and Last-Come-First-Served collision resolution disciplines, both with and without deletions.
References & Citations
Bibliographic and Citation Tools
Bibliographic Explorer (What is the Explorer?)
Litmaps (What is Litmaps?)
scite Smart Citations (What are Smart Citations?)
Code, Data and Media Associated with this Article
CatalyzeX Code Finder for Papers (What is CatalyzeX?)
DagsHub (What is DagsHub?)
Gotit.pub (What is GotitPub?)
Papers with Code (What is Papers with Code?)
ScienceCast (What is ScienceCast?)
Demos
Recommenders and Search Tools
Influence Flower (What are Influence Flowers?)
Connected Papers (What is Connected Papers?)
CORE Recommender (What is CORE?)
arXivLabs: experimental projects with community collaborators
arXivLabs is a framework that allows collaborators to develop and share new arXiv features directly on our website.
Both individuals and organizations that work with arXivLabs have embraced and accepted our values of openness, community, excellence, and user data privacy. arXiv is committed to these values and only works with partners that adhere to them.
Have an idea for a project that will add value for arXiv's community? Learn more about arXivLabs.