Hashing is the process of generating a small, fixed-size output (which can be used as an index in a table) from an input that is typically large and of variable size. It uses mathematical formulas known as hash functions to perform the transformation. The resulting value determines the index, or location, at which an item is stored in a data structure called a hash table.
Hash Table Data Structure Overview
- It is one of the most widely used data structures after arrays.
- It mainly supports search, insert and delete in O(1) time on average, which is more efficient than other popular data structures such as arrays, linked lists and self-balancing BSTs.
- We use hashing for dictionaries, frequency counting, maintaining data for quick access by key, etc.
- Real World Applications include Database Indexing, Cryptography, Caches, Symbol Table and Dictionaries.
- There are mainly two forms of hash table typically implemented in programming languages:
Hash Set : Collection of unique keys (implemented as set in Python, Set in JavaScript, unordered_set in C++ and HashSet in Java).
Hash Map : Collection of key-value pairs with unique keys (implemented as dict in Python, Map in JavaScript, unordered_map in C++ and HashMap in Java).
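The two forms above can be tried out directly with Python's built-in set and dict. This is a minimal sketch: the variable names and the helper count_frequencies are made up for illustration.

```python
# Hash set: a collection of unique keys.
seen = set()
seen.add("apple")
seen.add("banana")
seen.add("apple")  # duplicate key, the set is unchanged

# Hash map: unique keys, each mapped to a value.
stock = {}
stock["apple"] = 3
stock["banana"] = 5
stock["apple"] = 4  # same key, so the old value is overwritten


def count_frequencies(items):
    """Count how often each item occurs, using a dict as the hash map."""
    counts = {}
    for item in items:
        counts[item] = counts.get(item, 0) + 1
    return counts
```

Frequency counting, one of the uses listed above, needs only one pass over the input because each dict lookup and update takes O(1) time on average.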
Situations Where Hash is not Used
- When we need to maintain sorted data along with search, insert and delete. We use a self-balancing BST in these cases.
- When strings are keys and we need operations like prefix search along with search, insert and delete. We use a Trie in these cases.
- When we need operations like floor and ceiling along with search, insert and/or delete. We use a self-balancing BST in these cases.
Components of Hashing
There are three major components of hashing:
- Key: A key can be any value, such as a string or an integer, that is fed as input to the hash function.
- Hash Function: Receives the input key and returns the index of an element in an array called a hash table. The index is known as the hash index.
- Hash Table: A hash table is typically an array of lists. It stores the values corresponding to the keys in an associative manner, with the hash function mapping each key to an index in the array.

How does Hashing work?
Suppose we have a set of strings {“ab”, “cd”, “efg”} and we would like to store them in a table.
- Step 1: A hash function (some mathematical formula) is used to calculate the hash value, which acts as the index in the data structure where the value will be stored.
- Step 2: Let's assign a number to each alphabetical character:
- “a” = 1,
- “b” = 2, and so on.
- Step 3: The numerical value of each string is then the sum of the values of its characters:
- “ab” = 1 + 2 = 3,
- “cd” = 3 + 4 = 7,
- “efg” = 5 + 6 + 7 = 18.
- Step 4: Now, assume that we have a table of size 7 to store these strings. The hash function used here is the sum of the characters in the key mod the table size, so we can compute the location of a string in the array as sum(string) mod 7.
- Step 5: So we will then store
- “ab” at index 3 mod 7 = 3,
- “cd” at index 7 mod 7 = 0, and
- “efg” at index 18 mod 7 = 4.

The above technique enables us to calculate the location of a given string by using a simple hash function and rapidly find the value that is stored in that location. Therefore the idea of hashing seems like a great way to store (key, value) pairs of the data in a table.
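The steps above can be sketched in a few lines of Python. The function names letter_sum and simple_hash are made up for illustration, and the sketch assumes keys contain only lowercase letters a to z.

```python
def letter_sum(key):
    """Sum the letter values of key, with a=1, b=2, ..., z=26.

    Assumes key contains only lowercase ASCII letters.
    """
    return sum(ord(ch) - ord("a") + 1 for ch in key)


def simple_hash(key, table_size=7):
    """Hash function from the example: letter sum mod table size."""
    return letter_sum(key) % table_size


# Store each string at the index computed by the hash function.
table = [None] * 7
for key in ["ab", "cd", "efg"]:
    table[simple_hash(key)] = key

# table is now ['cd', None, None, 'ab', 'efg', None, None]
```

Looking up a string repeats the same computation: hash the key, then inspect only that one slot instead of scanning the whole table.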
What is a Hash function?
A hash function maps an input key to an index in the hash table using a mathematical formula. For example, consider phone numbers as keys and a hash table of size 100. A simple hash function is to take the last two digits of the phone number, so that every output is a valid array index (0 to 99). A good hash function should have the following properties:
- Should be efficient to compute.
- Should uniformly distribute the keys across the indexes of the hash table.
- Should minimize collisions (this follows mainly from the uniform distribution property above).
- Should be paired with a table that keeps a low load factor (number of items in the table divided by the size of the table), since even a uniform hash function collides often in a crowded table.
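The phone number example can be sketched as follows. The function name last_two_digits_hash and the sample numbers are made up for illustration. The sketch assumes phone numbers arrive as strings that may contain separators such as dashes.

```python
def last_two_digits_hash(phone_number):
    """Map a phone number string to an index in a table of size 100.

    Non-digit characters (dashes, spaces) are ignored. This is the
    illustrative hash function from the text, not a recommended one.
    """
    digits = [ch for ch in phone_number if ch in "0123456789"]
    if not digits:
        raise ValueError("phone_number contains no digits")
    return int("".join(digits[-2:]))


index_a = last_two_digits_hash("555-0142")    # 42
index_b = last_two_digits_hash("9876543210")  # 10
```

This function is cheap to compute, but it distributes keys uniformly only if the last two digits of the stored numbers are themselves evenly spread.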
What is Collision in Hashing?
A collision happens when two or more keys have the same hash value. In the above example, the hash function is the sum of the letters, and a closer look shows the problem: different strings can produce the same hash value.
For example, “ab” and “ba” have the same hash value, and “cd” and “be” also share a hash value. This is known as a collision, and it creates problems in searching, insertion, deletion, and updating of values.
The probability of a hash collision depends on the size of the hash table, the distribution of hash values and the quality of the hash function. To handle collisions, we use collision resolution techniques.
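The collisions described above can be reproduced with a short sketch that groups keys by their hash value. It reuses the letter-sum hash from the earlier example with a table of size 7. The names simple_hash and find_collisions are made up for illustration.

```python
from collections import defaultdict


def simple_hash(key, table_size=7):
    """Letter sum (a=1, ..., z=26) mod table size; lowercase keys only."""
    return sum(ord(ch) - ord("a") + 1 for ch in key) % table_size


def find_collisions(keys, table_size=7):
    """Return {index: [keys]} for every index shared by two or more keys."""
    buckets = defaultdict(list)
    for key in keys:
        buckets[simple_hash(key, table_size)].append(key)
    return {index: group for index, group in buckets.items() if len(group) > 1}


collisions = find_collisions(["ab", "ba", "cd", "be", "efg"])
# collisions == {3: ['ab', 'ba'], 0: ['cd', 'be']}
```

Because addition ignores character order, any two strings that are rearrangements of each other collide under this hash function.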
What is meant by Load Factor in Hashing?
The load factor of a hash table is the number of items it contains divided by the size of the table. It is the decisive parameter when deciding whether to rehash the table as more elements are added.
It tells us how full the table is: the higher the load factor, the more collisions we should expect, even when the hash function distributes the keys uniformly.
Load Factor = Total elements in hash table / Size of hash table
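The formula translates directly into code. In this sketch the function names are made up for illustration, and the threshold of 0.75 is only an assumed example value, not something every hash table uses.

```python
def load_factor(num_items, table_size):
    """Load factor = total elements in the table / size of the table."""
    if table_size <= 0:
        raise ValueError("table_size must be positive")
    return num_items / table_size


def needs_rehash(num_items, table_size, max_load_factor=0.75):
    """Return True once the load factor exceeds the chosen threshold.

    max_load_factor=0.75 is an assumed example threshold.
    """
    return load_factor(num_items, table_size) > max_load_factor
```

For the earlier example of 3 strings in a table of size 7, the load factor is 3 / 7, or roughly 0.43.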
What is Rehashing?
As the name suggests, rehashing means hashing again. When the load factor rises above a predefined threshold (a common default is 0.75, as in Java's HashMap), collisions become more frequent and operations slow down. To overcome this, the size of the array is increased (typically doubled) and all existing entries are hashed again and stored in the new, larger array. This keeps the load factor low and the operations fast.
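A minimal sketch of rehashing, under two assumptions not taken from the text: the table is a list of buckets, each bucket being a list of (key, value) pairs, and indexes come from Python's built-in hash() taken mod the table size. The function name rehash is made up for illustration.

```python
def rehash(buckets):
    """Return a table twice the size with every entry hashed again.

    buckets is a list of lists of (key, value) pairs (assumed layout).
    """
    new_size = 2 * len(buckets)
    new_buckets = [[] for _ in range(new_size)]
    for bucket in buckets:
        for key, value in bucket:
            # The index must be recomputed because the table size changed.
            new_buckets[hash(key) % new_size].append((key, value))
    return new_buckets


# Keys 1, 5 and 9 all land in bucket 1 of a table of size 4.
old_table = [[], [(1, "a"), (5, "b"), (9, "c")], [], []]
new_table = rehash(old_table)
# In the table of size 8, key 5 moves to bucket 5; keys 1 and 9 stay in bucket 1.
```

Rehashing touches every stored entry, so it costs O(n) time. Doubling the size means it happens rarely enough that the average cost per insertion stays constant.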
How to Create Your Own Hash Table?
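One possible way to build a small hash table is sketched below. It resolves collisions by separate chaining, where each slot holds a list of entries. The class name ChainedHashTable, its method names, and the use of Python's built-in hash() are all assumptions made for illustration. The sketch omits resizing to stay short.

```python
class ChainedHashTable:
    """A tiny hash map using separate chaining (illustrative sketch)."""

    def __init__(self, size=7):
        # One list (chain) per slot; each chain holds (key, value) pairs.
        self._buckets = [[] for _ in range(size)]
        self._count = 0

    def _index(self, key):
        # Built-in hash() mod table size gives the slot for this key.
        return hash(key) % len(self._buckets)

    def put(self, key, value):
        """Insert key, or update its value if it is already present."""
        bucket = self._buckets[self._index(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)
                return
        bucket.append((key, value))
        self._count += 1

    def get(self, key, default=None):
        """Return the value stored for key, or default if absent."""
        for existing_key, value in self._buckets[self._index(key)]:
            if existing_key == key:
                return value
        return default

    def remove(self, key):
        """Delete key if present; return True if something was removed."""
        bucket = self._buckets[self._index(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                del bucket[i]
                self._count -= 1
                return True
        return False

    def __len__(self):
        return self._count
```

Each operation hashes the key once and then scans a single chain, so the cost depends on chain length rather than on the total number of entries.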