Hashing Refers To The Process of Generating A Fixed-Size Output From An Input of Variable Size
Hashing refers to the process of generating a fixed-size output from an input of variable size
using mathematical formulas known as hash functions. In a hash table, this output determines
the index or location at which an item is stored in the data structure.
Components of Hashing
There are three major components of hashing:
1. Key: A key can be any value, such as a string or an integer, that is fed as input to the
hash function in order to determine the index or location where an item is stored.
2. Hash Function: The hash function receives the input key and returns the index of an
element in an array called a hash table. The index is known as the hash index.
3. Hash Table: A hash table is a data structure that maps keys to values using a hash
function. It stores the data associatively in an array, where each entry is located through
the hash index of its key.
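The three components can be sketched in a few lines. This is only an illustration: the table size of 7, the multiplier 31 and the names hash_index, put and get are assumptions made for the example, and collisions are deliberately ignored.

```python
TABLE_SIZE = 7  # assumed size for this sketch


def hash_index(key, table_size=TABLE_SIZE):
    """Hash function: map a string or integer key to an index in [0, table_size)."""
    if isinstance(key, int):
        return key % table_size
    # For strings, fold the character codes into a running polynomial sum.
    h = 0
    for ch in str(key):
        h = (h * 31 + ord(ch)) % table_size
    return h


# Hash table: one slot per index, each holding a (key, value) pair or None.
table = [None] * TABLE_SIZE


def put(key, value):
    # A colliding key simply overwrites the slot in this sketch.
    table[hash_index(key)] = (key, value)


def get(key):
    entry = table[hash_index(key)]
    return entry[1] if entry is not None and entry[0] == key else None


put("ab", "first value")
put(15, "second value")
print(hash_index("ab"), hash_index(15))  # 4 1
print(get("ab"))                         # first value
```

Because two different keys can produce the same hash index, a practical table also needs a collision resolution technique.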
Double hashing is a collision resolution technique used in hash tables. It works by using two
hash functions to compute two different hash values for a given key. The first hash function is
used to compute the initial hash value, and the second hash function is used to compute the
step size for the probing sequence.
Double hashing can keep the collision rate low because the step size depends on the key: two
keys that collide at the same initial index usually follow different probing sequences. This
makes repeated collisions and clustering less likely than in other collision resolution
techniques such as linear probing or quadratic probing.
However, double hashing has a few drawbacks. First, it requires the use of two hash functions,
which increases the computational cost of the insertion and search operations.
Second, it requires a good choice of hash functions to achieve good performance. If the hash
functions are not well designed, the collision rate may still be high.
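One commonly cited guideline, stated here as background and not taken from the text above, is that the step size must never be 0 and should share no common factor with the table size. The sketch below checks this on small tables; the function names are hypothetical.

```python
from math import gcd


def probe_sequence(start, step, table_size):
    """Return the slots visited when probing from start with a fixed step."""
    return [(start + i * step) % table_size for i in range(table_size)]


def covers_all_slots(step, table_size):
    """True if probing with this step eventually reaches every slot."""
    return len(set(probe_sequence(0, step, table_size))) == table_size


# With a prime table size, every non-zero step reaches all slots.
assert all(covers_all_slots(step, 7) for step in range(1, 7))

# With a composite table size, a step sharing a factor with it misses slots.
print(probe_sequence(0, 4, 8))  # [0, 4, 0, 4, 0, 4, 0, 4]
print(gcd(4, 8))                # 4, the shared factor behind the short cycle
```

This is why a prime table size is a popular choice: any step between 1 and the table size minus 1 then visits every slot.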
Advantages of Double hashing
Double hashing is one of the best forms of probing, as it tends to produce a uniform
distribution of records throughout a hash table.
With well-chosen hash functions, it avoids the primary clustering of linear probing and the
secondary clustering of quadratic probing.
It is one of the most effective methods for resolving collisions.
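The index probed on the i-th attempt is (hash1(key) + i * hash2(key)) % TABLE_SIZE.
Here hash1() and hash2() are hash functions, TABLE_SIZE is the size of the hash table, and i is
the attempt number, starting from 0.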
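A minimal sketch of double hashing using the names hash1(), hash2() and TABLE_SIZE from above. The table size of 7, the constant PRIME = 5 and the particular choice of the two hash functions are assumptions for illustration; deletion is not handled.

```python
TABLE_SIZE = 7  # assumed prime table size
PRIME = 5       # assumed prime smaller than TABLE_SIZE, used by hash2


def hash1(key):
    return key % TABLE_SIZE


def hash2(key):
    # Always in 1..PRIME, never 0, so each probe moves to a different slot.
    return PRIME - (key % PRIME)


class DoubleHashTable:
    def __init__(self):
        self.slots = [None] * TABLE_SIZE

    def insert(self, key):
        """Place key in the first free slot of its probing sequence."""
        for i in range(TABLE_SIZE):
            index = (hash1(key) + i * hash2(key)) % TABLE_SIZE
            if self.slots[index] is None or self.slots[index] == key:
                self.slots[index] = key
                return index
        raise RuntimeError("hash table is full")

    def search(self, key):
        """Return the slot index holding key, or -1 if it is absent."""
        for i in range(TABLE_SIZE):
            index = (hash1(key) + i * hash2(key)) % TABLE_SIZE
            if self.slots[index] is None:
                return -1  # an empty slot ends the probing sequence
            if self.slots[index] == key:
                return index
        return -1


table = DoubleHashTable()
for key in (10, 17, 24):  # all three keys have hash1(key) == 3
    print(key, "->", table.insert(key))
# 10 -> 3, 17 -> 6, 24 -> 4
```

Although the three keys collide at index 3, their step sizes differ (17 steps by 3, 24 steps by 1), so they end up in different slots after a single extra probe.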
In this article, we will discuss the binary search tree. It is an important topic in the courses
of students with a technical background, so this article should be helpful and informative to them.
Before moving directly to the binary search tree, let's first see a brief description of the tree.
What is a tree?
A tree is a kind of data structure that is used to represent data in hierarchical form. It can be
defined as a collection of objects or entities called nodes that are linked together to simulate a
hierarchy. A tree is a non-linear data structure, as the data in a tree is not stored linearly or
sequentially.
Similarly, we can see that the left child of the root node is greater than its own left child and
smaller than its own right child. So, it also satisfies the property of a binary search tree.
Therefore, we can say that the tree in the above image is a binary search tree.
Suppose we change the value of node 35 to 55 in the above tree and check whether the tree is
still a binary search tree.
In the resulting tree, the value of the root node is 40, which is greater than its left child 30 but
smaller than 55, the right child of 30. Since 55 lies in the left subtree of 40, the tree does not
satisfy the property of a binary search tree. Therefore, it is not a binary search tree.
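The check described above can be sketched as a small function. Only the three nodes named in the text (40, 30 and 35 or 55) are used here, since the full tree from the image is not reproduced; the names Node and is_bst are hypothetical, and duplicate keys are assumed not to be allowed.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def is_bst(node, low=float("-inf"), high=float("inf")):
    """Check that every key lies strictly between the bounds set by its ancestors."""
    if node is None:
        return True
    if not (low < node.key < high):
        return False
    return is_bst(node.left, low, node.key) and is_bst(node.right, node.key, high)


valid = Node(40, left=Node(30, right=Node(35)))
invalid = Node(40, left=Node(30, right=Node(55)))  # 55 sits in the left subtree of 40

print(is_bst(valid))    # True
print(is_bst(invalid))  # False
```

Passing bounds down from the ancestors matters: comparing each node only with its direct children would accept the second tree, because 55 is a valid right child of 30 taken on its own.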
Basic Operations:
1. Insertion in Binary Search Tree
2. Searching in Binary Search Tree
3. Deletion in Binary Search Tree
4. Binary Search Tree (BST) Traversals – Inorder, Preorder, Postorder
5. Convert a normal BST to Balanced BST
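Several of these operations can be sketched briefly. The keys in the example tree are hypothetical, the function names are chosen for illustration, and deletion is left out to keep the sketch short.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def search(node, key):
    """Return the node holding key, or None if the key is absent."""
    while node is not None and node.key != key:
        node = node.left if key < node.key else node.right
    return node


def inorder(node):
    """Left subtree, node, right subtree: yields the keys in sorted order."""
    return [] if node is None else inorder(node.left) + [node.key] + inorder(node.right)


def preorder(node):
    """Node first, then left subtree, then right subtree."""
    return [] if node is None else [node.key] + preorder(node.left) + preorder(node.right)


def postorder(node):
    """Left subtree, right subtree, then the node itself."""
    return [] if node is None else postorder(node.left) + postorder(node.right) + [node.key]


def build_balanced(sorted_keys):
    """Build a height-balanced BST from keys that are already sorted."""
    if not sorted_keys:
        return None
    mid = len(sorted_keys) // 2
    return Node(sorted_keys[mid],
                build_balanced(sorted_keys[:mid]),
                build_balanced(sorted_keys[mid + 1:]))


root = Node(40, Node(30, Node(25), Node(35)), Node(50, Node(45), Node(60)))
print(inorder(root))    # [25, 30, 35, 40, 45, 50, 60]
print(preorder(root))   # [40, 30, 25, 35, 50, 45, 60]
print(postorder(root))  # [25, 35, 30, 45, 60, 50, 40]

# One way to balance an existing BST: rebuild it from its inorder sequence.
balanced = build_balanced(inorder(root))
```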
How to Insert a value in a Binary Search Tree:
A new key is always inserted as a new leaf, which maintains the property of the binary search
tree. We search for the key's position starting from the root until we reach a node that has no
child on the side where the key belongs. The new node is then added as a child of that node.
The steps below are followed when we insert a node into a binary search tree:
Compare the value to be inserted (say X) with the value of the current node (say val):
If X is less than val, move to the left subtree.
Otherwise, move to the right subtree.
Once a node with no child on that side is reached, insert X as its left or right child based on
the relation between X and that node's value.
Illustration: Insertion in BST (figure omitted)
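The insertion steps can be sketched as follows. The names Node and insert and the sample values are assumptions for illustration; equal values go to the right subtree, following the "otherwise" rule in the steps.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, x):
    """Insert x as a new leaf and return the root of the tree."""
    new_node = Node(x)
    if root is None:
        return new_node  # an empty tree: the new node becomes the root
    current = root
    while True:
        if x < current.key:
            if current.left is None:
                current.left = new_node
                return root
            current = current.left
        else:
            if current.right is None:
                current.right = new_node
                return root
            current = current.right


root = None
for value in (40, 30, 50, 35):
    root = insert(root, value)

print(root.key, root.left.key, root.right.key)  # 40 30 50
print(root.left.right.key)                      # 35
```

Inserting 35 goes left at 40 (35 < 40), then right at 30 (35 > 30), where the empty right position becomes the new leaf.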
A B-tree is a self-balancing tree where all the leaf nodes are at the same level, which allows for
efficient searching, insertion and deletion of records.
Because all the leaf nodes are on the same level, every search follows a path of the same
length, so access time is predictable and grows only logarithmically with the size of the data set.
Characteristics of B-Tree:
B-trees have several important characteristics that make them useful for storing and retrieving
large amounts of data efficiently. Some of the key characteristics of B-trees are:
Balanced: B-trees are balanced, meaning that all leaf nodes are at the same level. This
ensures that the time required to access data in the tree grows only logarithmically with the
size of the data set and is the same for every leaf.
Self-balancing: B-trees are self-balancing, which means that as new data is inserted or old
data is deleted, the tree automatically adjusts to maintain its balance.
Multiple keys per node: B-trees allow multiple keys to be stored in each node. This
allows for efficient use of memory and reduces the height of the tree, which in turn reduces
the number of disk accesses required to retrieve data.
Ordered: B-trees maintain the order of the keys, which makes searching and range queries
efficient.
Efficient for large data sets: B-trees are particularly useful for storing and retrieving large
amounts of data, as they minimize the number of disk accesses required to find a particular
piece of data.
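A minimal sketch of how multiple ordered keys per node are used during a search. The class BTreeNode, the function names and the sample keys are hypothetical, and the self-balancing insertion and deletion (node splits and merges) are not shown.

```python
from bisect import bisect_left


class BTreeNode:
    def __init__(self, keys, children=None):
        self.keys = keys                # sorted keys stored in this node
        self.children = children or []  # empty for a leaf, else len(keys) + 1 subtrees


def search(node, key):
    """Return True if key is stored in the subtree rooted at node."""
    while True:
        i = bisect_left(node.keys, key)  # position of key among this node's keys
        if i < len(node.keys) and node.keys[i] == key:
            return True
        if not node.children:
            return False                 # reached a leaf without finding the key
        node = node.children[i]          # descend into the subtree between the keys


def inorder(node):
    """Return all keys in sorted order, which is what makes range scans cheap."""
    if not node.children:
        return list(node.keys)
    result = []
    for i, key in enumerate(node.keys):
        result.extend(inorder(node.children[i]))
        result.append(key)
    result.extend(inorder(node.children[-1]))
    return result


# Two levels: one root with two keys and three leaves, all at the same depth.
root = BTreeNode([20, 40], [BTreeNode([5, 10]),
                            BTreeNode([25, 30]),
                            BTreeNode([45, 50, 60])])

print(search(root, 30))  # True
print(search(root, 35))  # False
print(inorder(root))     # [5, 10, 20, 25, 30, 40, 45, 50, 60]
```

Each step of the loop inspects one node, so a search touches at most one node per level of the tree.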
Applications of B-Tree:
B-trees are commonly used in applications where large amounts of data need to be stored and
retrieved efficiently. Some of the specific applications of B-trees include:
Databases: B-trees are widely used in databases to store indexes that allow for efficient
searching and retrieval of data.
File systems: B-trees are used in file systems to organize and store files efficiently.
Operating systems: B-trees and their variants can be used in operating systems to manage
memory and storage metadata efficiently.
Network routers: B-trees can be used in network routers to look up routing information
efficiently.
DNS servers: B-trees can be used in Domain Name System (DNS) servers to store and
retrieve information about domain names.
Compiler symbol tables: B-trees can be used in compilers to store symbol tables that allow
for efficient lookup of identifiers during compilation.
Advantages of B-Tree:
B-trees have several advantages over other data structures for storing and retrieving large
amounts of data. Some of the key advantages of B-trees include:
Sequential Traversing: As the keys are kept in sorted order, the tree can be traversed
sequentially.
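Minimize disk reads: Each node holds many keys, so the tree stays shallow and a lookup needs
only a few disk reads.
Partially full blocks: The B-tree keeps its blocks partially full, so most insertions and
deletions fit into an existing block without restructuring the tree, which speeds them up.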
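The effect on disk reads can be estimated with a rough calculation. The sketch below assumes one disk read per level and fully used nodes, and the branching factor of 100 is an arbitrary illustrative value; real B-trees differ in the details.

```python
def levels_needed(n_keys, branching_factor):
    """Smallest number of levels h with branching_factor ** h >= n_keys."""
    levels, capacity = 0, 1
    while capacity < n_keys:
        capacity *= branching_factor
        levels += 1
    return levels


n = 1_000_000
print(levels_needed(n, 2))    # 20 levels for a binary tree
print(levels_needed(n, 100))  # 3 levels with 100 children per node
```

Under these assumptions, finding one of a million keys takes about 3 reads instead of about 20.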
Disadvantages of B-Tree:
Complexity: B-trees can be complex to implement and can require a significant amount of
programming effort to create and maintain.
Overhead: B-trees can have significant overhead, both in terms of memory usage and
processing time. This is because B-trees require additional metadata to maintain the tree
structure and balance.
Not optimal for small data sets: B-trees are most effective for storing and retrieving large
amounts of data. For small data sets, other data structures may be more efficient.
Limited branching factor: The branching factor of a B-tree determines the number of
child nodes that each node can have. B-trees typically have a fixed branching factor, which
can limit their performance for certain types of data.