As 3

Uploaded by

shalalala213

Sarthak Kar

Understanding Hash Tables in Computer Science

In the early days of computer science, storing and retrieving data efficiently
became a significant challenge. As the amount of information grew, finding a
method that could quickly and accurately access this data became crucial.
Traditional data structures struggled to meet the increasing demand for speed and
efficiency, until hash tables were introduced. This data structure offered a more
effective way to store and retrieve data, transforming big data management and
becoming an essential tool in the field of computer science.

A hash table consists of an array, called a table or bucket array, and a hash
function. Each slot in the array is called a bucket and can hold one or more
key-value pairs. The key identifies an entry, the value is the data that we want to
store, and the hash function computes a location in the array from the key.
The hash function is a critical component of a hash table. It takes the key as input
and produces an integer, which is then reduced to an index in the array, typically
by taking it modulo the number of buckets. The value paired with the key is stored
in the bucket at that index. A good hash function distributes keys uniformly across
the buckets, minimizing the chance of collisions, i.e., multiple keys producing the
same index. It must also be deterministic, meaning that the same key always
produces the same hash value. Common approaches include the division method, the
multiplication method, and universal hashing.

A hash table supports three operations: Insert, Search and Delete. With a good
hash function and a table that is not too full, these operations take O(1) time on
average, i.e. the expected cost does not grow with the number of stored elements.
In the worst case, when many keys collide in the same bucket, a single operation
can take O(n) time.
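
As a rough illustration of two of the methods named above, the following Python sketch implements the division and multiplication methods for integer keys. The function names, the table sizes and the constant A are assumptions chosen for the example, not part of the original text.

```python
import math


def division_hash(key: int, table_size: int) -> int:
    """Division method: the remainder of key / table_size is the index."""
    return key % table_size


def multiplication_hash(key: int, table_size: int) -> int:
    """Multiplication method: scale the fractional part of key * A."""
    a = (math.sqrt(5) - 1) / 2          # assumed constant, roughly 0.618
    fractional = (key * a) % 1.0        # always in the range [0, 1)
    return int(table_size * fractional)


print(division_hash(1234, 11))          # 1234 % 11 = 2
print(multiplication_hash(1234, 16))    # some index between 0 and 15
```

Both functions are deterministic: calling them again with the same key and table size gives the same index.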

Insertion into hash table

Insertion adds a new key-value pair to the hash table. The hash function
computes the index from the given key, and the key-value pair is stored in
the corresponding bucket. If the bucket already contains one or more elements, the
new pair is placed using the table's collision resolution technique.

Searching the hash table

Search involves looking up a value using the given key. The hash function
calculates the index for the key, and the bucket at that index is checked. If the
bucket contains the key, the paired value is returned; otherwise the key is
reported as not present.

Deletion from hash table

Deletion is removing a key-value pair from the hash table. The hash
function determines the index with the key, and the corresponding bucket is
searched for the key. If found, the key-value pair is removed using the appropriate
collision resolution technique.
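
The three operations can be sketched together in a few lines of Python. This is only an illustration under stated assumptions: Python's built-in hash() stands in for the hash function, each bucket is a plain list, the table size of 8 is arbitrary, and inserting an existing key overwrites its value (the text above does not specify that case).

```python
TABLE_SIZE = 8                               # assumed size, small for illustration
buckets = [[] for _ in range(TABLE_SIZE)]    # one list of (key, value) per bucket


def index_for(key):
    # Python's built-in hash() stands in for the hash function.
    return hash(key) % TABLE_SIZE


def insert(key, value):
    bucket = buckets[index_for(key)]
    for i, (existing_key, _) in enumerate(bucket):
        if existing_key == key:
            bucket[i] = (key, value)         # assumption: overwrite existing key
            return
    bucket.append((key, value))


def search(key):
    for existing_key, value in buckets[index_for(key)]:
        if existing_key == key:
            return value
    return None                              # key is not in the table


def delete(key):
    bucket = buckets[index_for(key)]
    for i, (existing_key, _) in enumerate(bucket):
        if existing_key == key:
            del bucket[i]
            return True
    return False                             # nothing to remove


insert("S1001", "Asha")
print(search("S1001"))    # Asha
print(delete("S1001"))    # True
print(search("S1001"))    # None
```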

Collision Resolution Techniques

A collision in a hash table occurs when more than one key gives the same
index as output using the hash function. Several techniques have been introduced
to deal with collisions effectively.

Chaining

This algorithm tries to tackle the problem by implementing a linked list of
elements at each bucket. When a collision occurs, the new key-value pair is added
as a node linked with the previous key-value pair. As more collisions occur at the
same index, the linked list keeps getting longer creating a chain, hence the name
“Chaining”. Chaining is a simple algorithm to implement and can handle any
number of collisions until it runs out of memory for the lists.
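
To make the growing chain visible, here is a small Python sketch with an explicit linked list per bucket. The Node class, the table size of 4 and the use of key % TABLE_SIZE as the hash function are assumptions for the example; the integer keys are chosen so that they all collide.

```python
class Node:
    """One key-value pair in a bucket's linked list."""

    def __init__(self, key, value):
        self.key = key
        self.value = value
        self.next = None


TABLE_SIZE = 4                      # assumed size, small to force collisions
table = [None] * TABLE_SIZE         # each entry is the head of a chain or None


def put(key, value):
    index = key % TABLE_SIZE        # assumed hash function for integer keys
    if table[index] is None:
        table[index] = Node(key, value)
        return
    node = table[index]
    while True:
        if node.key == key:         # key already stored: update its value
            node.value = value
            return
        if node.next is None:
            break
        node = node.next
    node.next = Node(key, value)    # link the new node after the previous one


def chain_length(index):
    length, node = 0, table[index]
    while node is not None:
        length += 1
        node = node.next
    return length


for key in (1, 5, 9):               # 1, 5 and 9 all map to index 1
    put(key, f"value-{key}")
print(chain_length(1))              # 3
```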

Open Addressing

Open addressing stores every pair directly in the array: whenever a collision
occurs, it looks for another open bucket in which to put the new pair. There are a
few ways to choose which buckets to check. Linear probing checks the buckets one
after another, starting from the index of the collision and wrapping around at the
end of the array. Quadratic probing checks buckets at quadratically growing
offsets (1, 4, 9, ...) from the original index. Double hashing introduces a second
hash function: its output for the key is used as a step size, so the i-th probe
lands at the original index plus i times that step, modulo the table size. Because
keys that collide at the same index usually get different step sizes, double
hashing reduces the clustering that linear and quadratic probing suffer from.
However, it costs slightly more computation due to the additional function.
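
The probe sequences can be compared side by side in a short Python sketch. The choices here are assumptions for illustration: integer keys, key % size as the primary hash, 1 + key % (size - 1) as the second hash, and a prime table size of 7 so that the double-hashing step can reach every bucket.

```python
def linear_probe(start, attempt, size):
    return (start + attempt) % size


def quadratic_probe(start, attempt, size):
    return (start + attempt * attempt) % size


def double_hash_probe(key, attempt, size):
    start = key % size
    step = 1 + (key % (size - 1))   # assumed second hash; never zero
    return (start + attempt * step) % size


def insert_linear(slots, key, value):
    """Insert with linear probing; returns the index that was used."""
    size = len(slots)
    start = key % size
    for attempt in range(size):
        index = linear_probe(start, attempt, size)
        if slots[index] is None or slots[index][0] == key:
            slots[index] = (key, value)
            return index
    raise RuntimeError("hash table is full")


slots = [None] * 7
print(insert_linear(slots, 10, "a"))                      # 3, since 10 % 7 = 3
print(insert_linear(slots, 17, "b"))                      # 4, index 3 is taken
print([quadratic_probe(3, i, 7) for i in range(4)])       # [3, 4, 0, 5]
print([double_hash_probe(17, i, 7) for i in range(3)])    # [3, 2, 1]
```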

Rehashing

To measure the fullness of a hash table, a value known as the “load factor”
is used. The load factor of a hash table is calculated as the ratio of the number of
stored elements to the total number of buckets. A high load factor indicates that
the hash table is almost full, so more collisions are likely to happen, thereby
reducing efficiency. When the load factor exceeds a predefined threshold, rehashing
is triggered. Rehashing creates a new, larger bucket array and reinserts all
existing elements into it, recomputing each index for the new array size.
This lowers the load factor and reduces the number of collisions.
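
A minimal Python sketch of this process follows. The threshold of 0.75, the doubling of the capacity and the class name ResizingTable are assumptions for illustration; real implementations choose their own values.

```python
MAX_LOAD_FACTOR = 0.75              # assumed threshold


class ResizingTable:
    def __init__(self, capacity=4):
        self.buckets = [[] for _ in range(capacity)]
        self.count = 0              # number of stored key-value pairs

    def load_factor(self):
        return self.count / len(self.buckets)

    def insert(self, key, value):
        bucket = self.buckets[hash(key) % len(self.buckets)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)
                return
        bucket.append((key, value))
        self.count += 1
        if self.load_factor() > MAX_LOAD_FACTOR:
            self._rehash()

    def search(self, key):
        for existing_key, value in self.buckets[hash(key) % len(self.buckets)]:
            if existing_key == key:
                return value
        return None

    def _rehash(self):
        # Collect every stored pair, then reinsert into a table twice as large.
        old_pairs = [pair for bucket in self.buckets for pair in bucket]
        self.buckets = [[] for _ in range(2 * len(self.buckets))]
        for key, value in old_pairs:
            self.buckets[hash(key) % len(self.buckets)].append((key, value))


table = ResizingTable(capacity=4)
for key in range(4):
    table.insert(key, str(key))
print(len(table.buckets), table.load_factor())    # 8 0.5
```

With four buckets, the fourth insertion pushes the load factor to 1.0, which exceeds the threshold, so the table grows to eight buckets and the load factor drops to 0.5.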

Applications of Hash Tables

Hash tables are used in various applications due to their efficiency and
simplicity:

Database Indexing

A database index lets records be found without scanning every row. A hash
table can serve as such an index by mapping each key to the location of its record.
This enables quick searches, insertions and deletions of records when looking up an
exact key. Most databases also already have unique keys suitable for hashing, such
as student IDs or barcodes.
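
The idea can be sketched in Python, whose built-in dict is itself a hash table. The records, field names and student IDs below are made-up sample data, and find_student is a hypothetical helper.

```python
# Made-up sample records standing in for rows of a database table.
records = [
    {"student_id": "S1001", "name": "Asha"},
    {"student_id": "S1002", "name": "Ben"},
    {"student_id": "S1003", "name": "Chen"},
]

# The index maps each unique key to the position of its record.
index = {record["student_id"]: pos for pos, record in enumerate(records)}


def find_student(student_id):
    """Look up a record through the hash index instead of scanning the list."""
    pos = index.get(student_id)
    return records[pos] if pos is not None else None


print(find_student("S1002"))    # {'student_id': 'S1002', 'name': 'Ben'}
print(find_student("S9999"))    # None
```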

Caching

Caching keeps frequently used data in a temporary, faster but limited storage
area so that it does not have to be fetched again from a slower source. Processors,
for example, use hardware caches to reduce RAM access times. In software, hash
tables are commonly used to store cached data, keyed by an identifier such as a
URL, so that lookups and updates are nearly instant. This makes them a key
component of caches such as those in web browsers.

Password Storage

Hashing is also used as a security measure, although with cryptographic hash
functions rather than the simple ones used to index a table. Unlike encryption, a
cryptographic hash is one-way: it cannot be reversed to recover its input. When a
user creates a password for a service, instead of storing the password directly,
the service generates a hash of the password combined with a random value unique
to that user (a salt), and stores the salt and the hash in its database. When the
user logs in again, the submitted password is hashed the same way and compared
with the stored hash. Therefore, if the database is compromised, only the hashes
are exposed and not the actual passwords. And since the hash function is hard to
invert, an attacker cannot simply decrypt the hashes and has to guess candidate
passwords instead.
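
A minimal sketch of storing and checking a password hash, assuming PBKDF2-HMAC-SHA256 from Python's standard hashlib module with a random salt generated per user. The algorithm, the iteration count and the function names are illustrative choices, not something the text prescribes.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000    # assumed work factor; tune for real deployments


def hash_password(password):
    """Return (salt, digest) to store in place of the password itself."""
    salt = os.urandom(16)    # random per-user salt
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Hash the submitted password the same way and compare the results."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return hmac.compare_digest(candidate, expected_digest)


salt, digest = hash_password("correct horse")
print(verify_password("correct horse", salt, digest))    # True
print(verify_password("wrong guess", salt, digest))      # False
```

Only the salt and the digest are stored; the password is never recovered, it is only re-hashed and compared.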

Error Detection and Correction

When data is transmitted over a network, it is prone to corruption from
interference, noise, or other disturbances. To check whether the data has been
altered, a hash code (also known as a checksum or hash value) is computed from the
original data before transmission and sent with the data. The receiving end
computes the hash code for the received data and compares it with the transmitted
hash code. If the codes match, the data can be considered intact, since corrupted
data is very unlikely to produce the same hash code. Otherwise, the data is
regarded as corrupted and discarded. This simple yet effective technique allows
systems to detect errors and to recover from them by requesting retransmission of
the corrupted data, thereby maintaining the reliability and accuracy of
communication networks.
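
The sender and receiver sides can be sketched in Python using CRC-32 from the standard zlib module as the checksum. CRC-32 is an assumed choice for the example, and make_packet and is_intact are hypothetical helper names.

```python
import zlib


def make_packet(payload: bytes):
    """Sender side: attach a CRC-32 checksum to the payload."""
    return payload, zlib.crc32(payload)


def is_intact(payload: bytes, checksum: int) -> bool:
    """Receiver side: recompute the checksum and compare."""
    return zlib.crc32(payload) == checksum


payload, checksum = make_packet(b"hello, world")
print(is_intact(payload, checksum))             # True
print(is_intact(b"hellp, world", checksum))     # False, one byte was altered
```

When the check fails, the receiver would discard the data and ask for it to be sent again.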

Conclusion

The efficiency of hash tables makes them a vital data structure in computer science.
Because they can perform most operations in constant time on average, they are
well suited to applications that need to access data frequently. To make the most
of hash tables, it is important to understand their structure, their basic
operations, and the ways they handle collisions. By applying these concepts,
developers can make full use of hash tables and improve the performance of their
software.
