Hash Function

A hash function maps arbitrary-sized data to fixed-size values, known as hash values, which are used in hash tables for efficient data indexing. Hashing is a one-way process that produces a fixed-length output regardless of input size, and various algorithms exist, such as MD5 and SHA-256. Collision resolution techniques include open hashing (chaining) and closed hashing (open addressing), each addressing how to handle multiple values mapping to the same hash index.

Uploaded by

soveme7713
Copyright
© All Rights Reserved

Hash Function

 A hash function is any function that can be used to map data of arbitrary size to fixed-size
values.
 The values returned by a hash function are called hash values, hash codes, digests, or simply
hashes.
 The values are usually used to index a fixed-size table called a hash table.
 Definition: A hash function is a function that takes a set of inputs of any arbitrary size and fits
them into a table or other data structure that contains fixed-size elements.
 Hashing refers to the concept of taking any arbitrary amount of input data (any data – word
document, audio file, video file, executable file, etc.) and applying the hashing algorithm to it.
 The algorithm generates output that looks like gibberish, called the ‘hash’ or ‘hash value’. This
hash value is also known as a message digest.

 There are two important things to note, however. First, hashing is a one-way function: it can be
used on any input data to generate a hash value, but the input data cannot practically be
recovered from the hash value. Second, hashing always produces a fixed-length hash value,
irrespective of the length or size of the input.

 A Few Examples

Input: My name is Ankita.
Hash: 6795D462DE738ED46BD0323B951651735A327007

Input: My name is Ankita. I am writing this article for MyLawrd- Technology Law and Policy.
Hash: 9525DGF5654DG54JF2GER6RD4XZ7V465Y6HI1J

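 In the above examples, the hash is different for different inputs. Also, the input length varies,
but the size of the hash value is meant to remain the same. (The second hash string is illustrative
only: it contains characters outside 0-9 and A-F, so it is not a literal hexadecimal digest.)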
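
The fixed-length property can be checked with a short Python sketch. It assumes SHA-1, because the first hash above has 40 hexadecimal characters (160 bits); the source does not name the algorithm, and the helper name sha1_hex is only illustrative.

```python
import hashlib


def sha1_hex(text: str) -> str:
    """Return the SHA-1 digest of text (UTF-8 encoded) as uppercase hex."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest().upper()


short_input = "My name is Ankita."
long_input = (
    "My name is Ankita. I am writing this article for "
    "MyLawrd- Technology Law and Policy."
)

short_hash = sha1_hex(short_input)
long_hash = sha1_hex(long_input)

# Both digests have the same length even though the inputs differ in size.
print(len(short_hash), len(long_hash))
# Different inputs give different digests.
print(short_hash != long_hash)
```

Whatever the input length, the digest length printed for both inputs is the same, which is the second property noted above.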

 A Few Hash Function Algorithms and Their Output Sizes

MD5: 128 bits
SHA-1: 160 bits
SHA-224: 224 bits
SHA-256: 256 bits
SHA-384: 384 bits
SHA-512: 512 bits
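
As a quick check of the sizes listed above, the following sketch asks Python's standard hashlib module for the digest size of each algorithm. The names ALGORITHMS and digest_bits are illustrative, and the sketch assumes a Python build in which all six algorithms are enabled.

```python
import hashlib

# hashlib names for the algorithms listed above.
ALGORITHMS = ["md5", "sha1", "sha224", "sha256", "sha384", "sha512"]


def digest_bits(name: str) -> int:
    """Return the output size of the named hash algorithm in bits."""
    # digest_size is reported in bytes, so multiply by 8.
    return hashlib.new(name).digest_size * 8


sizes = {name: digest_bits(name) for name in ALGORITHMS}
for name, bits in sizes.items():
    print(f"{name}: {bits} bits")
```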

 A hash value is often also called a checksum, which is why the command-line tool that computes
SHA-256 hashes is named sha256sum.
 You can compute the hash value of data using software such as ‘Hash My Files’ or ‘Hasher’.
 Such software gives you the option to choose the hash function you want to run your data
through.
 Hash tables, if used properly, save a great deal of time when searching for important data, so it
is important to know how to use them well.
 How well a hash table performs depends largely on the hash function used to generate the
value that corresponds to each key.
 Qualitative information about the keys can be useful when designing good hash functions.
 An ideal hash function maps keys to hash values as if at random, so that every bucket is equally
filled and the average complexity stays as low as possible even when there are patterns in the
input. The mapping of one key does not depend on the mapping of other keys. This property is
called simple uniform hashing.

 The different examples of implementing Hash Functions are:

Division Hash
Knuth Variant on Division Hash
Multiplication Hashing
Hashing functions for strings
PJW hash
CRC variant of hashing
BUZ hash
Universal Hashing
Perfect Hashing

Division Hash

 Probably the most common type of hash function. It uses basic properties of division to
generate the values for the corresponding keys.
Function:
h(k) = k mod m
or, in more general terms,
h(k) = (ak + b) mod m
where k is the key, m is the size of our hash table, and a and b are constants.
 We should choose a table size m that is prime and not close to a power of 2. The division
method does not work as desired if there are patterns in the input data.
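
A minimal Python sketch of the division hash, including the general form with constants a and b. The patterned keys (multiples of 4) and the table sizes 8 and 7 are assumptions chosen only to illustrate why a prime size helps; they do not come from the source.

```python
def division_hash(key: int, m: int, a: int = 1, b: int = 0) -> int:
    """h(k) = (a*k + b) mod m; with a=1 and b=0 this is plain k mod m."""
    return (a * key + b) % m


# Hypothetical patterned input: every key is a multiple of 4.
keys = range(0, 400, 4)

# With m = 8 (a power of 2) the keys land in only two buckets.
buckets_pow2 = {division_hash(k, 8) for k in keys}
# With m = 7 (a prime) the same keys spread over all seven buckets.
buckets_prime = {division_hash(k, 7) for k in keys}

print(sorted(buckets_pow2))   # [0, 4]
print(sorted(buckets_prime))  # [0, 1, 2, 3, 4, 5, 6]
```

With the power-of-2 size, most of the table stays empty for this input, while the prime size uses every bucket.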

Example:
Calculating a Hash Table:
 Formal definitions of hash functions vary from application to
application.
 Let’s take a simple example by taking each number mod 10, and
putting it into a hash table that has 10 slots.

Numbers to hash: 22, 3, 18, 29

 We take each value, apply the hash function to it, and the result tells us what slot to put
that value in. In the resulting table, the left column denotes the slot, and the right column
denotes the value in that slot, if any.

 Our hash function here is to take each value mod 10. We hash the values in the order we get
them, so the first value we hash is the first value in the sequence, and the last value we hash
is the last value in the sequence. The resulting placements are:

22 mod 10 = 2, so it goes in slot 2.
3 mod 10 = 3, so it goes in slot 3.
18 mod 10 = 8, so it goes in slot 8.
29 mod 10 = 9, so it goes in slot 9.
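
The same placement can be reproduced with a short Python sketch. The function name build_table is illustrative; the sketch simply overwrites a slot that is already occupied, which is safe here only because these four numbers do not collide.

```python
def build_table(values, size=10):
    """Place each value in slot (value mod size) of a fixed-size table."""
    table = [None] * size  # None marks an empty slot
    for value in values:
        slot = value % size
        table[slot] = value  # no collision handling in this sketch
    return table


table = build_table([22, 3, 18, 29])

# Print the slot number on the left and the stored value, if any, on the right.
for slot, value in enumerate(table):
    print(slot, "" if value is None else value)
```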

Collisions

Definition: A collision occurs when more than one value hashed by a particular hash function maps
to the same slot in the table or data structure (hash table) being generated by the hash function.

Example Hash Table With Collisions:

Let’s take the exact same hash function from before: take the
value to be hashed mod 10, and place it in that slot in the hash
table.

Numbers to hash: 22, 9, 14, 17, 42

Here 22 and 42 both give 2 when taken mod 10, so they collide in slot 2.
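
A small Python sketch that groups these numbers by slot and reports any slot that receives more than one value. The helper name group_by_slot is illustrative.

```python
from collections import defaultdict


def group_by_slot(values, size=10):
    """Map each slot index to the list of values that hash to it."""
    slots = defaultdict(list)
    for value in values:
        slots[value % size].append(value)
    return dict(slots)


slots = group_by_slot([22, 9, 14, 17, 42])

# A collision is any slot that received more than one value.
collisions = {slot: vals for slot, vals in slots.items() if len(vals) > 1}
print(collisions)  # {2: [22, 42]}
```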

Collision Resolution Techniques:

Open Hashing (Chaining)
Closed Hashing (Open Addressing)

 The Open Hashing or Chaining method creates an external chain of the values that have the same
index. The chain is built from that position as a linked list. A collision is resolved by storing
multiple values together at that same index.

 Closed Hashing or Open Addressing tries to use the empty indexes in a hash table to handle
collisions. In this method, the size of the hash table needs to be at least as large as the number
of keys in order to store all the elements.
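
The chaining method can be sketched in Python as follows. This is a simplified illustration, not a definitive implementation: the class name ChainedHashTable is hypothetical, the hash function is the mod-size function from the earlier example, and each chain is a Python list rather than a hand-built linked list.

```python
class ChainedHashTable:
    """Hash table that resolves collisions by chaining (open hashing)."""

    def __init__(self, size=10):
        self.size = size
        # One chain per index; a Python list stands in for the linked list.
        self.buckets = [[] for _ in range(size)]

    def _index(self, key):
        return key % self.size

    def add(self, key):
        bucket = self.buckets[self._index(key)]
        if key not in bucket:  # ignore duplicates
            bucket.append(key)

    def find(self, key):
        return key in self.buckets[self._index(key)]

    def remove(self, key):
        bucket = self.buckets[self._index(key)]
        if key in bucket:
            bucket.remove(key)
            return True
        return False


chained = ChainedHashTable()
for number in [22, 9, 14, 17, 42]:
    chained.add(number)

# 22 and 42 share index 2, so both are kept in the same chain.
print(chained.buckets[2])  # [22, 42]
```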

Open Addressing
 Open addressing, or closed hashing, is the second most used method to resolve collisions.
 This method aims to keep all the elements in the same table and tries to find empty slots for
values.
 Closed hashing refers to the fact that the values always stay stored in the hash table itself.
 Open addressing is so named because the location of a value is not fixed: the value can be
placed in another empty slot if a collision happens.

 This method resolves collisions by probing, or searching through the hash table, for indexes that
are available for storing elements.

 Unlike open hashing or chaining, open addressing stores at most one value in each index. The
basic operations of this method are adding, removing, and finding an element.

There are different approaches to these operations, but the well-known ones are:

Linear Probing
Quadratic Probing
Double Hashing
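
As an illustration of the first approach, here is a minimal Python sketch of open addressing with linear probing. The class name LinearProbingTable, the mod-size hash function, and the use of a "deleted" marker (tombstone) for removal are assumptions made for this sketch; the source does not specify these details.

```python
class LinearProbingTable:
    """Open addressing: every value lives in the table itself."""

    _EMPTY = object()    # slot has never been used
    _DELETED = object()  # slot held a value that was removed (tombstone)

    def __init__(self, size=10):
        self.size = size
        self.slots = [self._EMPTY] * size

    def _probe(self, key):
        """Yield indexes starting at key mod size, stepping by 1 with wrap-around."""
        start = key % self.size
        for step in range(self.size):
            yield (start + step) % self.size

    def add(self, key):
        first_free = None
        for index in self._probe(key):
            slot = self.slots[index]
            if slot is self._EMPTY:
                if first_free is None:
                    first_free = index
                break  # key cannot be stored beyond an empty slot
            if slot is self._DELETED:
                if first_free is None:
                    first_free = index  # reusable, but keep checking for key
            elif slot == key:
                return index  # already present
        if first_free is None:
            raise OverflowError("hash table is full")
        self.slots[first_free] = key
        return first_free

    def find(self, key):
        for index in self._probe(key):
            slot = self.slots[index]
            if slot is self._EMPTY:
                return None  # an empty slot ends the search
            if slot is not self._DELETED and slot == key:
                return index
        return None

    def remove(self, key):
        index = self.find(key)
        if index is None:
            return False
        self.slots[index] = self._DELETED  # keep later probes reachable
        return True


probing = LinearProbingTable()
positions = [probing.add(n) for n in [22, 9, 14, 17, 42]]
print(positions)  # [2, 9, 4, 7, 3]
```

Here 42 hashes to slot 2, which is already taken by 22, so it moves on to the next free slot, 3. Quadratic probing and double hashing would mainly change the sequence of indexes produced by _probe.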
