A Hash Function

A hash function is an algorithm that maps data of variable length to data of a fixed length. Hash functions are mainly used to accelerate tasks such as finding items in a database or detecting duplicate records by mapping keys to smaller hash values. A hash function must always generate the same hash value for equal inputs, so that an element can be found quickly in a hash table. Although the concept has existed since the 1950s, designing good hash functions remains an active area of research, partly because collisions become more frequent as tables fill up.

Uploaded by Ritesh Dewangan
Copyright: © Attribution Non-Commercial (BY-NC)

A hash function is any algorithm or subroutine that maps large data sets of variable length, called keys, to smaller data sets of a fixed length. For example, a person's name, having a variable length, could be hashed to a single integer. The values returned by a hash function are called hash values, hash codes, hash sums, checksums or simply hashes.
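As a minimal sketch of the name-to-integer example, the following Python function hashes a string of any length to a 32-bit integer. The text does not name a particular hash function; FNV-1a is used here only as an illustrative choice, and the names `fnv1a_32` and `slot_for` are hypothetical.

```python
# Illustrative only: the source does not prescribe FNV-1a or any other function.
FNV_OFFSET_BASIS_32 = 0x811C9DC5  # standard 32-bit FNV offset basis
FNV_PRIME_32 = 0x01000193         # standard 32-bit FNV prime


def fnv1a_32(key: str) -> int:
    """Map a string of any length to an integer in the range 0 .. 2**32 - 1."""
    h = FNV_OFFSET_BASIS_32
    for byte in key.encode("utf-8"):
        h ^= byte                            # mix in the next byte
        h = (h * FNV_PRIME_32) & 0xFFFFFFFF  # multiply, keep the low 32 bits
    return h


def slot_for(key: str, table_size: int) -> int:
    """Reduce the fixed-length hash value to an index in a table of the given size."""
    return fnv1a_32(key) % table_size
```

Names of different lengths, such as "Ann" and "Bartholomew", both come out as integers of the same fixed width, and calling the function twice on the same name always gives the same value.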

Hash functions are mostly used to accelerate table lookup or data comparison tasks such as finding items in a database, detecting duplicated or similar records in a large file, and finding similar stretches in DNA sequences.

A hash function should be referentially transparent: if called twice on inputs that are "equal" (for example, strings that consist of the same sequence of characters), it should give the same result. This is a contract in many programming languages that allow the user to override equality and hash functions for an object: if two objects are equal, their hash codes must be the same. This is crucial to finding an element in a hash table quickly, because two equal elements always hash to the same slot.

A hash function may map two or more different keys to the same hash value, causing a collision. Good hash functions therefore try to map keys to hash values as evenly as possible, because collisions become more frequent as hash tables fill up. For this reason, the number of entries stored is frequently restricted to about 80% of the size of the table. Depending on the algorithm used, the table also needs a strategy for resolving the collisions that remain, such as double hashing or linear probing. Although the idea was conceived in the 1950s,[1] the design of good hash functions is still a topic of active research.

Hash functions are related to (and often confused with) checksums, check digits, fingerprints, randomization functions, error-correcting codes, and cryptographic hash functions. Although these concepts overlap to some extent, each has its own uses and requirements and is designed and optimized differently. The HashKeeper database maintained by the American National Drug Intelligence Center, for instance, is more aptly described as a catalog of file fingerprints than of hash values.
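The collision handling mentioned above can be sketched as a toy hash table that resolves collisions by linear probing and grows once it would become more than 80% full. This is an illustration under stated assumptions, not a reference implementation: the class name `LinearProbingTable`, its methods, and the doubling growth policy are all hypothetical, and it relies on Python's built-in `hash()` as the hash function.

```python
class LinearProbingTable:
    """Toy open-addressing hash table (hypothetical names, for illustration only)."""

    _EMPTY = object()  # sentinel marking a slot that has never been used

    def __init__(self, capacity=8, max_load=0.8):
        self._slots = [self._EMPTY] * capacity
        self._count = 0
        self._max_load = max_load  # assumed cap, mirroring the 80% figure in the text

    def _probe(self, key):
        """Return the index holding key, or the first empty slot on its probe path."""
        n = len(self._slots)
        i = hash(key) % n
        # Linear probing: on a collision, step to the next slot, wrapping around.
        while self._slots[i] is not self._EMPTY and self._slots[i][0] != key:
            i = (i + 1) % n
        return i

    def put(self, key, value):
        i = self._probe(key)
        if self._slots[i] is self._EMPTY:
            # New key: grow first if the table would exceed the load cap.
            if (self._count + 1) / len(self._slots) > self._max_load:
                self._grow()
                i = self._probe(key)
            self._count += 1
        self._slots[i] = (key, value)

    def get(self, key, default=None):
        slot = self._slots[self._probe(key)]
        return default if slot is self._EMPTY else slot[1]

    def _grow(self):
        """Double the capacity and re-insert every stored entry."""
        old = [s for s in self._slots if s is not self._EMPTY]
        self._slots = [self._EMPTY] * (2 * len(self._slots))
        for k, v in old:
            self._slots[self._probe(k)] = (k, v)
```

With the default capacity of 8, the integer keys 0, 8 and 16 all hash to slot 0 and end up in neighbouring slots, yet each remains retrievable. Keeping the table below full occupancy guarantees that every probe eventually reaches an empty slot. Deletion is left out on purpose, since it needs extra bookkeeping (for example tombstone markers) that the text does not discuss.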
