Hashing Part 1 Lecture

This is a lecture that gives an introduction to hashing techniques in file processing and organization.

Uploaded by

Yousef Okasha

Hashing

Lec12- spring 2019

Presented by: Dr. Emad Nabil
Lecture Overview

• Introduction to Hashing
• Hash functions
• Distribution of records among addresses
• Synonyms and collisions
• Collision resolution by progressive overflow or linear probing
Motivation
• Hashing is a useful searching technique which can be used for implementing
indexes.

• The main motivation for Hashing is improving searching time.

• The search time of hashing compares with that of other methods as follows:
• Binary search: O(log2 N)
• Hashing: O(1) on average
What is Hashing ?
• The idea is to discover the location of a key by simply examining the
key.
• For that we need to design a hash function.

• A Hash Function is:

• a function h(k) that transforms a key into an address.

key → h → address
• An address space is chosen beforehand.
• For example we may decide the file will have 1000 available addresses.

• If U is the set of all possible keys, the hash function maps U to the address space {0, 1, ..., 999}.
Hashing Example
Collision

• LOWELL, LOCK, OLIVER, and any other word whose first two letters are L and O (in either order) are mapped to the same address:
h(LOWELL) = h(LOCK) = h(OLIVER) = 4
• These keys are called synonyms.
• The address "4" is said to be the home address of each of these keys.

• Two different keys may be sent to the same address, generating a collision.
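The hash function behind this example is not reproduced in the text. One rule that is consistent with the result h = 4 and with the 1000-address space chosen earlier is to multiply the character codes of the first two letters and keep the remainder modulo 1000. The sketch below assumes that rule; the name `first_two_letters_hash` is illustrative only.

```python
def first_two_letters_hash(key: str, address_space: int = 1000) -> int:
    """Multiply the character codes of the first two letters and reduce
    the product modulo the address space.

    Assumed rule for illustration; it is not taken from the slides.
    """
    return (ord(key[0]) * ord(key[1])) % address_space


for name in ("LOWELL", "LOCK", "OLIVER"):
    # 'L' is 76 and 'O' is 79, so the product is 6004 in either order.
    print(name, first_two_letters_hash(name))
```

Because multiplication is commutative, "LO..." and "OL..." keys always collide under this rule, which is why a function that looks at so little of the key produces many synonyms.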
Collision resolution
• Avoiding collisions is extremely difficult
• Ways of reducing collisions
1. Spread out the records by choosing a good hash function

2. Use extra memory, i.e. increase the size of the address space.
ex: reserve 5000 available addresses rather than 1000

3. Put more than one record at a single address (use of buckets)

• Addresses generated from the keys should be uniformly and randomly distributed.

• The hash function should minimize collisions.

1. Division Method
2. Multiplication Method
3. Extraction Method
4. Mid-Square Hashing
5. Folding Technique
6. Rotation
7. Universal Hashing

1. Division Method
• One required feature of a hash function is that the resulting index must lie within the table's index range.
• One simple choice for a hash function is modulus division, written MOD (the % operator in C/C++).
• The function returns an integer.
• If any parameter is NULL, the result is NULL.
• Hash(Key) = Key % m, where m is the hash table size (for example, m = 307).
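As a minimal sketch of the division method, the function below applies Hash(Key) = Key % m with the table size m = 307 from the slide. The function name and the sample keys are made up for illustration.

```python
def division_hash(key: int, m: int = 307) -> int:
    """Division method: the remainder of key / m is always in 0..m-1."""
    return key % m


for key in (1000, 307, 614):
    # 307 and 614 are both multiples of m, so they share home address 0.
    print(key, "->", division_hash(key))
```

Note that in Python the result of `%` with a positive m is never negative, whereas in C and C++ the remainder of a negative key can be negative, so a C version would need an extra adjustment for negative keys.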
2. Multiplication Method
• The multiplication method works as follows:
1. Multiply the key k by a constant A in the range 0 < A < 1.
2. Extract the fractional part of kA.
3. Multiply this value by m and take the floor of the result.

h(k) = floor(m (kA mod 1))

where (kA mod 1) denotes the fractional part of kA, that is, kA − floor(kA), and m is the hash table size.

The optimal choice of A depends on the characteristics of the data being hashed. Knuth recommends A ≈ (√5 − 1)/2 ≈ 0.6180339887.

2. Multiplication Method Example
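The worked example from this slide is not reproduced in the text. The sketch below follows the three steps of the multiplication method using floating-point arithmetic and assumes A = (√5 − 1)/2; the key 123456 and table size 16384 are sample values chosen for illustration.

```python
import math

# Assumed constant: (sqrt(5) - 1) / 2, approximately 0.6180339887.
A = (math.sqrt(5) - 1) / 2


def multiplication_hash(key: int, m: int, a: float = A) -> int:
    """h(k) = floor(m * (k*a mod 1)).

    Floating point is adequate for a sketch; very large keys would lose
    precision and call for fixed-point integer arithmetic instead.
    """
    fractional = (key * a) % 1.0      # step 2: fractional part of k*A
    return math.floor(m * fractional)  # step 3: scale by m and floor


print(multiplication_hash(123456, 16384))  # 67
```

Unlike the division method, the value of m is not critical here, so a power of two is a common choice.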
3. Extraction Method
• When only a portion of the key is used for the address calculation, the technique is called the extraction method.
• In digit extraction, a few digits are selected and extracted from the key and used as the address. In the table below, the first, third, and fifth digits are extracted.

Key      Hashed Address
345678   357
234137   243
952671   927
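The three table rows are consistent with taking the first, third, and fifth digits of the key. The sketch below assumes that selection; the function name and the `positions` parameter are illustrative.

```python
def extract_digits(key: int, positions=(0, 2, 4)) -> int:
    """Digit extraction: build the address from selected digits of the key.

    `positions` are zero-based indexes into the decimal digits; the default
    picks the 1st, 3rd, and 5th digits (an assumption matching the table).
    """
    digits = str(key)
    return int("".join(digits[p] for p in positions))


for key in (345678, 234137, 952671):
    print(key, "->", extract_digits(key))
```

Which digits to extract is a design choice: positions that vary most across the real keys spread the records best.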
4. Mid-Square Hashing
• Mid-square hashing takes the square of the key and extracts the middle digits of the squared key as the address.
• The difficulty arises when the key is large. Because the entire key participates in the address calculation, the square of a large key may exceed the storage limit.
• So mid-square is used when the key is at most 4 digits long.

Key    Square    Hashed Address
2341   5480281   802
1671   2792241   922
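A minimal sketch of mid-square hashing follows. It assumes a 3-digit address and picks the window of middle digits that reproduces the two table rows; the function name and the exact window rule are assumptions.

```python
def mid_square(key: int, digits: int = 3) -> int:
    """Square the key and return `digits` digits from the middle.

    The window starts at len // 2 - 1, an assumed convention chosen so
    that the 7-digit squares in the table yield their central 3 digits.
    """
    squared = str(key * key)
    start = len(squared) // 2 - 1
    return int(squared[start:start + digits])


for key in (2341, 1671):
    print(key, key * key, mid_square(key))
```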

The difficulty of storing the square of a large number can be overcome by squaring only a few digits of the key instead of the whole key. If the key is large, we select a portion of it and square that portion.

Keys and addresses obtained by extracting a few digits, squaring them, and extracting the middle digits again:

Key      Square                Hashed Address
234137   234 x 234 = 054756    475
567187   567 x 567 = 321489    148
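The variant that squares only part of the key can be sketched as follows. It assumes the first three digits are the selected portion, pads the square with leading zeros to six digits, and takes the third to fifth digits as the address. The key 167000 is a made-up value that shows the zero padding.

```python
def partial_mid_square(key: int, prefix_len: int = 3, digits: int = 3) -> int:
    """Square the leading `prefix_len` digits and return middle digits.

    The square is zero-padded to 2 * prefix_len digits so the window is
    the same for every key (assumed convention, not from the slides).
    """
    prefix = int(str(key)[:prefix_len])
    squared = str(prefix * prefix).zfill(2 * prefix_len)
    start = len(squared) // 2 - 1
    return int(squared[start:start + digits])


print(partial_mid_square(567187))  # 567 * 567 = 321489 -> 148
print(partial_mid_square(167000))  # 167 * 167 = 027889 -> 788
```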
5. Folding Technique
• The key is split into subparts, each of which can be the same size as the address.

To compute this hash function, apply 3 steps:
• Step 1. Transform the key into a number.
• Step 2. Fold and add, taking the result mod a prime number.

• Step 3. Divide by the size of the address space (preferably a prime number) and use the remainder as the address.
• Dividing by a number that has many small factors may result in many collisions.
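The three steps above can be sketched as follows. Every constant here is an assumption for illustration: keys are padded with spaces to 12 characters, character codes are folded in pairs, the running sum is kept below the prime 19937, and the address space has 101 entries.

```python
def fold_and_add_hash(key: str, table_size: int = 101,
                      width: int = 12, guard: int = 19937) -> int:
    """Fold-and-add hashing (all parameters are illustrative assumptions).

    Step 1: pad the key and view each pair of characters as one number.
    Step 2: add the pairs, reducing mod a prime after each addition.
    Step 3: take the remainder modulo the size of the address space.
    `width` must be even so the key splits cleanly into pairs.
    """
    padded = key.ljust(width)[:width]
    total = 0
    for i in range(0, width, 2):
        pair = ord(padded[i]) * 100 + ord(padded[i + 1])  # e.g. "LO" -> 7679
        total = (total + pair) % guard
    return total % table_size


print(fold_and_add_hash("LOWELL"))  # folded sum 13883, 13883 % 101 = 46
```

Python integers do not overflow, so the reduction in step 2 only mirrors what a fixed-width integer implementation would need.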
6. Rotation
• When keys are serial, they differ only in the last digit, which leads to the creation of synonyms.
• Rotating the key reduces this problem. This method is used along with other methods.
• Here, the key is rotated right by one digit; applying folding afterwards then avoids the synonyms.

• For example, if the key is 120605, rotating it gives 512060.
• The address is then calculated using any other hash function.
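A minimal sketch of the rotation step, treating the key as a string of decimal digits. The function name and the `places` parameter are illustrative.

```python
def rotate_right(key: int, places: int = 1) -> int:
    """Move the last `places` decimal digits of the key to the front.

    If a zero ends up in front, converting back to int drops it, so a
    fixed-width implementation would keep the rotated key as a string.
    """
    digits = str(key)
    places %= len(digits)
    if places == 0:
        return key
    return int(digits[-places:] + digits[:-places])


print(rotate_right(120605))  # 512060
print(rotate_right(120606))  # 612060
```

The serial keys 120605 and 120606 differ only in the last digit, but after rotation they differ in the first digit, so a later folding or extraction step sees more variation.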
Some Other Hashing Methods
• Radix Transformation
• Transform the number into another base, then divide by the maximum address and take the remainder.
If Hash(Key1) = Hash(Key2), then Key1 and Key2 are synonyms and a collision happens.
Assume the hash value is the relative record number (RRN) and that we are working with fixed-length records.
Distribution of Records among Addresses

• Uniform distributions are extremely rare.

• Random distributions are acceptable and more easily obtainable.
1. Open addressing
The first collision resolution method, open addressing, resolves collisions in the home area. When a collision
occurs, the home area addresses are searched for an open or unoccupied element where the new data can be
placed. Examples of Open Addressing Methods:
1.1. Linear probing or progressive overflow
1.2. Quadratic probing
1.3. Double hashing

2. Bucket hashing (defers collision but does not prevent it)

3. Separate chaining
4. Separate chaining with overflow area

1.1. Open Addressing: Progressive Overflow or Linear Probing

H = F(key) is the home address. If it is available, we store the record there; otherwise, we increase H by k and try again:
H = (H + k) mod tableSize, (k ≥ 1)
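The probing rule above can be sketched as follows. The table is a Python list in which None marks a free slot; the function name, the table size of 7, and the choice F(key) = key % 7 in the usage lines are assumptions for illustration.

```python
from typing import List, Optional


def linear_probe_insert(table: List[Optional[int]], key: int,
                        home: int, k: int = 1) -> Optional[int]:
    """Store `key` at its home address or at the next free slot.

    Returns the address used, or None if no free slot was found within
    len(table) probes. With k > 1 the probes reach every slot only when
    k and the table size share no common factor.
    """
    size = len(table)
    address = home % size
    for _ in range(size):
        if table[address] is None:
            table[address] = key
            return address
        address = (address + k) % size  # H = (H + k) mod tableSize
    return None


table: List[Optional[int]] = [None] * 7
for key in (10, 17, 24):  # all three keys have home address 3
    print(key, "->", linear_probe_insert(table, key, key % 7))
```

The three synonyms end up in adjacent slots 3, 4, and 5, which is the start of a cluster.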
Collision Resolution: Progressive Overflow
• Advantage
• Simplicity
• Disadvantage
• If there are lots of collisions, clusters of records can form as in the previous
example.
1.2. Quadratic Probing
H = F(key)
H_i = (H + i^2) % tableSize, i ≥ 1

• If there is a collision at hash address H, this method probes the table at locations H+1, H+4, H+9, ...,
• that is, at locations (H + i^2) mod tableSize for i = 1, 2, ....
• That is, the increment function is i^2.
• Quadratic probing substantially reduces clustering, but it is not guaranteed to probe every location in the table.
1.3. Double Hashing

H = F1(key), to compute the home address

Step = F2(key)
Table size = M
H_i = (H + i*Step) % M, i ≥ 0; repeat this until we find a place or we return to the starting point.

Double hashing represents an improvement over linear or quadratic probing.

Double hashing uses nonlinear probing by computing different probe increments for different keys. It uses two functions.

The first function computes the original address. If the slot is available (or the record is found), we stop there; otherwise, we apply the second hash function to compute the step value.
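A minimal sketch of insertion with double hashing follows. The two hash functions are assumptions, since the slides do not define them: F1(key) = key % M and F2(key) = 1 + key % (M − 1), which keeps the step from ever being zero. The table size 7 is a sample prime value.

```python
from typing import List, Optional


def double_hash_insert(table: List[Optional[int]], key: int) -> Optional[int]:
    """Insert `key` by probing (home + i * step) % M for i = 0, 1, ...

    F1 and F2 below are illustrative choices. Returns the address used,
    or None if M probes found no free slot.
    """
    m = len(table)
    home = key % m            # F1: home address (assumed)
    step = 1 + key % (m - 1)  # F2: step in 1..M-1, never zero (assumed)
    for i in range(m):
        address = (home + i * step) % m
        if table[address] is None:
            table[address] = key
            return address
    return None


table: List[Optional[int]] = [None] * 7
for key in (10, 17, 24):  # all three keys have home address 3
    print(key, "->", double_hash_insert(table, key))
```

The three synonyms get steps 5, 6, and 1, so they land at addresses 3, 2, and 4 instead of following one shared probe path. Choosing a prime table size means every step value shares no factor with M, so the probe sequence can reach every slot.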