CSC508 Hashing

Hashing

Uploaded by

FATIN HUMAIRA ROSLIZAM

1

TOPIC 6B
Hashing

Zulaile Mabni
CHAPTER OBJECTIVES
▪ Learn about hashing
▪ Learn about hash methods
▪ Mid-square
▪ Folding
▪ Division
▪ Learn about collision
▪ Open addressing
▪ Chaining

2
▪ Need a data structure in which finds/searches are very fast
▪ Insert and Delete process should be fast too
▪ Objects have unique keys
▪ A key may be a single property/attribute value
▪ Or may be created from multiple properties/values

3
▪ Maximize efficiency: implement the operations Insert(),
Delete() and Search()/Find() efficiently.
▪ Arrays:
▪ not space efficient (assumes we leave empty space for keys
not currently in the structure)
▪ Linked List
▪ space efficient
▪ Insert(), Delete() and Search()/Find() not too efficient

▪ Hash Tables:
▪ Better than the above in terms of space and efficiency

4
▪ Very useful data structure
▪ Good for storing and retrieving key-value pairs
▪ Not good for iterating through a list of items

▪ Example applications:
▪ Storing objects according to ID numbers
▪ When the ID numbers are widely spread out
▪ When you don’t need to access items in ID order

5
▪ A hash value or hash index is used to index
the hash table (array)
▪ A hash function takes a key and returns a
hash value/index
▪ The hash index is an integer (so it can index an array)

▪ The key is a specific value associated with a specific object being stored in the hash table
▪ It is important that the key remain constant for the
lifetime of the object

6
▪ You want a hash function/algorithm that:
▪ Is fast
▪ Is easy to compute
▪ Minimizes the number of collisions
▪ Creates a good distribution of hash values so that the items
(based on their keys) are distributed evenly through the
array
▪ Hash functions can use as input
▪ Integer key values
▪ String key values
▪ Multipart key values
▪ Multipart fields, and/or
▪ Multiple fields

7
▪ The performance of the hash table depends on having a hash
function that evenly distributes the keys: uniform hashing is the
ideal target
▪ Choosing a good hash function requires taking into account the
kind of data that will be used.
▪ E.g., choosing the first letter of a last name will likely cause
lots of collisions, depending on the nationality of the
population.
▪ Most programming languages (including Java) have hash
functions built in.
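As a rough illustration of the built-in hashing mentioned above, the sketch below turns Java's `hashCode()` into a table index. The class name `BuiltInHashDemo`, the method `indexFor`, and the table size 101 are assumptions made for this example only. `Math.floorMod` is used because `hashCode()` can be negative, in which case `%` would return a negative remainder.

```java
// Illustrative sketch: class and method names are hypothetical.
final class BuiltInHashDemo {
    // Maps any key to a slot in [0, tableSize) using Java's built-in hashCode().
    // Math.floorMod keeps the result non-negative even for negative hash codes.
    static int indexFor(Object key, int tableSize) {
        return Math.floorMod(key.hashCode(), tableSize);
    }

    public static void main(String[] args) {
        int tableSize = 101; // assumed prime table size
        System.out.println(indexFor("Mabni", tableSize));
        // Integer.hashCode() is the int value itself, so this is 12345 mod 101.
        System.out.println(indexFor(12345, tableSize));
    }
}
```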

8
▪ Division (modular arithmetic)
▪ h(key) = key mod m
▪ m is the array size; in general, it should be a prime number
▪ Key X is first converted into an integer iX
▪ This integer is divided by the size of the hash table; the
remainder is the address of X in HT

9
▪ mod stands for modulo
▪ When you divide x by y, you get a result and a remainder
▪ Mod is the remainder
▪ 8 mod 5 = 3
▪ 9 mod 5 = 4
▪ 10 mod 5 = 0
▪ 15 mod 5 = 0

▪ Thus for key-value mod M, multiples of M give the same result, 0
▪ But multiples of other numbers do not give the same result
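The arithmetic above can be checked with a few lines of Java. This is only a sketch; the class name `ModDemo` and the method name `hash` are made up for the example, and keys are assumed to be non-negative.

```java
// Illustrative sketch: names are hypothetical, keys are assumed non-negative.
final class ModDemo {
    // Division-method hash: the remainder of key divided by m.
    static int hash(int key, int m) {
        return key % m;
    }

    public static void main(String[] args) {
        int[] keys = {8, 9, 10, 15};
        for (int key : keys) {
            // Multiples of 5 (10 and 15) both land on index 0.
            System.out.println(key + " mod 5 = " + hash(key, 5));
        }
    }
}
```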

10
Hash Tables – Conceptual View
h(key) = key mod 8

hash value/index   bucket   objects stored
7                  b1       obj1 (key=15)
6
5
4                  b2       Obj3 (key=4), Obj2 (key=36)
3
2                  b3       Obj4 (key=2)
1                  b4       Obj5 (key=1)
0

11
Suppose that each key is a string. The following
Java method uses the division method to compute
the address of the key:

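int hashmethod(String insertKey)
{
    int sum = 0;
    // Add up the character codes of the key.
    // j must stop before length(); charAt(length()) throws an exception.
    for (int j = 0; j < insertKey.length(); j++)
        sum = sum + (int)(insertKey.charAt(j));
    return (sum % HTSize);   // HTSize is the size of the hash table
}//end hashmethod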
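A small usage sketch of the method above, assuming a table size of 101 (the constant `HT_SIZE` and the class name `StringDivisionHash` are choices made for this example). It also shows one weakness of summing characters: keys that are anagrams of each other always collide.

```java
// Illustrative sketch: HT_SIZE = 101 and the class name are assumptions.
final class StringDivisionHash {
    static final int HT_SIZE = 101; // assumed prime table size

    // Division method on the sum of the key's character codes.
    static int hashmethod(String insertKey) {
        int sum = 0;
        for (int j = 0; j < insertKey.length(); j++) {
            sum += insertKey.charAt(j);
        }
        return sum % HT_SIZE;
    }

    public static void main(String[] args) {
        // 'a' + 'b' + 'c' = 97 + 98 + 99 = 294, and 294 % 101 = 92.
        System.out.println(hashmethod("abc"));
        // Same characters in another order give the same sum, hence a collision.
        System.out.println(hashmethod("cab"));
    }
}
```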

12
▪ Mid-Square
▪ The hash method, h, is computed by squaring the
identifier
▪ and then using an appropriate number of bits from the middle
of the square to obtain the bucket address
▪ Because the middle bits of a square usually depend on all the
characters of the key, different keys are expected to
yield different hash addresses with high
probability, even if some of their characters are the
same
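One possible way to code the mid-square idea is sketched below. The slides do not fix the key width or how many bits to keep, so this example assumes keys below 2^16 (so the square fits in 32 bits) and takes the number of middle bits, `r`, as a parameter. The names `MidSquareHash` and `hash` are hypothetical.

```java
// Illustrative sketch under stated assumptions: 0 <= key < 2^16 and 1 <= r <= 16.
final class MidSquareHash {
    // Returns an index in [0, 2^r) taken from the middle r bits of key squared.
    static int hash(int key, int r) {
        long square = (long) key * key;   // at most 32 significant bits
        int shift = (32 - r) / 2;         // discard the low-order bits below the middle
        long mask = (1L << r) - 1;        // keep exactly r bits
        return (int) ((square >>> shift) & mask);
    }

    public static void main(String[] args) {
        // 3000 * 3000 = 9,000,000; shifting right by 12 gives 2197; 2197 & 255 = 149.
        System.out.println(hash(3000, 8));
    }
}
```

With `r = 8` the table would have 256 buckets; a different table size needs a different `r`.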

13
14
▪ Folding
▪ Key X is partitioned into parts such that all the parts, except
possibly the last part, are of equal length
▪ The parts are then added, in some convenient way, to obtain the hash address
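The folding steps can be sketched as follows. This is one simple variant (often called shift folding) and rests on assumptions not stated in the slides: the key is a non-negative number, it is split into groups of decimal digits from the left, and the sum is reduced with mod to fit the table. The names `FoldingHash` and `hash` are hypothetical.

```java
// Illustrative sketch: decimal shift folding for a non-negative key.
final class FoldingHash {
    // Splits the decimal digits of key into parts of partDigits digits
    // (the last part may be shorter), adds the parts, and reduces mod tableSize.
    static int hash(long key, int partDigits, int tableSize) {
        String digits = Long.toString(key);
        long sum = 0;
        for (int start = 0; start < digits.length(); start += partDigits) {
            int end = Math.min(start + partDigits, digits.length());
            sum += Long.parseLong(digits.substring(start, end));
        }
        return (int) (sum % tableSize);
    }

    public static void main(String[] args) {
        // 123 + 456 + 789 = 1368, and 1368 % 1000 = 368.
        System.out.println(hash(123456789L, 3, 1000));
        // 123 + 456 + 78 = 657 (shorter last part), and 657 % 100 = 57.
        System.out.println(hash(12345678L, 3, 100));
    }
}
```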

15
[Worked example figure not recovered; the surviving step is: = 105 % 100]

16
▪ Usage summary:
int hashValue = hashFunction(key);     // int key
▪ Or hashValue = hashFunction(key);    // String key
▪ Or hashValue = hashFunction(item);   // itemType item

▪ Insert method:
public void insert(int key, itemType item) {
    int hashValue = hashFunction(key);
    table[hashValue] = item;   // overwrites whatever is already stored there
}

17
For example, if we hash keys 0…1000 into a hash table with 5
entries and use h(key) = key mod 5 , we get the following
sequence of events:

Table contents (key stored at each index) after each insert:

index   Insert 2   Insert 21   Insert 34   Insert 54
0
1                  21          21
2       2          2           2
3
4                              34          ??? (54 also hashes to 4)

There is a collision at array entry #4.
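The sequence of inserts above can be replayed in code. The sketch below assumes non-negative integer keys and h(key) = key mod tableSize; the class name `CollisionDemo` and the method `firstCollision` are invented for the example.

```java
// Illustrative sketch: names are hypothetical, keys are assumed non-negative.
final class CollisionDemo {
    // Inserts the keys in order using h(key) = key mod tableSize and returns
    // the first key that lands on an occupied slot, or -1 if none collides.
    static int firstCollision(int[] keys, int tableSize) {
        Integer[] table = new Integer[tableSize]; // null marks an empty slot
        for (int key : keys) {
            int index = key % tableSize;
            if (table[index] != null) {
                return key;
            }
            table[index] = key;
        }
        return -1;
    }

    public static void main(String[] args) {
        // 2 -> 2, 21 -> 1, 34 -> 4, 54 -> 4 (already holds 34).
        System.out.println(firstCollision(new int[] {2, 21, 34, 54}, 5));
    }
}
```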
18
▪ Algorithms to handle collisions
▪ Two categories of collision resolution techniques
▪ Open addressing (closed hashing)
▪ Chaining (open hashing)

19
▪ A problem arises when two keys hash to the
same array entry; this is called a collision.
▪ There are two ways to resolve collisions:

▪ Hashing with Chaining (a.k.a. “Separate Chaining”):
every hash table entry contains a pointer to a linked list
of the keys that hash to that entry

▪ Hashing with Open Addressing: every hash table entry
contains only one key. If a new key hashes to a table
entry that is already filled, systematically examine other table
entries until you find an empty entry in which to place the new
key

20
The problem is that keys 34 and 54 hash to the same entry (4). We
solve this collision by placing all keys that hash to the same hash table
entry in a chain (linked list) or bucket (array) pointed to by this entry:

index   after Insert 54   after Insert 101
0
1       21                101 → 21
2       2                 2
3
4       54 → 34           54 → 34   (CHAIN)
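A minimal sketch of separate chaining, assuming integer keys only (no associated data) and h(key) = key mod table size. The class `ChainedHashTable` and its methods are hypothetical. New keys are added at the front of the chain, which matches the order shown in the example (54 before 34, 101 before 21).

```java
import java.util.LinkedList;

// Illustrative sketch: a chained hash table that stores int keys only.
final class ChainedHashTable {
    private final LinkedList<Integer>[] table;

    @SuppressWarnings("unchecked")
    ChainedHashTable(int size) {
        table = new LinkedList[size];
        for (int i = 0; i < size; i++) {
            table[i] = new LinkedList<>();
        }
    }

    private int hash(int key) {
        return Math.floorMod(key, table.length);
    }

    // Adds the key to the front of its chain unless it is already present.
    void insert(int key) {
        LinkedList<Integer> chain = table[hash(key)];
        if (!chain.contains(key)) {
            chain.addFirst(key);
        }
    }

    boolean contains(int key) {
        return table[hash(key)].contains(key);
    }

    // Text form of one chain, for inspection.
    String chainAt(int index) {
        return table[index].toString();
    }
}
```

After inserting 2, 21, 34, 54 and 101 into a table of size 5, `chainAt(4)` gives `[54, 34]` and `chainAt(1)` gives `[101, 21]`.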
21
22
▪ Collisions are resolved by systematically examining other table
indexes, i0, i1, i2, …, until an empty slot is located.
▪ The key is first mapped to an array cell using the hash function (e.g.
key % array-size)
▪ If there is a collision, find an available array cell
▪ There are different algorithms to find (to probe for) the next array cell
▪ Linear probing
▪ Quadratic probing
▪ Random probing
▪ Double Hashing

23
▪ Suppose that an item with key X is to be inserted in HT
▪ Use hash function to compute index h(X) of item in HT
▪ Suppose h(X) = t.
▪ If HT[t] is empty, store item into array slot.
▪ Suppose HT[t] already occupied by another item; collision
occurs
▪ Linear probing: starting at location t, search array sequentially to
find next available array slot:
▪ (t + 1) % HTSize, (t + 2) % HTSize,…,(t + j) % HTSize
▪ Be sure to wrap around the end of the array!
▪ Stop when you have tried all possible array indices
▪ If the array is full, you need to throw an exception or, better
yet, resize the array

24
Pseudocode implementing linear probing:

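hIndex = hashmethod(insertKey);
found = false;
probes = 0;   // number of occupied slots examined so far
// Stop at an empty slot, at a duplicate, or once every slot has been tried.
// emptyKey is the sentinel that marks an unused slot.
while (HT[hIndex] != emptyKey && !found && probes < HTSize)
    if (HT[hIndex].key.equals(insertKey))
        found = true;
    else {
        hIndex = (hIndex + 1) % HTSize;   // wrap around the end of the array
        probes++;
    }
if (found)
    System.out.println("Duplicate items not allowed");
else if (probes == HTSize)
    System.out.println("Hash table is full");
else
    HT[hIndex] = newItem;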
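For a version that compiles on its own, here is a small linear-probing table. It is a sketch under simplifying assumptions: integer keys only, `null` marks an empty slot, and there is no deletion or resizing. The class name `LinearProbingTable` and its method names are hypothetical.

```java
// Illustrative sketch: open addressing with linear probing for int keys.
final class LinearProbingTable {
    private final Integer[] slots; // null marks an empty slot

    LinearProbingTable(int size) {
        slots = new Integer[size];
    }

    // Returns the index where key was stored, or -1 if key is a duplicate.
    // Throws if every slot is occupied by other keys.
    int insert(int key) {
        int start = Math.floorMod(key, slots.length);
        for (int j = 0; j < slots.length; j++) {
            int index = (start + j) % slots.length; // wrap around the end
            if (slots[index] == null) {
                slots[index] = key;
                return index;
            }
            if (slots[index] == key) {
                return -1;
            }
        }
        throw new IllegalStateException("hash table is full");
    }

    // Returns the index holding key, or -1 if key is not in the table.
    int find(int key) {
        int start = Math.floorMod(key, slots.length);
        for (int j = 0; j < slots.length; j++) {
            int index = (start + j) % slots.length;
            if (slots[index] == null) {
                return -1; // an empty slot ends the probe sequence
            }
            if (slots[index] == key) {
                return index;
            }
        }
        return -1;
    }
}
```

With a table of size 5, inserting 2, 21, 34 and then 54 places 54 at index 0: its home slot 4 is taken by 34, so the probe wraps around to (4 + 1) % 5.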

25
26
▪ Uses a random number generator to find the
next available slot
▪ The ith slot in the probe sequence is (h(X) + ri) %
HTSize, where ri is the ith value in a random
permutation of the numbers 1 to HTSize - 1
▪ Suppose HTSize = 101, h(X) = 26, and r1 = 2, r2 =
5, r3 = 8.
▪ The probe sequence of X then begins 26,
28, 31, 34
▪ All insertions and searches use the same
sequence of random numbers
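A sketch of how the probe sequence might be generated. Seeding the generator is one way (an assumption of this example, not something the slides specify) to make every insertion and search see the same permutation. The names `RandomProbing`, `offsets` and `probeSequence` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Illustrative sketch: random probing with a fixed, seeded permutation.
final class RandomProbing {
    // Builds a random permutation r1..r(htSize-1) of the numbers 1..htSize-1.
    // The same seed always yields the same permutation.
    static int[] offsets(int htSize, long seed) {
        List<Integer> values = new ArrayList<>();
        for (int i = 1; i < htSize; i++) {
            values.add(i);
        }
        Collections.shuffle(values, new Random(seed));
        int[] r = new int[htSize - 1];
        for (int i = 0; i < r.length; i++) {
            r[i] = values.get(i);
        }
        return r;
    }

    // Probe sequence: h, (h + r1) % htSize, (h + r2) % htSize, ...
    static int[] probeSequence(int h, int[] r, int htSize) {
        int[] seq = new int[r.length + 1];
        seq[0] = h;
        for (int i = 0; i < r.length; i++) {
            seq[i + 1] = (h + r[i]) % htSize;
        }
        return seq;
    }
}
```

Calling `probeSequence(26, new int[] {2, 5, 8}, 101)` reproduces the sequence 26, 28, 31, 34 from the example. Because the offsets are a permutation of 1 to HTSize - 1, the full sequence visits every slot exactly once.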
27
▪ In quadratic probing, starting at position t,
check the array locations (t + 1²) % HTSize, (t
+ 2²) % HTSize, …, (t + i²) % HTSize.
▪ There is no guarantee that it probes all the positions
in the table
▪ When HTSize is prime, quadratic probing
probes about half the table before repeating
the probe sequence
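The "about half the table" claim can be checked by counting how many distinct slots the sequence reaches. The sketch below counts the start position t as probe i = 0; the names `QuadraticProbing` and `distinctSlots` are hypothetical.

```java
// Illustrative sketch: counts the slots reachable by quadratic probing.
final class QuadraticProbing {
    // Number of distinct slots visited by (t + i*i) % htSize for i = 0..htSize-1.
    static int distinctSlots(int t, int htSize) {
        boolean[] seen = new boolean[htSize];
        int count = 0;
        for (int i = 0; i < htSize; i++) {
            int index = (int) ((t + (long) i * i) % htSize); // long avoids overflow of i*i
            if (!seen[index]) {
                seen[index] = true;
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(distinctSlots(0, 11));  // prime size: 6 of 11 slots
        System.out.println(distinctSlots(0, 16));  // non-prime size: 4 of 16 slots
    }
}
```

For a prime size such as 11 the sequence reaches 6 slots, roughly half the table; for size 16 it reaches only 4, which is one reason a prime table size is preferred.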

28
29
▪ Apply a second hash function after the first
▪ The second hash function, like the first, depends on the key
▪ The secondary hash function must
▪ Be different from the first
▪ Never generate a zero (a zero step size would probe the same slot forever)

▪ Good algorithm:
▪ arrayIndex = (arrayIndex + stepSize) % arraySize;
▪ Where stepSize = constant - (key % constant)
▪ And constant is a prime less than the array size
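The formula above can be sketched as follows, assuming non-negative integer keys and a first hash of key % arraySize. The class name `DoubleHashing`, the method names, and the sample values (array size 11, constant 7) are choices made for this example.

```java
// Illustrative sketch: double hashing for non-negative int keys.
final class DoubleHashing {
    // Step size from the formula above; always between 1 and constant, never zero.
    static int stepSize(int key, int constant) {
        return constant - (key % constant);
    }

    // First `count` slots probed for the key.
    static int[] probeSequence(int key, int arraySize, int constant, int count) {
        int[] seq = new int[count];
        int arrayIndex = key % arraySize;     // first hash function
        int step = stepSize(key, constant);   // second hash function
        for (int i = 0; i < count; i++) {
            seq[i] = arrayIndex;
            arrayIndex = (arrayIndex + step) % arraySize;
        }
        return seq;
    }

    public static void main(String[] args) {
        // key 25, arraySize 11, constant 7: start 25 % 11 = 3, step 7 - (25 % 7) = 3.
        for (int index : probeSequence(25, 11, 7, 5)) {
            System.out.print(index + " ");    // 3 6 9 1 4
        }
        System.out.println();
    }
}
```

Two keys that share a home slot usually get different step sizes, so they follow different probe sequences.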

30
31
32
33
▪ http://www.cs.auckland.ac.nz/software/AlgAnim/hash_tables.html

34
▪ Malik D.S., Nair P.S., Data Structures Using Java, Course
Technology, 2003.

▪ Weiss Mark Allen, Data Structures & Algorithm Analysis in C++,
Pearson Education International Inc, 2003.
