DSA Hash

The document explains hash tables, focusing on their efficiency in searching, adding, and deleting data compared to arrays and linked lists. It details the process of building a hash table, including the use of hash functions, handling collisions, and the differences between hash sets and hash maps. Additionally, it discusses the implementation of hash sets and hash maps in Python, emphasizing the importance of effective hash functions for maintaining performance.

Uploaded by

Sameer Sohail

DATA STRUCTURE AND ALGORITHMS – HASH TABLES
Dr. Muhammad Awais Sattar
Assistant Professor RSCI

HASH TABLE
 A Hash Table is a data structure designed to be fast to work with.
 The reason Hash Tables are sometimes preferred instead of arrays or linked
lists is because searching for, adding, and deleting data can be done really
quickly, even for large amounts of data.
 In a Linked List, finding a person "Bob" takes time because we would have to
go from one node to the next, checking each node, until the node with "Bob"
is found.
 And finding "Bob" in an Array could be fast if we knew the index, but when we
only know the name "Bob", we need to compare each element (like with
Linked Lists), and that takes time.
 With a Hash Table however, finding "Bob" is done really fast because there is
a way to go directly to where "Bob" is stored, using something called a hash
function.
BUILDING A HASH TABLE FROM
SCRATCH
 To get the idea of what a Hash Table is, let's try to build
one from scratch, to store unique first names inside it.
 We will build the Hash Set in 5 steps:
1. Starting with an array.
2. Storing names using a hash function.
3. Looking up an element using a hash function.
4. Handling collisions.
5. The basic Hash Set code example and simulation.
STEP 1: STARTING WITH AN ARRAY
 Using an array, we could store names like this:
my_array = ['Pete', 'Jones', 'Lisa', 'Bob', 'Siri']
 To find "Bob" in this array, we need to compare each name, element by
element, until we find "Bob".
 If the array was sorted alphabetically, we could use Binary Search to find a
name quickly, but inserting or deleting names in the array would mean a
big operation of shifting elements in memory.
 To make interacting with the list of names really fast, let's use a Hash Table
for this instead, or a Hash Set, which is a simplified version of a Hash Table.
 To keep it simple, let's assume there are at most 10 names in the list, so the
array can have a fixed size of 10 elements. When talking about Hash Tables,
each of these elements is called a bucket.
my_hash_set = [None,None,None,None,None,None,None,None,None,None]
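For comparison, the two array-based lookups described in this step can be sketched as follows. This is only an illustration: the helper names linear_search and binary_search are made up here, and the binary search relies on Python's standard bisect module.

```python
from bisect import bisect_left

my_array = ['Pete', 'Jones', 'Lisa', 'Bob', 'Siri']

def linear_search(names, target):
    """Compare each element in turn; return its index, or -1 if absent."""
    for index, name in enumerate(names):
        if name == target:
            return index
    return -1

def binary_search(sorted_names, target):
    """Binary search on an alphabetically sorted list; return index or -1."""
    index = bisect_left(sorted_names, target)
    if index < len(sorted_names) and sorted_names[index] == target:
        return index
    return -1

# Binary search only works on sorted data, so we sort a copy first.
sorted_array = sorted(my_array)

print("Linear search finds 'Bob' at index:", linear_search(my_array, 'Bob'))
print("Binary search finds 'Bob' at index:", binary_search(sorted_array, 'Bob'))
```

Both approaches still compare the target against stored names, which is the work a hash function lets us avoid.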
STEP 2: STORING NAMES USING A
HASH FUNCTION
 Now comes the special way we interact with the Hash Set we
are making.
 We want to store a name directly into its right place in the
array, and this is where the hash function comes in.
 A hash function can be made in many ways; it is up to the
creator of the Hash Table. A common way is to convert the
value into a number that equals one of the Hash Set's index
numbers, in this case a number from 0 to 9. In our example
we will take the Unicode code point of each character, sum
them, and do a modulo 10 operation to get index numbers
0-9.
def hash_function(value):
    sum_of_chars = 0
    for char in value:
        sum_of_chars += ord(char)

    return sum_of_chars % 10

print("'Bob' has hash code:",hash_function('Bob'))

• The character "B" has Unicode code point 66, "o" has 111, and "b" has 98.
Adding those together we get 275. Modulo 10 of 275 is 5, so "Bob" should be
stored as an array element at index 5.

• The number returned by the hash function is called the hash code.
 After storing "Bob" where the hash code tells us (index
5), our array now looks like this:
my_hash_set = [None,None,None,None,None,'Bob',None,None,None,None]
 We can use the hash function to find out where to store
the other names "Pete", "Jones", "Lisa", and "Siri" as
well.
 After using the hash function to store those names in the
correct position, our array looks like this:
my_hash_set = [None,'Jones',None,'Lisa',None,'Bob',None,'Siri','Pete',None]
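The array above can be reproduced with a short loop. This is a minimal sketch that reuses the hash_function from this step and assumes none of the five names collide, which holds for this particular data.

```python
def hash_function(value):
    # Sum the Unicode code points of all characters, then take modulo 10.
    sum_of_chars = 0
    for char in value:
        sum_of_chars += ord(char)
    return sum_of_chars % 10

# Ten empty buckets, as in Step 1.
my_hash_set = [None] * 10

for name in ['Pete', 'Jones', 'Lisa', 'Bob', 'Siri']:
    index = hash_function(name)
    my_hash_set[index] = name  # no collision handling yet

print(my_hash_set)
```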
STEP 3: LOOKING UP A NAME
USING A HASH FUNCTION
 We have now established a very basic Hash Set: we do
not have to check the array element by element anymore
to find out if "Pete" is in there. We can just use the hash
function to go straight to the right element!
 To find out if "Pete" is stored in the array, we give the
name "Pete" to our hash function, we get back hash
code 8, we go directly to the element at index 8, and
there he is. We found "Pete" without checking any other
elements.
my_hash_set = [None,'Jones',None,'Lisa',None,'Bob',None,'Siri','Pete',None]

def hash_function(value):
    sum_of_chars = 0
    for char in value:
        sum_of_chars += ord(char)

    return sum_of_chars % 10

def contains(name):
    index = hash_function(name)
    return my_hash_set[index] == name

print("'Pete' is in the Hash Set:",contains('Pete'))
When deleting a name from our Hash Set, we can also use the hash function to go straight to
where the name is, and set that element value to None.
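Deletion as described here could look like the sketch below. The function name remove and its True/False return value are assumptions made for illustration; the slides do not show this code.

```python
my_hash_set = [None, 'Jones', None, 'Lisa', None, 'Bob', None, 'Siri', 'Pete', None]

def hash_function(value):
    sum_of_chars = 0
    for char in value:
        sum_of_chars += ord(char)
    return sum_of_chars % 10

def remove(name):
    """Clear the bucket for name if it holds that name; report success."""
    index = hash_function(name)
    if my_hash_set[index] == name:
        my_hash_set[index] = None
        return True
    return False  # the name was not stored in its bucket

print("Removed 'Bob':", remove('Bob'))
print(my_hash_set)
```

Checking the stored value before clearing the bucket avoids deleting a different name that happens to share the same hash code.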
STEP 4: HANDLING COLLISIONS
 Let's also add "Stuart" to our Hash Set.
 We give "Stuart" to our hash function, and we get the hash code 3,
meaning "Stuart" should be stored at index 3.
 Trying to store "Stuart" creates what is called a collision, because "Lisa"
is already stored at index 3.
 To fix the collision, we can make room for more elements in the same
bucket, and solving the collision problem in this way is called chaining.
We can give room for more elements in the same bucket by
implementing each bucket as a linked list, or as an array.
 After implementing each bucket as an array, to give room for potentially
more than one name in each bucket, "Stuart" can also be stored at
index 3, and our Hash Set now looks like this:
my_hash_set = [
    [None],
    ['Jones'],
    [None],
    ['Lisa', 'Stuart'],
    [None],
    ['Bob'],
    [None],
    ['Siri'],
    ['Pete'],
    [None]
]
 Searching for "Stuart" in our Hash Set now means that using the hash
function we end up directly in bucket 3, but then we must first check
"Lisa" in that bucket, before we find "Stuart" as the second element in
bucket 3.
STEP 5: HASH SET CODE EXAMPLE
AND SIMULATION
 To complete our very basic Hash Set code, let's have
functions for adding and searching for names in the Hash
Set, which is now a two-dimensional array.
 Run the code example below, and try it with different
values to get a better understanding of how a Hash Set
works.
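The code example for this step did not survive in the text of these slides, so what follows is a minimal sketch of what such a Hash Set with chaining might look like, not the original code. One assumption differs from Step 4: empty buckets are represented as empty lists rather than [None], which keeps the add and contains functions simpler.

```python
# Ten buckets, each one a list so it can hold several names (chaining).
my_hash_set = [[] for _ in range(10)]

def hash_function(value):
    # Sum of Unicode code points, modulo the number of buckets.
    return sum(ord(char) for char in value) % 10

def add(value):
    """Append value to its bucket unless it is already there."""
    bucket = my_hash_set[hash_function(value)]
    if value not in bucket:
        bucket.append(value)

def contains(value):
    """Search only the one bucket the hash code points to."""
    return value in my_hash_set[hash_function(value)]

for name in ['Jones', 'Lisa', 'Bob', 'Siri', 'Pete', 'Stuart']:
    add(name)

print(my_hash_set)
print("Contains Stuart:", contains('Stuart'))
```

"Lisa" and "Stuart" both have hash code 3, so they end up side by side in bucket 3, and a lookup for "Stuart" only has to scan that one bucket.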
HASH TABLES SUMMARIZED
 Hash Table elements are stored in storage containers called buckets.
 Every Hash Table element has a part that is unique that is called the key.
 A hash function takes the key of an element to generate a hash code.
 The hash code says what bucket the element belongs to, so now we can go
directly to that Hash Table element: to modify it, or to delete it, or just to check
if it exists. Specific hash functions are explained in detail on the next two pages.
 A collision happens when two Hash Table elements have the same hash code,
because that means they belong to the same bucket. A collision can be solved
in two ways.
 Chaining is the way collisions are solved in this tutorial, by using arrays or
linked lists to allow more than one element in the same bucket.
 Open Addressing is another way to solve collisions. With open addressing, if we
want to store an element but there is already an element in that bucket, the
element is stored in the next available bucket. This can be done in many
different ways, but we will not explain open addressing any further here.
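To give a rough idea of open addressing, here is a sketch of one simple variant, linear probing, where "the next available bucket" means the next index in the array, wrapping around at the end. The names table, SIZE, add, and contains are chosen for this illustration, and deletion is left out on purpose because it needs extra bookkeeping.

```python
SIZE = 10
table = [None] * SIZE  # one name per bucket, no lists

def hash_function(value):
    return sum(ord(char) for char in value) % SIZE

def add(value):
    """Store value in the first free bucket at or after its hash code."""
    start = hash_function(value)
    for step in range(SIZE):
        index = (start + step) % SIZE
        if table[index] is None:
            table[index] = value
            return index
        if table[index] == value:
            return index  # already stored
    raise RuntimeError("hash table is full")

def contains(value):
    """Probe from the hash code until value or an empty bucket is found."""
    start = hash_function(value)
    for step in range(SIZE):
        index = (start + step) % SIZE
        if table[index] is None:
            return False
        if table[index] == value:
            return True
    return False

print("'Lisa' stored at index:", add('Lisa'))
print("'Stuart' stored at index:", add('Stuart'))
print("Contains Stuart:", contains('Stuart'))
```

Here "Stuart" collides with "Lisa" at index 3 and is placed in the next free bucket instead of sharing bucket 3.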
HASH SETS
 A Hash Set is a form of Hash Table data structure that
usually holds a large number of elements.
 Using a Hash Set we can search, add, and remove
elements really fast.
 Hash Sets are used for lookup, to check if an element is
part of a set.
 A Hash Set stores unique elements in buckets according to the
element's hash code.
 Hash code: A number generated from an element's unique value
(key), to determine what bucket that Hash Set element belongs to.
 Unique elements: A Hash Set cannot have more than one element
with the same value.
 Bucket: A Hash Set consists of many such buckets, or containers,
to store elements. If two elements have the same hash code, they
belong to the same bucket. The buckets are therefore often
implemented as arrays or linked lists, because a bucket needs to
be able to hold more than one element.
FINDING THE HASH CODE
 A hash code is generated by a hash function.
 The hash function in this example sums the Unicode code points of
the characters in a name, then does a modulo 10 operation (% 10)
on that sum to get the hash code as a number from 0 to 9.
 This means that a name is put into one of ten possible
buckets in the Hash Set, according to the hash code of that
name. The same hash code is generated and used when we
want to search for or remove a name from the Hash Set.
 The Hash Code gives us instant access as long as there is
just one name in the corresponding bucket.
DIRECT ACCESS IN HASH SETS
 Searching for "Peter" in the Hash Set means that the hash code 2 is generated
(the Unicode code points of "Peter" sum to 512, and 512 % 10 is 2), and that directs
us right to the bucket "Peter" is in. If that is the only name in that bucket, we will
find "Peter" right away.
 In cases like this we say that the Hash Set has constant time O(1) for searching,
adding, and removing elements, which is really fast.
 But, if we search for Jens, we need to search through the other names in that bucket
before we find Jens. In a worst case scenario, all names end up in the same bucket,
and the name we are searching for is the last one. In such a worst case scenario the
Hash Set has time complexity O(n) which is the same time complexity as arrays and
linked lists.
 To keep Hash Sets fast, it is therefore important to have a hash function that will
distribute the elements evenly between the buckets, and to have around as many
buckets as Hash Set elements.
 Having a lot more buckets than Hash Set elements is a waste of memory, and having
far fewer buckets than Hash Set elements is a waste of time.
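The difference between the O(1) case and the O(n) worst case can be made concrete by counting comparisons. The sketch below is only an illustration: build, comparisons_to_find, and the deliberately bad constant_hash are hypothetical helpers, and the names are the ones used earlier in these slides.

```python
def sum_hash(value):
    # The hash function used throughout these slides (before modulo).
    return sum(ord(char) for char in value)

def constant_hash(value):
    # Deliberately bad: every name gets the same hash code.
    return 0

def build(names, hash_function, bucket_count=10):
    """Place each name in a bucket chosen by hash_function."""
    buckets = [[] for _ in range(bucket_count)]
    for name in names:
        buckets[hash_function(name) % bucket_count].append(name)
    return buckets

def comparisons_to_find(buckets, hash_function, name):
    """Count how many names are compared before name is found."""
    bucket = buckets[hash_function(name) % len(buckets)]
    return bucket.index(name) + 1

names = ['Pete', 'Jones', 'Lisa', 'Bob', 'Siri']
spread = build(names, sum_hash)
crowded = build(names, constant_hash)

print("Even spread:", comparisons_to_find(spread, sum_hash, 'Siri'))
print("One bucket: ", comparisons_to_find(crowded, constant_hash, 'Siri'))
```

With the even spread, "Siri" is found after one comparison. With everything in one bucket, all five names are compared, which is the same work as scanning a plain list.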
HASH SET IMPLEMENTATION
 Hash Sets in Python are typically done by using Python's own
set data type, but to get a better understanding of how Hash
Sets work we will not use that here.
 To implement a Hash Set in Python we create a class
SimpleHashSet.
 Inside the SimpleHashSet class we have a method __init__ to
initialize the Hash Set, a method hash_function for the hash
function, and methods for the basic Hash Set operations: add,
contains, and remove.
 We also create a method print_set to see what the Hash
Set looks like.
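The class itself is not included in the text of these slides. A minimal sketch with the methods named above might look like this, assuming chaining with list buckets, a default of 10 buckets, and the character-sum hash function from earlier.

```python
class SimpleHashSet:
    def __init__(self, size=10):
        self.size = size
        # One list per bucket so that colliding values can share a bucket.
        self.buckets = [[] for _ in range(size)]

    def hash_function(self, value):
        # Sum of Unicode code points, modulo the number of buckets.
        return sum(ord(char) for char in value) % self.size

    def add(self, value):
        bucket = self.buckets[self.hash_function(value)]
        if value not in bucket:  # keep elements unique
            bucket.append(value)

    def contains(self, value):
        return value in self.buckets[self.hash_function(value)]

    def remove(self, value):
        bucket = self.buckets[self.hash_function(value)]
        if value in bucket:
            bucket.remove(value)

    def print_set(self):
        for index, bucket in enumerate(self.buckets):
            print(f"Bucket {index}: {bucket}")

hash_set = SimpleHashSet()
for name in ['Pete', 'Jones', 'Lisa', 'Bob', 'Siri', 'Stuart']:
    hash_set.add(name)
hash_set.remove('Lisa')
hash_set.print_set()
print("Contains Stuart:", hash_set.contains('Stuart'))
```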
HASH MAPS
 A Hash Map is a form of Hash Table data structure that
usually holds a large number of entries.
 Using a Hash Map we can search, add, modify, and remove
entries really fast.
 Hash Maps are used to find detailed information about
something.
 In the simulation below, people are stored in a Hash Map.
A person can be looked up using a person's unique social
security number (the Hash Map key), and then we can see
that person's name (the Hash Map value).
 It is easier to understand how Hash Maps work if you first have a look
at the two previous pages about Hash Tables and Hash Sets. It is also
important to understand the meaning of the words below.
• Entry: Consists of a key and a value, forming a key-value pair.
• Key: Unique for each entry in the Hash Map. Used to generate a hash code
determining the entry's bucket in the Hash Map. This ensures that every entry
can be efficiently located.
• Hash Code: A number generated from an entry's key, to determine what bucket
that Hash Map entry belongs to.
• Bucket: A Hash Map consists of many such buckets, or containers, to store
entries.
• Value: Can be nearly any kind of information, like name, birth date, and address
of a person. The value can be many different kinds of information combined.
FINDING THE HASH CODE
 A hash code is generated by a hash function.
 The hash function in the simulation above takes the digits in the social security
number (not the dash), adds them together, and does a modulo 10 operation (%
10) on the sum to get the hash code as a number from 0 to 9.
 This means that a person is stored in one of ten possible buckets in the Hash Map,
according to the hash code of that person's social security number. The same
hash code is generated and used when we want to search for or remove a person
from the Hash Map.
 The Hash Code gives us instant access as long as there is just one person in the
corresponding bucket.
 In the simulation above, Charlotte has social security number 123-4567. Adding
the numbers together gives us a sum 28, and modulo 10 of that is 8. That is why
she belongs to bucket 8.
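The calculation for Charlotte can be checked with a few lines of code. This is a small sketch of the digit-sum hash function described above; the use of str.isdigit to skip the dash is an implementation choice made here.

```python
def hash_function(key):
    # Add up the digits of the key, ignoring the dash, then take modulo 10.
    digit_sum = sum(int(char) for char in key if char.isdigit())
    return digit_sum % 10

print("Bucket for 123-4567:", hash_function('123-4567'))
```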
DIRECT ACCESS IN HASH MAPS
 Searching for Charlotte in the Hash Map, we must use the social security number 123-4567 (the
Hash Map key), which generates the hash code 8, as explained above.
 This means we can go straight to bucket 8 to get her name (the Hash Map value), without
searching through other entries in the Hash Map.
 In cases like this we say that the Hash Map has constant time O(1) for searching, adding, and
removing entries, which is really fast compared to using an array or a linked list.
 But, in a worst case scenario, all the people are stored in the same bucket, and if the person we
are trying to find is the last person in this bucket, we need to compare with all the other social
security numbers in that bucket before we find the person we are looking for.
 In such a worst case scenario the Hash Map has time complexity O(n), which is the same time
complexity as arrays and linked lists.
 To keep Hash Maps fast, it is therefore important to have a hash function that will distribute the
entries evenly between the buckets, and to have around as many buckets as Hash Map entries.
 Having a lot more buckets than Hash Map entries is a waste of memory, and having far fewer
buckets than Hash Map entries is a waste of time.
HASH MAP IMPLEMENTATION
 Hash Maps in Python are typically done by using Python's own
dictionary data type, but to get a better understanding of how
Hash Maps work we will not use that here.
 To implement a Hash Map in Python we create a class
SimpleHashMap.
 Inside the SimpleHashMap class we have a method __init__ to
initialize the Hash Map, a method hash_function for the hash
function, and methods for the basic Hash Map operations: put,
get, and remove.
 We also create a method print_map to see what the Hash
Map looks like.
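The class itself is not included in the text of these slides. A minimal sketch with the methods named above might look like this, assuming chaining with buckets that hold (key, value) tuples and the digit-sum hash function from the social security number example. The second entry in the usage lines ('123-4576', 'Thomas') is made up to show a collision.

```python
class SimpleHashMap:
    def __init__(self, size=10):
        self.size = size
        # Each bucket is a list of (key, value) pairs.
        self.buckets = [[] for _ in range(size)]

    def hash_function(self, key):
        # Sum the digits in the key (ignoring the dash), modulo bucket count.
        return sum(int(char) for char in key if char.isdigit()) % self.size

    def put(self, key, value):
        bucket = self.buckets[self.hash_function(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # key exists: update the value
                return
        bucket.append((key, value))

    def get(self, key):
        bucket = self.buckets[self.hash_function(key)]
        for existing_key, value in bucket:
            if existing_key == key:
                return value
        return None  # key not found

    def remove(self, key):
        bucket = self.buckets[self.hash_function(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                del bucket[i]
                return

    def print_map(self):
        for index, bucket in enumerate(self.buckets):
            print(f"Bucket {index}: {bucket}")

hash_map = SimpleHashMap()
hash_map.put('123-4567', 'Charlotte')
hash_map.put('123-4576', 'Thomas')  # hypothetical entry, same digit sum
hash_map.print_map()
print("Name for 123-4567:", hash_map.get('123-4567'))
```

Both keys have digit sum 28, so both entries land in bucket 8, and get compares keys within that bucket to return the right name.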