Data Structures

The document discusses different data structures including hashing techniques like separate chaining and open addressing using linear probing and quadratic probing to handle collisions. It also discusses disjoint sets and how union-find algorithms are used to perform operations like find and union on disjoint sets.

Uploaded by

VIJAY V STUDENT -CSE DATASCIENCE

DATA STRUCTURES NOTES FOR II SEM B.Tech – ‘B’ SECTION

DATA STRUCTURES
(this is an additional material and not exhaustive study material)

Indexing: Hashing - Hash Functions – Separate Chaining – Open Addressing: Linear Probing- Quadratic
Probing- Double Hashing- Rehashing – Extendible Hashing.
Disjoint Sets: Basic data structure - Smart Union Algorithms - Path Compression.

HASHING

A hash table is a data structure that stores data in an associative manner. In a hash table, data is stored in
an array, where each data value has its own index. Access becomes very fast once the index of the desired
data is known.

Thus, insertion and search operations are very fast on average, largely irrespective of the size of the data.
A hash table uses an array as its storage medium and a hash technique to generate the index at which an
element is to be inserted or located.

In hashing, large keys are converted into small keys by using hash functions. The values are then stored in a
data structure called a hash table. The idea of hashing is to distribute entries (key/value pairs) uniformly across
an array. Each element is assigned a key (converted key). Using that key, the element can be accessed
in O(1) time on average. From the key, the algorithm (hash function) computes an index that suggests where
an entry can be found or inserted.

Hashing is implemented in two steps:

1. An element is converted into an integer by using a hash function. This integer can be used as an index
at which to store the original element in the hash table.
2. The element is stored in the hash table, where it can be quickly retrieved using the hashed key.

hash = hashfunc(key)

index = hash % array_size

In this method, the hash is independent of the array size and it is then reduced to an index (a number between
0 and array_size − 1) by using the modulo operator (%).
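The two steps above can be sketched in Python as follows. This is only an illustration: the polynomial string hash used for hashfunc and the helper name index_for are assumptions made for the example, not part of these notes.

```python
def hashfunc(key: str) -> int:
    """Toy polynomial string hash (an assumption; any integer hash would do)."""
    h = 0
    for ch in key:
        h = h * 31 + ord(ch)
    return h


def index_for(key: str, array_size: int) -> int:
    """Hypothetical helper: map a key to a slot in 0 .. array_size - 1."""
    hash_value = hashfunc(key)       # step 1: hash is independent of the table size
    return hash_value % array_size   # step 2: reduce the hash to a valid index


# "ab" hashes to 97 * 31 + 98 = 3105, and 3105 % 10 = 5
print(index_for("ab", 10))
```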

Hash function

A hash function is any function that can be used to map a data set of an arbitrary size to a data set of a fixed
size, which falls into the hash table. The values returned by a hash function are called hash values, hash codes,
hash sums, or simply hashes.

To achieve a good hashing mechanism, it is important to have a good hash function with the following basic
requirements:

1. Easy to compute: It should be easy to compute and must not become an algorithm in itself.
2. Uniform distribution: It should provide a uniform distribution across the hash table and should not result
in clustering.
3. Fewer collisions: Collisions occur when pairs of elements are mapped to the same hash value. These
should be minimized.

Note: Irrespective of how good a hash function is, collisions are bound to occur. Therefore, to maintain
the performance of a hash table, it is important to manage collisions through various collision resolution
techniques.

Five popular hash functions are as follows:

Division Method: An integer key x is divided by the table size m and the remainder is taken as the hash
value. It can be defined as
H(x) = x % m + 1
The +1 shifts the result into the range 1 to m. For example, for x = 42 and m = 13, H(42) = 42 % 13 + 1 = 3 + 1 = 4.
Midsquare Method: A key is multiplied by itself and the hash value is obtained by selecting an appropriate
number of digits from the middle of the square. The same positions in the square must be used for all keys.
For example, if the key is 12345, the square of this key is 152399025. If 2-digit addresses are required, then
the 4th and 5th digits can be chosen, giving address 39.
Folding Method: A key is broken into several parts. Each part has the same length as that of the required
address, except possibly the last part. The parts are added together and, ignoring the final carry, the sum is
the hash address for key K.
Multiplicative Method: In this method a real number c such that 0 < c < 1 is
selected. For a nonnegative integer key x, the hash function is defined as
H(x) = [m(cx % 1)] + 1
Here, cx % 1 is the fractional part of cx and [ ] denotes the greatest integer less than or equal to its contents.
Digit Analysis: This method forms addresses by selecting and shifting digits of the original key. For a given
key set, the same positions in the key and the same rearrangement pattern must be used. For example, the key
7654321 is transformed to the address 1247 by selecting the digits in positions 1, 2, 4 and 7 (counted from
the right), which appear in the key as 7, 4, 2, 1, and then reversing their order.
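Four of the methods above can be sketched in Python as shown below. The function names and parameters (start, length, part_len) are assumptions chosen for illustration, and the folding example key 123456789 is made up for this sketch, not taken from the notes.

```python
import math


def division_hash(x: int, m: int) -> int:
    """Division method: H(x) = x % m + 1, giving an address in 1 .. m."""
    return x % m + 1


def midsquare_hash(x: int, start: int, length: int) -> int:
    """Midsquare method: take `length` digits of x*x from 1-based position `start`."""
    digits = str(x * x)
    return int(digits[start - 1:start - 1 + length])


def folding_hash(x: int, part_len: int) -> int:
    """Folding method: split the key into parts, add them, drop the final carry."""
    digits = str(x)
    parts = [int(digits[i:i + part_len]) for i in range(0, len(digits), part_len)]
    return sum(parts) % (10 ** part_len)


def multiplicative_hash(x: int, m: int, c: float) -> int:
    """Multiplicative method: H(x) = floor(m * frac(c * x)) + 1, with 0 < c < 1."""
    frac = (c * x) % 1.0
    return math.floor(m * frac) + 1


print(division_hash(42, 13))             # 42 % 13 + 1 = 4
print(midsquare_hash(12345, 4, 2))       # 152399025 -> digits 4 and 5 -> 39
print(folding_hash(123456789, 3))        # 123 + 456 + 789 = 1368 -> 368
print(multiplicative_hash(3, 10, 0.5))   # frac(1.5) = 0.5 -> floor(5.0) + 1 = 6
```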

Collision resolution techniques

Separate chaining (open hashing)

Separate chaining is one of the most commonly used collision resolution techniques. It is usually implemented
using linked lists. In separate chaining, each element of the hash table is a linked list. To store an element in
the hash table you must insert it into a specific linked list. If there is a collision (i.e. two different elements
have the same hash value), both elements are stored in the same linked list.

The cost of a lookup is that of scanning the entries of the selected linked list for the required key. If the
distribution of the keys is sufficiently uniform, then the average cost of a lookup depends only on the average
number of keys per linked list. For this reason, chained hash tables remain effective even when the number of
table entries (N) is much higher than the number of slots.

For separate chaining, the worst-case scenario is when all the entries are inserted into the same linked list. The
lookup procedure may have to scan all its entries, so the worst-case cost is proportional to the number (N) of
entries in the table.
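A minimal sketch of separate chaining is given below. It is an illustration under stated assumptions: Python lists stand in for the linked lists, the class name ChainedHashTable and its methods are hypothetical, and Python's built-in hash() plays the role of the hash function.

```python
class ChainedHashTable:
    """Separate chaining: each slot holds a chain of (key, value) pairs."""

    def __init__(self, num_slots: int = 8) -> None:
        # One chain per slot; a Python list is used in place of a linked list.
        self.slots = [[] for _ in range(num_slots)]

    def _chain(self, key):
        # Select the chain for this key from its hash value.
        return self.slots[hash(key) % len(self.slots)]

    def insert(self, key, value) -> None:
        chain = self._chain(key)
        for i, (k, _) in enumerate(chain):
            if k == key:              # key already present: overwrite its value
                chain[i] = (key, value)
                return
        chain.append((key, value))    # new key (possibly a collision): extend the chain

    def lookup(self, key):
        # Cost is a scan of one chain; worst case, every entry is in this chain.
        for k, v in self._chain(key):
            if k == key:
                return v
        return None


table = ChainedHashTable(num_slots=4)
table.insert(1, "one")
table.insert(5, "five")       # 1 and 5 both map to slot 1 and share a chain
print(table.lookup(5))        # five
print(len(table.slots[1]))    # 2
```

Small integer keys are used in the example because Python hashes them to themselves, which makes the collision easy to follow.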

Linear probing (open addressing or closed hashing)

In open addressing, all entry records are stored in the array itself instead of in linked lists. When a new entry
has to be inserted, the hash index of the hashed value is computed and then the array is examined (starting with
the hashed index). If the slot at the hashed index is unoccupied, the entry record is inserted there;
otherwise the search proceeds along some probe sequence until it finds an unoccupied slot.

The probe sequence is the sequence that is followed while traversing through entries. In different probe
sequences, you can have different intervals between successive entry slots or probes.

When searching for an entry, the array is scanned in the same sequence until either the target element is found
or an unused slot is found. An unused slot indicates that there is no such key in the table. The name "open
addressing" refers to the fact that the location or address of the item is not determined solely by its hash value.

Linear probing is when the interval between successive probes is fixed (usually at 1). Assume that the
hashed index for a particular entry is index. The probe sequence for linear probing is then:

index % hashTableSize

(index + 1) % hashTableSize
(index + 2) % hashTableSize
(index + 3) % hashTableSize

and so on…
- Array-based implementation.
- All elements stored in hash table itself.
- When collisions occur, use a systematic (consistent) procedure to store elements in free
slots of the table.
- Three Types of Open Addressing
(i) Linear probing (linear search)
- When collision occurs, scan down the array one cell at a time looking for an empty cell
- hi(X) = (Hash(X) + i) mod TableSize (i = 0, 1, 2, …)
- Compute hash value and increment it until a free cell is found
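Linear probing can be sketched as below, under a few assumptions: keys are non-negative integers, Hash(X) is taken as X mod TableSize, None marks an empty cell, and deletion is left out. The names lp_insert and lp_search are hypothetical.

```python
def lp_insert(table: list, key: int) -> int:
    """Insert key using h_i(X) = (Hash(X) + i) mod TableSize; return the slot used."""
    size = len(table)
    for i in range(size):
        idx = (key % size + i) % size
        if table[idx] is None or table[idx] == key:
            table[idx] = key
            return idx
    raise RuntimeError("hash table is full")


def lp_search(table: list, key: int) -> int:
    """Return the slot holding key, or -1 if an empty cell is reached first."""
    size = len(table)
    for i in range(size):
        idx = (key % size + i) % size
        if table[idx] is None:     # unused slot: the key cannot be in the table
            return -1
        if table[idx] == key:
            return idx
    return -1


table = [None] * 10
print([lp_insert(table, k) for k in (89, 18, 49, 58, 69)])   # [9, 8, 0, 1, 2]
print(lp_search(table, 58))                                  # 1
print(lp_search(table, 99))                                  # -1
```

In the example, 49, 58 and 69 all collide and end up in consecutive cells 0, 1 and 2, which shows the clustering that linear probing tends to produce.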

(ii) Quadratic probing (nonlinear search)

- Spread out the search for an empty slot: increment by i^2 instead of i
- hi(X) = (Hash(X) + i^2) % TableSize
- h0(X) = Hash(X) % TableSize
- h1(X) = (Hash(X) + 1) % TableSize
- h2(X) = (Hash(X) + 4) % TableSize
- h3(X) = (Hash(X) + 9) % TableSize
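The quadratic probe sequence can be sketched as follows. This assumes non-negative integer keys, Hash(X) taken as X mod TableSize, and None for an empty cell; the function names are hypothetical.

```python
def quadratic_probe_sequence(hash_value: int, table_size: int, max_probes: int) -> list:
    """Slots visited by h_i = (hash + i*i) % table_size for i = 0 .. max_probes - 1."""
    return [(hash_value + i * i) % table_size for i in range(max_probes)]


def qp_insert(table: list, key: int) -> int:
    """Insert key with quadratic probing; return the slot used."""
    size = len(table)
    for i in range(size):
        idx = (key % size + i * i) % size
        if table[idx] is None or table[idx] == key:
            table[idx] = key
            return idx
    # Quadratic probing does not visit every slot, so this can happen
    # even when the table still has free cells.
    raise RuntimeError("no free slot found along the probe sequence")


print(quadratic_probe_sequence(7, 10, 4))                    # [7, 8, 1, 6]
table = [None] * 10
print([qp_insert(table, k) for k in (89, 18, 49, 58, 69)])   # [9, 8, 0, 2, 3]
```

Because the probe sequence skips slots, an insertion is only guaranteed to succeed when the table size is prime and the table is at most half full.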

(iii) Double hashing (uses two hash functions)

- Apply primary hash function
- If collision occurs then spread out the search for an empty slot by using a second hash function
- Example :
Primary Hash Function Hash1(X) = X mod TableSize
Secondary Hash Function Hash2(X) = R – (X mod R)
where R is a prime smaller than TableSize
The i-th probe is hi(X) = (Hash1(X) + i * Hash2(X)) % TableSize (i = 0, 1, 2, …)
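Double hashing can be sketched as below. The sketch assumes non-negative integer keys and takes the primary hash as the key modulo the table size (the usual choice), with the secondary hash R – (X mod R) setting the step between probes. The function name dh_insert and the use of None for an empty cell are assumptions.

```python
def dh_insert(table: list, key: int, r: int) -> int:
    """Insert key using probes (h1 + i * h2) % TableSize; return the slot used.

    r is assumed to be a prime smaller than len(table).
    """
    size = len(table)
    h1 = key % size          # primary hash: first slot to try
    h2 = r - (key % r)       # secondary hash: step size in 1 .. r, never 0
    for i in range(size):
        idx = (h1 + i * h2) % size
        if table[idx] is None or table[idx] == key:
            table[idx] = key
            return idx
    raise RuntimeError("no free slot found along the probe sequence")


table = [None] * 10
print([dh_insert(table, k, r=7) for k in (89, 18, 49, 58, 69)])   # [9, 8, 6, 3, 0]
```

In the example, 49 collides with 89 at slot 9 and steps by Hash2(49) = 7 to slot 6, while 69 collides at the same slot but steps by Hash2(69) = 1 to slot 0, so keys that collide initially still follow different probe sequences.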

Disjoint Sets

A disjoint-set data structure is a data structure that keeps track of a set of elements partitioned into a number
of disjoint (non-overlapping) subsets. A union-find algorithm is an algorithm that performs two useful
operations on such a data structure:

Find: Determine which subset a particular element is in. This can be used for determining if two elements
are in the same subset.
Union: Join two subsets into a single subset.
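The two operations can be sketched as follows. This is a basic version without any balancing or compression, and it assumes the subsets are stored in a parent array in which -1 marks the representative (root) of a subset, the convention these notes use for parent[]. The function names are illustrative.

```python
def find(parent: list, x: int) -> int:
    """Return the representative of the subset containing x."""
    while parent[x] != -1:    # follow parent links up to the root
        x = parent[x]
    return x


def union(parent: list, a: int, b: int) -> None:
    """Join the subsets containing a and b (no effect if already joined)."""
    root_a, root_b = find(parent, a), find(parent, b)
    if root_a != root_b:
        parent[root_a] = root_b   # attach one root under the other


parent = [-1] * 5                 # five singleton subsets: {0} {1} {2} {3} {4}
union(parent, 0, 1)
union(parent, 3, 4)
print(find(parent, 0) == find(parent, 1))   # True: 0 and 1 share a subset
print(find(parent, 1) == find(parent, 3))   # False: different subsets
```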

One application of the disjoint-set data structure is checking whether a given graph contains a cycle.
The union-find algorithm can be used to check whether an undirected graph contains a cycle. This method
assumes that the graph doesn't contain any self-loops.
To keep track of the subsets, an array is used; call it parent[].
Let us consider a graph with three vertices 0, 1 and 2 and edges 0-1, 1-2 and 0-2:

For each edge, make subsets using both the vertices of the edge. If both the vertices are in the same subset, a
cycle is found.
Initially, all slots of the parent array are initialized to -1 (meaning that every subset contains only one item).
vertex:  0  1  2
parent: -1 -1 -1
Now process all edges one by one.
Edge 0-1: Find the subsets in which vertices 0 and 1 are. Since they are in different subsets, we take the
union of them. For taking the union, either make node 0 as parent of node 1 or vice-versa.

vertex:  0  1  2    1 is made parent of 0 (1 is now the representative of subset {0,1})
parent:  1 -1 -1

Edge 1-2: 1 is in subset 1 and 2 is in subset 2. So, take union.

vertex:  0  1  2    2 is made parent of 1 (2 is now the representative of subset {0,1,2})
parent:  1  2 -1

Edge 0-2: 0 is in subset 2 and 2 is also in subset 2. Hence, including this edge forms a cycle.

0->1->2 // 1 is parent of 0 and 2 is parent of 1
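The walkthrough above can be reproduced with a short sketch. It assumes the graph is given as a list of (u, v) edges over vertices 0 .. n-1 with no self-loops; the function names find and has_cycle are illustrative.

```python
def find(parent: list, x: int) -> int:
    """Follow parent links until a root (marked by -1) is reached."""
    while parent[x] != -1:
        x = parent[x]
    return x


def has_cycle(num_vertices: int, edges: list) -> bool:
    """Return True if the undirected graph contains a cycle."""
    parent = [-1] * num_vertices          # every vertex starts in its own subset
    for u, v in edges:
        root_u, root_v = find(parent, u), find(parent, v)
        if root_u == root_v:              # both ends already connected: cycle
            return True
        parent[root_u] = root_v           # union: root of v's subset becomes parent
    return False


# Edge 0-1 gives parent = [1, -1, -1], edge 1-2 gives [1, 2, -1],
# and edge 0-2 finds both ends in the subset represented by 2.
print(has_cycle(3, [(0, 1), (1, 2), (0, 2)]))   # True
print(has_cycle(3, [(0, 1), (1, 2)]))           # False
```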
