
PSNA COLLEGE OF ENGINEERING AND TECHNOLOGY, DINDIGUL – 624622.

(An Autonomous Institution Affiliated to Anna University, Chennai)


CB2O12 – DATA STRUCTURES AND ALGORITHMS
UNIT 5 – SEARCHING
LINEAR SEARCH:
 Linear search is a simple searching algorithm used to find an element in a list or
array.
 It works by checking each element of the list one by one until the desired element is
found or the list ends.
Algorithm:
Step 1: Start from the first element of the array.
Step 2: Compare the current element with the target element.
Step 3: If the current element matches the target, return its index (position).
Step 4: If not, move to the next element.
Step 5: Repeat steps 2–4 until the end of the array.
Step 6: If the element is not found, return -1 (or indicate that the element is absent).
Implementation:

Prepared by M.Kamarajan, AP/CSE, PSNACET


Example:
def linear_search(arr, target):
    for i in range(len(arr)):
        if arr[i] == target:
            return i
    return -1

# Example usage
arr = [10, 20, 30, 40, 50]
target = 30
result = linear_search(arr, target)
if result != -1:
    print(f"Element found at index {result}")
else:
    print("Element not found")
Output:
Element found at index 2
Time Complexity of Linear Search:
 Best case: O (1)
 Average case: O (n)
 Worst-case: O (n)
Space Complexity of Linear Search: O(1)
Advantages:
 Simple & easy to implement.
 Works on any data structure.
 No extra memory required.
 Best for small data sets.
Disadvantages:
 Slow for large data sets.
 Not optimized for sorted data.
 High time complexity.
BINARY SEARCH:
 Binary Search is an efficient searching algorithm used to find an element in a
sorted array.
 It follows the divide-and-conquer approach by repeatedly dividing the search range
in half.
Algorithm:
Step 1: Set low = 0 and high = length of array - 1.
Step 2: Repeat while low is less than or equal to high:
 Find the middle index: mid = (low + high) // 2.
 If arr[mid] == target, return mid (element found).
 If arr[mid] < target, update low = mid + 1 (search in the right half).
 If arr[mid] > target, update high = mid - 1 (search in the left half).
Step 3: If the element is not found, return -1.
Implementation:



Example:
def binary_search(arr, target):
    low, high = 0, len(arr) - 1
    while low <= high:
        mid = (low + high) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return -1
# Example usage
arr = [10, 20, 30, 40, 50]
target = 30
result = binary_search(arr, target)
if result != -1:
    print(f"Element found at index {result}")
else:
    print("Element not found")
Output:
Element found at index 2
Time Complexity of Binary Search:
 Best case: O (1)
 Average case: O (log n)
 Worst-case: O (log n)
Space Complexity of Binary Search: O(1)
Advantages:
 Faster than linear search.
 Efficient for large datasets.
 Fewer comparisons.
 Works on various data types.
Disadvantages:
 Only works on sorted data.
 More complex implementation.
 Not suitable for small lists.
 Doesn't work well on dynamic data.
HASHING:
 Hashing is a technique to convert a range of key values into a range of indexes of an array.
 The idea is to use a hash function that converts a given phone number or any other key to a smaller number, and to use that small number as an index in a table called a hash table.
 Hashing can be used in almost all such situations and performs extremely well compared to other data structures such as arrays, linked lists and AVL trees.
 With hashing we get O(1) search time on average and O(n) in the worst case.
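The average-case O(1) lookup can be seen with Python's built-in dict, which is itself a hash table (a quick illustration added here; the part numbers and names are hypothetical, not from the notes):

```python
# Python's built-in dict is a hash table: insert and lookup are O(1) on average
parts = {}
parts[8421002] = "bolt"    # key -> part name (illustrative values)
parts[4957397] = "shaft"

print(parts[8421002])      # direct lookup by key, no scanning of all entries
```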
Example:
 Suppose a manufacturing company has an inventory file that consists of fewer than 1000 parts.
 Each part has a unique 7-digit number.
 This number is called the key, and the record for that key contains the part name.
 Since there are fewer than 1000 parts, a 1000-element array can be used to store the complete file.
 Such an array is indexed from 0 to 999.
 Since the key is 7 digits, it is converted to 3 digits by taking only the last three digits of the key.
 This is shown in the figure below.



 Observe in the figure above that the first key is 4967000, and it is stored at position 0.
 The second key is 8421002.
 Its last three digits, 002, give position 2 in the array.
 Let us search for the element 4957397.
 Naturally, it will be found at position 397.
 This method of searching is called hashing.



 The function that converts the key (7 digits) into an array position is called the hash function.
 Here the hash function is
h(key) = key % 1000
 where key % 1000 gives the array position, and the value obtained from the hash function is called the hash key.
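This mapping can be checked directly; taking key % 1000 reproduces the positions quoted for the sample keys:

```python
def h(key):
    # hash function from the notes: keep only the last three digits of the key
    return key % 1000

print(h(8421002))   # 2
print(h(4957397))   # 397
```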
HASH FUNCTIONS:
 A hash function maps a big number to a small integer that can be used as an index in a hash table.
 The hash function is used to place data in the hash table.
 Similarly, the hash function is used to retrieve data from the hash table.
 Thus, the hash function is used to implement the hash table.
Hash Table:
 Hash table is a data structure used for storing and retrieving data quickly.
 Every entry in the hash table is made using hash function.
Example:
 Consider the hash function key mod 5 and a hash table of size 5.
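A small sketch of this table, with hypothetical keys (the original figure is not reproduced here), assuming each entry is placed at index key % 5:

```python
TABLE_SIZE = 5
table = [None] * TABLE_SIZE

# hypothetical keys; each one maps to a distinct bucket via key % 5
for key in [10, 21, 37, 43, 54]:
    table[key % TABLE_SIZE] = key

print(table)   # [10, 21, 37, 43, 54]
```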

Bucket:
 The hash function h(key) is used to map several dictionary entries into the hash
table.
 Each position of the hash table is called bucket.
Collision:
 The situation where a newly inserted key maps to an already occupied slot in hash
table is called collision.
 A collision is a situation in which the hash function returns the same address for more than one record.



Collision Handling Techniques:
1. Separate Chaining
2. Open Addressing
a. Linear Probing
b. Quadratic Probing
c. Double Hashing
3. Rehashing
4. Extendible Hashing
1. SEPARATE CHAINING:
 The idea is to make each cell of the hash table point to a linked list of records that have the same hash function value.
 Let us consider a simple hash function as “key mod 7” and sequence of keys as
50, 700, 76, 85, 92, 73, 101.
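The chains for these keys can be sketched with Python lists standing in for linked lists (keys and table size taken from the example above):

```python
TABLE_SIZE = 7
table = [[] for _ in range(TABLE_SIZE)]   # each bucket holds a chain

for key in [50, 700, 76, 85, 92, 73, 101]:
    table[key % TABLE_SIZE].append(key)   # colliding keys extend the same chain

for i, chain in enumerate(table):
    print(i, chain)   # e.g. bucket 1 chains 50, 85 and 92 together
```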

Advantages:
 Simple to implement.
 The hash table never fills up; we can always add more elements to a chain.
 Less sensitive to the hash function or load factors.
 It is mostly used when it is unknown how many and how frequently keys may be
inserted or deleted.
Disadvantages:
 Cache performance of chaining is not good as keys are stored using linked list.
Open addressing provides better cache performance as everything is stored in same
table.
 Wastage of Space (Some Parts of hash table are never used).
 If the chain becomes long, then search time can become O (n) in worst case.
 Uses extra space for links.
2. OPEN ADDRESSING:
 In open addressing, all elements are stored in the hash table itself. So at any
point, size of the table must be greater than or equal to the total number of keys.
 Insert (k): Keep probing until an empty slot is found. Once an empty slot is
found, insert k.
 Search (k): Keep probing until the slot’s key equals k (found) or an empty slot is reached (not found).
 Delete (k): Delete operation is interesting. If we simply delete a key, then search
may fail. So slots of deleted keys are marked specially as “deleted”.



 Insert can insert an item in a deleted slot, but the search doesn’t stop at a deleted
slot.
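The "deleted" marker can be sketched as follows (a minimal illustration with linear probing; the EMPTY and DELETED sentinel names are assumed here, not from the notes):

```python
TABLE_SIZE = 7
EMPTY, DELETED = None, "DELETED"          # sentinel values assumed for illustration
table = [EMPTY] * TABLE_SIZE

def insert(key):
    for i in range(TABLE_SIZE):
        index = (key % TABLE_SIZE + i) % TABLE_SIZE
        if table[index] is EMPTY or table[index] is DELETED:
            table[index] = key            # insert may reuse a deleted slot
            return index

def search(key):
    for i in range(TABLE_SIZE):
        index = (key % TABLE_SIZE + i) % TABLE_SIZE
        if table[index] is EMPTY:         # a never-used slot ends the search
            return -1
        if table[index] == key:           # deleted slots are skipped, not stopped at
            return index
    return -1

def delete(key):
    index = search(key)
    if index != -1:
        table[index] = DELETED            # mark, don't empty, so later probes continue

insert(50); insert(85)                    # both hash to 1; 85 probes on to slot 2
delete(50)
print(search(85))                         # prints 2: slot 1 is DELETED, not EMPTY, so probing continues
```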
a. LINEAR PROBING:
 In linear probing, we linearly probe for the next slot.
 The typical gap between two probes is 1, as in the example below.
 Let hash (x) be the slot index computed using hash function and S be the table
size.
 If slot hash (x) % S is full, then we try (hash (x) + 1) % S.
 If (hash (x) + 1) % S is also full, then we try (hash (x) + 2) % S.
 If (hash (x) + 2) % S is also full, then we try (hash (x) + 3) % S.
Example:
 Let us take hash function as “key mod 7” and sequence of keys as 50, 700, 76,
85, 92, 73, 101.

 Clustering: The main problem with linear probing is clustering: many consecutive elements form groups, and it starts taking time to find a free slot or to search for an element.



Program:

TABLE_SIZE = 7
hash_table = [0] * TABLE_SIZE

def insert():
    key = int(input("Enter a value to insert into the hash table: "))
    hkey = key % TABLE_SIZE
    for i in range(TABLE_SIZE):
        index = (hkey + i) % TABLE_SIZE
        if hash_table[index] == 0:
            hash_table[index] = key
            print(f"Element {key} inserted at index {index}")
            return
    print("\nElement cannot be inserted!")

def find():
    key = int(input("Enter an element to find: "))
    hkey = key % TABLE_SIZE
    for i in range(TABLE_SIZE):
        index = (hkey + i) % TABLE_SIZE
        if hash_table[index] == key:
            print(f"Value is found at index: {index}")
            return
    print("Value is not found!")

def display():
    print("\nElements in the hash table are: ")
    for i in range(TABLE_SIZE):
        print(f"At Index: {i} \t Value = {hash_table[i]}")
    print()

def main():
    while True:
        print("\nLINEAR PROBING")
        print("1. INSERT")
        print("2. DISPLAY")
        print("3. FIND")
        print("4. EXIT")
        opt = int(input("Enter the choice: "))
        if opt == 1:
            insert()
        elif opt == 2:
            display()
        elif opt == 3:
            find()
        elif opt == 4:
            print("Exiting the program.")
            break
        else:
            print("Invalid choice!")

if __name__ == "__main__":
    main()

Output:
LINEAR PROBING
1. INSERT
2. DISPLAY
3. FIND
4. EXIT
Enter the choice: 1
Enter a value to insert into hash table: 50
Enter the choice: 1
Enter a value to insert into hash table: 700
Enter the choice: 1
Enter a value to insert into hash table: 76
Enter the choice: 1
Enter a value to insert into hash table: 85
Enter the choice: 1
Enter a value to insert into hash table: 92
Enter the choice: 1
Enter a value to insert into hash table: 73
Enter the choice: 1
Enter a value to insert into hash table: 101
Enter the choice: 2
Elements in the hash table are:
At Index: 0  Value = 700
At Index: 1  Value = 50
At Index: 2  Value = 85
At Index: 3  Value = 92
At Index: 4  Value = 73
At Index: 5  Value = 101
At Index: 6  Value = 76
Enter the choice: 3
Enter an element to find: 73
Value is found at index: 4
Enter the choice: 4
b. QUADRATIC PROBING:
 We look for the i²-th slot in the ith iteration.
 Let hash (x) be the slot index computed using hash function.
 If slot hash (x) % S is full, then we try (hash (x) + 1*1) % S.
 If (hash (x) + 1*1) % S is also full, then we try (hash (x) + 2*2) % S.
 If (hash (x) + 2*2) % S is also full, then we try (hash (x) + 3*3) % S.
Example:
 Let us take hash function as “key mod 7” and sequence of keys as 50, 700, 76,
85, 92, 73, 101.

Program:

# Define the size of the hash table
TABLE_SIZE = 7
# Initialize the hash table with None values
h = [None] * TABLE_SIZE

def insert():
    # Get the value to insert
    key = int(input("Enter a value to insert into hash table: "))
    hkey = key % TABLE_SIZE
    # Try to insert the key using quadratic probing
    for i in range(TABLE_SIZE):
        index = (hkey + i * i) % TABLE_SIZE
        if h[index] is None:
            h[index] = key
            print(f"Element {key} inserted at index {index}")
            return
    print("\nElement cannot be inserted!")

def find():
    # Get the value to search for
    key = int(input("Enter an element to find: "))
    hkey = key % TABLE_SIZE
    for i in range(TABLE_SIZE):
        index = (hkey + i * i) % TABLE_SIZE
        if h[index] == key:
            print(f"Value {key} is found at index {index}.")
            break
    else:
        print("\nValue is not found!")

def display():
    # Display the elements in the hash table
    print("\nElements in the hash table are:")
    for i in range(TABLE_SIZE):
        print(f"At Index {i} \t value = {h[i]}")
    print("\n")

def main():
    while True:
        print("\nQUADRATIC PROBING")
        print("1. INSERT")
        print("2. DISPLAY")
        print("3. FIND")
        print("4. EXIT")
        opt = int(input("Enter your choice: "))
        if opt == 1:
            insert()
        elif opt == 2:
            display()
        elif opt == 3:
            find()
        elif opt == 4:
            exit(0)
        else:
            print("Invalid choice. Please try again.")

if __name__ == "__main__":
    main()

Output:
QUADRATIC PROBING
1. INSERT
2. DISPLAY
3. FIND
4. EXIT
Enter the choice: 1
Enter a value to insert into hash table: 50
Enter the choice: 1
Enter a value to insert into hash table: 700
Enter the choice: 1
Enter a value to insert into hash table: 76
Enter the choice: 1
Enter a value to insert into hash table: 85
Enter the choice: 1
Enter a value to insert into hash table: 92
Enter the choice: 1
Enter a value to insert into hash table: 73
Enter the choice: 1
Enter a value to insert into hash table: 101
Enter the choice: 2
Elements in the hash table are:
At Index: 0 Value = 700
At Index: 1 Value = 50
At Index: 2 Value = 85
At Index: 3 Value = 73
At Index: 4 Value = 101
At Index: 5 Value = 92
At Index: 6 Value = 76
Enter the choice: 3
Enter an element to find: 92
Value is found at index: 5
Enter the choice: 4

c. DOUBLE HASHING:
 We use another hash function hash2(x) and look for the i*hash2(x) slot in the ith iteration.
 Let hash(x) be the slot index computed using hash function.
 If slot hash (x) % S is full, then we try (hash (x) + 1*hash2 (x)) % S.
 If (hash (x) + 1*hash2 (x)) % S is also full, then we try (hash (x) + 2*hash2 (x))
% S.
 If (hash (x) + 2*hash2 (x)) % S is also full, then we try (hash (x) + 3*hash2 (x))
% S.
 First hash function is typically hash1 (key) = key % TABLE_SIZE.
 A popular second hash function is: hash2 (key) = PRIME – (key %
PRIME) where PRIME is a prime smaller than the TABLE_SIZE.
 A good second hash function:
 It must never evaluate to zero.
 It must make sure that all cells can be probed.
Example:
 Let us take hash function as “key mod 7” and sequence of keys as 50, 700, 76,
85, 92, 73, 101.
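No program is given for double hashing in these notes, so here is a minimal sketch, assuming hash1(key) = key % 7 and hash2(key) = PRIME - (key % PRIME) with PRIME = 5, as suggested above; the final table layout is what this sketch computes, since the original figure is not reproduced here:

```python
TABLE_SIZE = 7
PRIME = 5                                  # a prime smaller than TABLE_SIZE
table = [None] * TABLE_SIZE

def hash1(key):
    return key % TABLE_SIZE

def hash2(key):
    return PRIME - (key % PRIME)           # never evaluates to zero

def insert(key):
    # probe at hash1(key) + i * hash2(key) in the ith iteration
    for i in range(TABLE_SIZE):
        index = (hash1(key) + i * hash2(key)) % TABLE_SIZE
        if table[index] is None:
            table[index] = key
            return index

for key in [50, 700, 76, 85, 92, 73, 101]:
    insert(key)

print(table)
```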



Comparison of above three:
 Linear probing has the best cache performance but suffers from clustering. Another advantage of linear probing is that it is easy to compute.
 Quadratic probing lies between the two in terms of cache performance and
clustering.
 Double hashing has poor cache performance but no clustering. Double hashing
requires more computation time as two hash functions need to be computed.

Separate Chaining vs. Open Addressing:
1. Chaining is simpler to implement; open addressing requires more computation.
2. In chaining, the hash table never fills up (we can always add more elements to a chain); in open addressing, the table may become full.
3. Chaining is less sensitive to the hash function or load factor; open addressing requires extra care to avoid clustering and a high load factor.
4. Chaining is mostly used when it is unknown how many and how frequently keys may be inserted or deleted; open addressing is used when the frequency and number of keys are known.
5. Cache performance of chaining is not good, as keys are stored in linked lists; open addressing provides better cache performance, as everything is stored in the same table.
6. Chaining wastes space (some parts of the hash table are never used); in open addressing, a slot can be used even if no input maps to it.
7. Chaining uses extra space for links; open addressing needs no links.

