LEARNING MATERIALS-Algorithm-UNIT 3 Modified On 28.08.2024
LEARNING MATERIAL
Department: Computer Science and Technology
Semester: 3rd
Subject: ALGORITHM
UNIT: (3) Searching
Teacher: Debasish Hati
__________________________________________________________________________________________________________
CONTENTS:
1. Linear search algorithm
2. Binary search algorithm
3. Computation of best, average and worst case time complexity of linear and binary search.
4. Binary search tree: algorithm, searching time and space complexity.
5. Balanced search trees: what is the significance and advantage of height balancing? Insertion, deletion and
searching algorithms of different types of balanced search trees and their comparative study.
6. Hashing, hash tables, hash functions, collision resolving techniques.
7. Symbol table
__________________________________________________________________________________________________________
Searching
Searching is a process of finding a particular element among several given elements.
The search is successful if the required element is found.
Otherwise, the search is unsuccessful.
Searching Algorithms-
▫ Searching algorithms are a family of algorithms used to locate a given element in a collection of data.
▫ The search for an element in a given array may be carried out in the following two ways-
1. LINEAR SEARCH :-
Linear Search is the simplest searching algorithm.
It traverses the array sequentially to locate the required element.
It searches for an element by comparing it with each element of the array one by one.
So, it is also called Sequential Search.
Linear Search Algorithm is applied when-
No information is given about the array.
The given array is unsorted or the elements are unordered.
The list of data items is smaller.
a. Best Case:
The best case occurs when the element to be searched is at the first index.
The number of comparisons in this case is 1. Therefore, the Best Case Time Complexity of Linear
Search is O(1).
b. Average Case:
Let there be N distinct numbers: a1, a2, ..., a(N-1), aN
We need to find element P.
There are two cases:
Case 1: The element P can be in N distinct indexes from 0 to N-1.
Case 2: There will be a case when the element P is not present in the list.
There are N sub-cases under Case 1 and 1 sub-case under Case 2. So, there are N+1 distinct cases to consider in total.
If element P is in index K, then Linear Search will do K+1 comparisons.
Number of comparisons for all cases in case 1 = Comparisons if element is in index 0 +
Comparisons if element is in index 1 + ... + Comparisons if element is in index N-1
= 1 + 2 + ... + N
= N * (N+1) / 2 comparisons
If element P is not in the list, then Linear Search will do N comparisons.
Number of comparisons for all cases in case 2 = N
Therefore, total number of comparisons for all N+1 cases = N * (N+1) / 2 + N
= N * ((N+1)/2 + 1)
Average number of comparisons = ( N * ((N+1)/2 + 1) ) / (N+1)
= N/2 + N/(N+1)
The dominant term in "Average number of comparisons" is N/2. So, the Average Case Time
Complexity of Linear Search is O(N).
c. Worst Case:
The worst case occurs when:
The element to be searched is at the last index, or
The element to be searched is not present in the list.
In both cases, the maximum number of comparisons takes place in Linear Search, which is equal to
N comparisons.
Hence, the Worst Case Time Complexity of Linear Search is O(N).
Number of Comparisons in Worst Case: N
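The behaviour analysed above can be summarised in a short iterative sketch (the class and method names below are assumptions for illustration, not part of the syllabus code):

// Minimal iterative linear (sequential) search sketch.
// Returns the index of the first occurrence of key, or -1 if key is absent.
public class LinearSearchDemo
{
    static int linearSearch(int[] arr, int key)
    {
        for (int i = 0; i < arr.length; i++)
        {
            if (arr[i] == key)
            {
                return i;          // best case: key at index 0, only 1 comparison
            }
        }
        return -1;                 // worst case: N comparisons, key absent
    }

    public static void main(String[] args)
    {
        int[] data = {13, 4, 6, 10, 8, 9, 7, 5, 1, 3};
        System.out.println(linearSearch(data, 10));   // prints 3
    }
}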
2. BINARY SEARCH :-
Binary Search is one of the fastest searching algorithms.
It is used for finding the location of an element in a linear array.
It works on the principle of divide and conquer technique.
Binary Search Algorithm can be applied only on Sorted arrays.
So, the elements must be arranged in-
▫ Either ascending order if the elements are numbers.
▫ Or dictionary order if the elements are strings.
To apply binary search on an unsorted array,
▫ First, sort the array using some sorting technique.
▫ Then, use binary search algorithm.
Explanation
▫ Binary Search Algorithm searches an element by comparing it with the middle most element of
the array.
▫ Then, following three cases are possible-
▫ Case-01
If the element being searched is found to be the middle most element, its index is returned.
▫ Case-02
If the element being searched is found to be greater than the middle most element, then its
search is further continued in the right sub array of the middle most element.
▫ Case-03
If the element being searched is found to be smaller than the middle most element, then its
search is further continued in the left sub array of the middle most element.
This iteration keeps on repeating on the sub arrays until the desired element is found or size
of the sub array reduces to zero.
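The three cases above can be captured in a compact iterative sketch (the class and method names are assumptions for illustration); the loop stops when the key is found or the size of the sub array reduces to zero:

// Minimal iterative binary search sketch; arr must be sorted in ascending order.
// Returns the index of key, or -1 if key is not present.
public class BinarySearchDemo
{
    static int binarySearch(int[] arr, int key)
    {
        int left = 0, right = arr.length - 1;
        while (left <= right)                          // sub array is not yet empty
        {
            int mid = left + (right - left) / 2;       // middle most element
            if (arr[mid] == key)
            {
                return mid;                            // Case-01: found
            }
            else if (key > arr[mid])
            {
                left = mid + 1;                        // Case-02: continue in right sub array
            }
            else
            {
                right = mid - 1;                       // Case-03: continue in left sub array
            }
        }
        return -1;                                     // sub array size reduced to zero
    }

    public static void main(String[] args)
    {
        int[] sorted = {10, 20, 30, 100, 150, 200, 300};
        System.out.println(binarySearch(sorted, 150)); // prints 4
    }
}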
Time Complexity:
a. Best Case:
The best case of Binary Search occurs when the element to be searched is the middle element of the list.
In this case, the element is found in the first step itself, and this involves 1 comparison.
Therefore, the Best Case Time Complexity of Binary Search is O(1).
b. Average Case:
Let there be N distinct numbers: a1, a2, ..., a(N-1), aN
We need to find element P.
There are two cases:
Case 1: The element P can be in N distinct indexes from 0 to N-1.
Case 2: There will be a case when the element P is not present in the list.
There are N sub-cases under Case 1 and 1 sub-case under Case 2. So, there are N+1 distinct cases to consider in total.
The number of comparisons Binary Search makes for element P depends on the position of P in the array:
The element at index N/2 can be found in 1 comparison, as Binary Search starts from the middle.
Similarly, in the 2nd comparison, the elements at indexes N/4 and 3N/4 are compared, based on the
result of the 1st comparison.
Along the same lines, in the 3rd comparison, the elements at indexes N/8, 3N/8, 5N/8 and 7N/8 are
compared, based on the result of the 2nd comparison.
Based on this, we know that:
Elements requiring 1 comparison: 1
Elements requiring 2 comparisons: 2
Elements requiring 3 comparisons: 4
Therefore, Elements requiring I comparisons: 2^(I-1)
The maximum number of comparisons = Number of times N is divided by 2 so that result is 1 =
Comparisons to reach 1st element = logN comparisons
I can vary from 1 to logN
Total number of comparisons = 1 * (Elements requiring 1 comparison) + 2 * (Elements requiring 2
comparisons) + ... + logN * (Elements requiring logN comparisons)
Total number of comparisons = 1 * (1) + 2 * (2) + 3 * (4) + ... + logN * (2^(logN-1))
Total number of comparisons = 1 + 4 + 12 + 32 + ... = 2^logN * (logN - 1) + 1
Total number of comparisons = N * (logN - 1) + 1
Total number of cases = N+1
(The single unsuccessful case adds roughly logN further comparisons, which does not affect the dominant term.)
Therefore, average number of comparisons = ( N * (logN - 1) + 1 ) / (N+1)
Average number of comparisons = N * logN / (N+1) - N/(N+1) + 1/(N+1)
The dominant term is N * logN / (N+1) which is approximately logN. Therefore, Average Case
Time Complexity of Binary Search is O(logN).
c. Worst Case:
The worst case of Binary Search occurs when the element to be searched is at the first or last index,
or is not present in the list at all.
In this case, the total number of comparisons required is logN.
Therefore, the Worst Case Time Complexity of Binary Search is O(logN).
It eliminates half of the list from further searching by using the result of each comparison.
It indicates whether the element being searched is before or after the current position in the list.
This information is used to narrow the search.
For large lists of data, it works significantly better than linear search.
Linear Search and Binary Search are two fundamental algorithms used to find elements in a list or array.
Here's a comparison between the two:
1. Definition:
Linear Search: A straightforward search algorithm that checks each element in a list one by one from the start
to the end until the desired element is found or the list is exhausted.
Binary Search: A more efficient search algorithm that works on sorted lists by repeatedly dividing the search
interval in half. It compares the target value with the middle element and eliminates half of the list from
further consideration.
2. Time Complexity:
Linear Search: O(n) in the worst and average case, O(1) in the best case.
Binary Search: O(log n) in the worst and average case, O(1) in the best case.
3. Space Complexity:
Linear Search: O(1) auxiliary space for the iterative version.
Binary Search: O(1) auxiliary space for the iterative version; the recursive version uses O(log n) stack space.
4. Input Requirements:
Linear Search: No special requirements; it works on both sorted and unsorted lists.
Binary Search: Requires the list to be sorted beforehand. If the list is unsorted, it must first be sorted, which
would take additional time.
5. Efficiency:
Linear Search: Less efficient for large lists because it potentially needs to check every element.
Binary Search: Much more efficient for large sorted lists, as it reduces the number of elements to check
significantly with each step.
6. Use Cases:
Linear Search: Useful when dealing with small or unsorted datasets, or when the overhead of sorting the list
for Binary Search is not justified.
Binary Search: Preferred for large, sorted datasets where the time savings from the logarithmic complexity
outweigh the costs of sorting (if needed).
7. Example:
Code examples for Linear Search and Binary Search are given in the answers to the short answer questions at the end of this unit.
Linear Search is simple and works on any list but is less efficient for large datasets.
Binary Search is much faster but requires a sorted list and is more complex to implement.
Number of distinct binary search trees with n distinct keys = 2nCn / (n+1).
For n = 3 keys: 2×3C3 / (3+1) = 6C3 / 4 = 20 / 4 = 5.
If three distinct keys are A, B and C, then the 5 distinct binary search trees are-
Example- Construct a binary search tree by inserting the following elements in the given order: 50, 70, 60, 20, 90, 10, 40, 100.
1. Insert 50-
As the tree is empty, 50 becomes the root node.
2. Insert 70-
As 70 > 50, so insert 70 to the right of 50.
3. Insert 60-
As 60 > 50, so insert 60 to the right of 50.
As 60 < 70, so insert 60 to the left of 70.
4. Insert 20-
As 20 < 50, so insert 20 to the left of 50.
5. Insert 90-
As 90 > 50, so insert 90 to the right of 50.
As 90 > 70, so insert 90 to the right of 70.
6. Insert 10-
As 10 < 50, so insert 10 to the left of 50.
As 10 < 20, so insert 10 to the left of 20.
7. Insert 40-
As 40 < 50, so insert 40 to the left of 50.
As 40 > 20, so insert 40 to the right of 20.
8. Insert 100-
As 100 > 50, so insert 100 to the right of 50.
As 100 > 70, so insert 100 to the right of 70.
As 100 > 90, so insert 100 to the right of 90.
A. Problem-01:
The number of nodes in the left subtree and right subtree of the root respectively is _____.
a. (4, 7)
b. (7, 4)
c. (8, 3)
d. (3, 8)
Solution-
Using the above discussed steps, we will construct the binary search tree.
The resultant binary search tree will be-
Clearly,
▫ Number of nodes in the left subtree of the root = 7
▫ Number of nodes in the right subtree of the root = 4
Thus, Option (b) is correct.
B. Problem-02:
How many distinct binary search trees can be constructed out of 4 distinct keys?
a. 5
b. 14
c. 24
d. 35
Solution-
Number of distinct binary search trees with n distinct keys = 2nCn / (n+1).
For n = 4: 2×4C4 / (4+1) = 8C4 / 5 = 70 / 5 = 14.
Thus, Option (b) is correct.
BST Traversal-
A binary search tree is traversed in exactly the same way a binary tree is traversed.
In other words, BST traversal is same as binary tree traversal.
Example- Consider the following binary search tree-
Now, let us write the traversal sequences for this binary search tree-
1. Preorder Traversal- 100 , 20 , 10 , 30 , 200 , 150 , 300
2. Inorder Traversal- 10 , 20 , 30 , 100 , 150 , 200 , 300
3. Postorder Traversal- 10 , 30 , 20 , 150 , 300 , 200 , 100
Important Notes-
Note-01: Inorder traversal of a binary search tree always yields all the nodes in increasing order.
Note-02: Unlike a general binary tree, a binary search tree can be constructed from only its preorder or only its postorder traversal result.
This is because the inorder traversal can be obtained by sorting the given result in increasing order.
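Note-01 can be checked with a small recursive traversal sketch (the Node class, class name and method names below are assumptions for illustration):

import java.util.ArrayList;
import java.util.List;

// Inorder traversal visits left subtree, node, right subtree, so for a BST it
// produces the keys in increasing order (Note-01 above).
public class InorderDemo
{
    static class Node
    {
        int key; Node left, right;
        Node(int key) { this.key = key; }
    }

    static void inorder(Node root, List<Integer> out)
    {
        if (root == null) return;
        inorder(root.left, out);    // visit the left subtree first
        out.add(root.key);          // then the node itself
        inorder(root.right, out);   // then the right subtree
    }

    public static void main(String[] args)
    {
        // The BST from the traversal example above (root 100).
        Node root = new Node(100);
        root.left = new Node(20);   root.left.left = new Node(10);   root.left.right = new Node(30);
        root.right = new Node(200); root.right.left = new Node(150); root.right.right = new Node(300);
        List<Integer> out = new ArrayList<>();
        inorder(root, out);
        System.out.println(out);    // [10, 20, 30, 100, 150, 200, 300]
    }
}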
1. Search Operation-
▫ Search Operation is performed to search a particular element in the Binary Search Tree.
▫ Rules- For searching a given key in the BST,
Compare the key with the value of root node.
If the key is present at the root node, then return the root node.
If the key is greater than the root node value, then recur for the root node’s right subtree.
If the key is smaller than the root node value, then recur for the root node’s left subtree.
▫ Example-
Consider key = 45 has to be searched in the given BST-
We start our search from the root node 25.
As 45 > 25, so we search in 25’s right subtree.
As 45 < 50, so we search in 50’s left subtree.
As 45 > 35, so we search in 35’s right subtree.
As 45 > 44, so we search in 44’s right subtree but 44 has no subtrees.
So, we conclude that 45 is not present in the above BST.
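The rules above translate directly into a small recursive search sketch (the Node class and the name search are assumptions for illustration):

// Recursive BST search following the rules listed above.
public class BstSearchDemo
{
    static class Node
    {
        int key; Node left, right;
        Node(int key) { this.key = key; }
    }

    // Returns the node containing key, or null if key is not present in the BST.
    static Node search(Node root, int key)
    {
        if (root == null || root.key == key)
        {
            return root;                      // empty subtree, or key found at the root
        }
        if (key > root.key)
        {
            return search(root.right, key);   // recur for the right subtree
        }
        return search(root.left, key);        // recur for the left subtree
    }
}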
2. Insertion Operation-
▫ Insertion Operation is performed to insert an element in the Binary Search Tree.
▫ Rules- The insertion of a new key always takes place as the child of some leaf node.
▫ For finding out the suitable leaf node,
Search the key to be inserted from the root node till some leaf node is reached.
Once a leaf node is reached, insert the key as child of that leaf node.
▫ Example-
Consider the following example where key = 40 is inserted in the given BST-
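A minimal recursive insertion sketch that follows the rule above, walking down from the root until an empty position (the child of some leaf) is found (the Node class and the name insert are assumptions):

// Recursive BST insertion: the new key always ends up as the child of some leaf node.
public class BstInsertDemo
{
    static class Node
    {
        int key; Node left, right;
        Node(int key) { this.key = key; }
    }

    static Node insert(Node root, int key)
    {
        if (root == null)
        {
            return new Node(key);                 // empty position reached: insert here
        }
        if (key < root.key)
        {
            root.left = insert(root.left, key);   // search further in the left subtree
        }
        else
        {
            root.right = insert(root.right, key); // search further in the right subtree
        }
        return root;                              // subtree root is unchanged
    }
}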
▫ Case-02: Deletion Of A Node Having Only One Child- Replace the node with its child, i.e. connect the
child of the deleted node directly to the deleted node's parent.
Example-
Consider the following example where the node with value = 30 is deleted from the BST-
▫ Case-03: Deletion Of A Node Having Two Children- A node with two children may be deleted from the
BST in the following two ways-
Method-01: Replace the node with its inorder predecessor (the largest value in its left subtree) and then
delete that predecessor from its original position.
Method-02: Replace the node with its inorder successor (the smallest value in its right subtree) and then
delete that successor from its original position.
Time and Space Complexity of BST Search-
Time complexity: In the worst case (a skewed tree) we have to traverse from the root to the deepest leaf
node, so the height of the tree becomes n; as seen above, the time taken is proportional to the height of
the tree, so the worst case time complexity becomes O(n).
Space complexity: The space complexity of recursively searching a node in a BST is O(n), where n is the
depth of the tree (which in the worst case equals the number of nodes), since at any point of time the
maximum number of stack frames that can be present in memory is n.
A binary tree is height-balanced if:
The difference between the heights of the left and the right subtree for any node is not more than
one.
The left subtree is balanced.
The right subtree is balanced.
Note: An empty tree is also height-balanced.
The above tree is a binary search tree and also a height-balanced tree.
Suppose we want to find the value 79 in the above tree. First, we compare it with the value of the
root node. Since 79 is greater than 35, we move to its right child, i.e., 48. Since 79 is greater than
48, we move to the right child of 48.
The value of the right child of node 48 is 79. The number of hops required to search the element 79 is
2.
Similarly, any element can be found with at most 2 jumps because the height of the tree is 2.
So it can be seen that any value in a balanced binary tree can be searched in O(logN) time where N is
the number of nodes in the tree. But if the tree is not height-balanced then in the worst case, a
search operation can take O(N) time.
To check whether a tree is height-balanced, use recursion and visit the left subtree and right subtree of each node (a minimal sketch follows below):
o Check the height of the left subtree and the right subtree.
o If the absolute difference between their heights is at most 1, then that node is height-balanced.
o Otherwise, that node, and hence the whole tree, is not balanced.
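A minimal sketch of this recursive check (names are assumptions; the helper returns the subtree height when the subtree is balanced, and -1 as soon as some node violates the condition):

// Height-balance check for a binary tree.
public class BalanceCheckDemo
{
    static class Node
    {
        int key; Node left, right;
        Node(int key) { this.key = key; }
    }

    static int checkHeight(Node root)
    {
        if (root == null) return 0;                    // empty tree: balanced, height 0
        int leftHeight = checkHeight(root.left);
        if (leftHeight == -1) return -1;               // left subtree already unbalanced
        int rightHeight = checkHeight(root.right);
        if (rightHeight == -1) return -1;              // right subtree already unbalanced
        if (Math.abs(leftHeight - rightHeight) > 1)
        {
            return -1;                                 // this node violates the condition
        }
        return 1 + Math.max(leftHeight, rightHeight);  // height of this balanced subtree
    }

    static boolean isBalanced(Node root)
    {
        return checkHeight(root) != -1;
    }
}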
AVL Tree-
AVL trees are a special kind of binary search tree.
In an AVL tree, the heights of the left subtree and right subtree of every node differ by at most one.
AVL trees are also called self-balancing binary search trees.
Example- Following tree is an example of AVL tree-
Balance Factor-
The balance factor is defined for every node.
Balance factor of a node = Height of its left subtree – Height of its right subtree
In an AVL tree, the balance factor of every node is either –1, 0 or +1.
Kinds of Rotations-
There are 4 kinds of rotations possible in AVL Trees (see the sketch after this list)-
1. LL Rotation:
2. RR Rotation:
3. LR Rotation:
4. RL Rotation:
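As a sketch, the LL and RR cases are fixed by a single rotation that lifts a child above its parent; the LR and RL cases combine the two (for LR, left-rotate the left child and then right-rotate the node, and symmetrically for RL). The class and field names below are assumptions for illustration:

// Single rotations used to rebalance an AVL node.
public class AvlRotationDemo
{
    static class Node
    {
        int key; Node left, right;
        Node(int key) { this.key = key; }
    }

    // Right rotation: fixes an LL imbalance at node y.
    static Node rotateRight(Node y)
    {
        Node x = y.left;
        Node t2 = x.right;   // subtree that changes parent
        x.right = y;
        y.left = t2;
        return x;            // x becomes the new root of this subtree
    }

    // Left rotation: fixes an RR imbalance at node x.
    static Node rotateLeft(Node x)
    {
        Node y = x.right;
        Node t2 = y.left;    // subtree that changes parent
        y.left = x;
        x.right = t2;
        return y;            // y becomes the new root of this subtree
    }
}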
▫ Solution-
1. Step-01: Insert 50
2. Step-02: Insert 20
As 20 < 50, so insert 20 in 50’s left sub tree.
3. Step-03: Insert 60
As 60 > 50, so insert 60 in 50’s right sub tree.
4. Step-04: Insert 10
As 10 < 50, so insert 10 in 50’s left sub tree.
As 10 < 20, so insert 10 in 20’s left sub tree.
5. Step-05: Insert 8
8. Step-08: Insert 46
As 46 > 20, so insert 46 in 20’s right sub tree.
As 46 < 50, so insert 46 in 50’s left sub tree.
As 46 > 32, so insert 46 in 32’s right sub tree.
9. Step-09: Insert 11
As 11 < 20, so insert 11 in 20’s left sub tree.
As 11 > 10, so insert 11 in 10’s right sub tree.
As 11 < 15, so insert 11 in 15’s left sub tree.
▫ Example: Delete the node 30 from the AVL tree shown in the following image.
▫ Solution: In this case, the node B has balance factor 0, therefore the tree will be rotated by using
R0 rotation as shown in the following image. The node B(10) becomes the root, while the node A
is moved to its right. The right child of node B will now become the left child of node A.
▫ Solution: Deleting 55 from the AVL tree disturbs the balance factor of the node 50, i.e. node A,
which becomes the critical node. This is the condition for R1 rotation, in which node A is moved to
its right (shown in the image below). The right subtree of B (i.e. 45) now becomes the left subtree of A.
The process involved in the solution is shown in the following image.
▫ Solution: In this case, node B has balance factor -1. Deleting the node 60 disturbs the balance factor
of the node 50, therefore it needs an R-1 rotation. The node C, i.e. 45, becomes the root of the tree,
with node B (40) and node A (50) as its left and right children.
Hashing in Data Structure
Hashing is a well-known technique for searching a particular element among several elements.
It minimizes the number of comparisons while performing the search.
Advantage-
Hashing is extremely efficient.
On average, the time taken to perform the search does not depend upon the total number of elements.
With a good hash function, it completes the search with constant average time complexity O(1).
Hashing Mechanism-
An array data structure called a hash table is used to store the data items.
Based on the hash key value, data items are inserted into the hash table.
A hash function takes a data item as input and returns a small integer value, called the hash value, as output.
The hash value of the data item is then used as an index for storing it in the hash table.
A good hash function has the following properties (a minimal example follows below)-
It is efficiently computable.
It minimizes the number of collisions.
It distributes the keys uniformly over the table.
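A minimal example of such a hash function, of the kind used in the practice problems below (key mod table size; the names are assumptions for illustration):

// Minimal modular hash function sketch: maps an integer key to a bucket index.
public class HashFunctionDemo
{
    static int hash(int key, int tableSize)
    {
        return key % tableSize;
    }

    public static void main(String[] args)
    {
        System.out.println(hash(50, 7));    // 1 -> bucket-1
        System.out.println(hash(700, 7));   // 0 -> bucket-0
        System.out.println(hash(85, 7));    // 1 -> collides with key 50
    }
}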
Collision in Hashing-
A collision occurs when two different keys hash to the same bucket of the hash table. Collisions are
unavoidable and are resolved using techniques such as separate chaining and open addressing, discussed below.
If the load factor (α) is kept constant, then the average time complexity of Insert, Search and Delete is Θ(1).
PRACTICE PROBLEM BASED ON SEPARATE CHAINING-
Problem- Using the hash function ‘key mod 7’, insert the following sequence of keys in the hash table- 50,
700, 76, 85, 92, 73 and 101. Use separate chaining technique for collision resolution.
Solution- The given sequence of keys will be inserted in the hash table as-
1. Step-01:
Draw an empty hash table.
For the given hash function, the possible range of hash values is [0, 6].
So, draw an empty hash table consisting of 7 buckets as-
2. Step-02:
Insert the given keys in the hash table one by one.
The first key to be inserted in the hash table = 50.
Bucket of the hash table to which key 50 maps = 50 mod 7 = 1.
So, key 50 will be inserted in bucket-1 of the hash table as-
3. Step-03:
The next key to be inserted in the hash table = 700.
Bucket of the hash table to which key 700 maps = 700 mod 7 = 0.
So, key 700 will be inserted in bucket-0 of the hash table as-
4. Step-04:
The next key to be inserted in the hash table = 76.
Bucket of the hash table to which key 76 maps = 76 mod 7 = 6.
So, key 76 will be inserted in bucket-6 of the hash table as-
5. Step-05:
The next key to be inserted in the hash table = 85.
Bucket of the hash table to which key 85 maps = 85 mod 7 = 1.
Since bucket-1 is already occupied, so collision occurs.
Separate chaining handles the collision by creating a linked list to bucket-1.
So, key 85 will be inserted in bucket-1 of the hash table as-
6. Step-06:
The next key to be inserted in the hash table = 92.
Bucket of the hash table to which key 92 maps = 92 mod 7 = 1.
Since bucket-1 is already occupied, so collision occurs.
Separate chaining handles the collision by creating a linked list to bucket-1.
So, key 92 will be inserted in bucket-1 of the hash table as-
7. Step-07:
The next key to be inserted in the hash table = 73.
Bucket of the hash table to which key 73 maps = 73 mod 7 = 3.
So, key 73 will be inserted in bucket-3 of the hash table as-
8. Step-08:
The next key to be inserted in the hash table = 101.
Bucket of the hash table to which key 101 maps = 101 mod 7 = 3.
Since bucket-3 is already occupied, so collision occurs.
Separate chaining handles the collision by creating a linked list to bucket-3.
So, key 101 will be inserted in bucket-3 of the hash table as-
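The whole sequence of steps above can be reproduced with a small separate-chaining sketch (the class and method names are assumptions for illustration):

import java.util.LinkedList;

// Separate chaining: each bucket holds a linked list of the keys that hash to it.
public class ChainedHashTable
{
    private final LinkedList<Integer>[] buckets;

    @SuppressWarnings("unchecked")
    public ChainedHashTable(int size)
    {
        buckets = new LinkedList[size];
        for (int i = 0; i < size; i++) buckets[i] = new LinkedList<>();
    }

    public void insert(int key)
    {
        buckets[key % buckets.length].add(key);   // a collision simply extends the chain
    }

    public boolean search(int key)
    {
        return buckets[key % buckets.length].contains(key);
    }

    public static void main(String[] args)
    {
        ChainedHashTable t = new ChainedHashTable(7);
        for (int key : new int[]{50, 700, 76, 85, 92, 73, 101}) t.insert(key);
        // Resulting chains (matching the steps above):
        // bucket-0: 700    bucket-1: 50 -> 85 -> 92    bucket-3: 73 -> 101    bucket-6: 76
    }
}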
2. Open Addressing-
Unlike separate chaining, all the keys are stored inside the hash table.
No key is stored outside the hash table.
Techniques used for open addressing are-
1. Linear Probing
2. Quadratic Probing
3. Double Hashing
Operations in Open Addressing-
▫ Let us discuss how operations are performed in open addressing-
▫ Insert Operation-
Hash function is used to compute the hash value for a key to be inserted
Hash value is then used as an index to store the key in the hash table.
▫ In case of collision,
Probing is performed until an empty bucket is found.
Once an empty bucket is found, the key is inserted.
Probing is performed in accordance with the technique used for open addressing.
▫ Search Operation-
To search any particular key, its hash value is first computed, and the table is then probed in the same
sequence as during insertion until either the key is found or an empty bucket is reached (which means
the key is not present).
1. Linear Probing-
When a collision occurs, we linearly probe the next bucket, then the next, and so on, until an empty
bucket is found.
Disadvantage-
The main problem with linear probing is clustering.
Many consecutive elements form groups.
Then, it takes time to search an element or to find an empty bucket.
Time Complexity-
The worst case time to search an element with linear probing is O(table size).
This is because even if only one element is present and all other elements have been deleted,
the “deleted” markers left in the hash table can force the search to scan the entire table.
2. Quadratic Probing-
When a collision occurs, in the i-th iteration we probe the bucket at offset i² from the original hash
position (i.e. offsets 1, 4, 9, ...).
We keep probing until an empty bucket is found.
3. Double Hashing-
We use another hash function hash2(x), and in the i-th iteration we probe the bucket at offset
i * hash2(x) from the original hash position.
It requires more computation time, as two hash functions need to be computed.
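The three probe sequences can be summarised in one place: the bucket examined in the i-th probe is computed as below (h1 and h2 stand for the primary and secondary hash values; names are assumptions for illustration):

// Index examined on the i-th probe for the three open addressing techniques (table size m).
public class ProbeDemo
{
    static int linearProbe(int h1, int i, int m)             { return (h1 + i) % m; }
    static int quadraticProbe(int h1, int i, int m)          { return (h1 + i * i) % m; }
    static int doubleHashProbe(int h1, int h2, int i, int m) { return (h1 + i * h2) % m; }
}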
Comparison of Open Addressing Techniques-
▫ In open addressing, the value of the load factor always lies between 0 and 1.
▫ This is because-
In open addressing, all the keys are stored inside the hash table.
So, size of the table is always greater or at least equal to the number of keys stored in the table.
PRACTICE PROBLEM BASED ON OPEN ADDRESSING-
Problem- Using the hash function ‘key mod 7’, insert the following sequence of keys in the hash table- 50,
700, 76, 85, 92, 73 and 101. Use linear probing technique for collision resolution.
Solution- The given sequence of keys will be inserted in the hash table as-
1. Step-01:
Draw an empty hash table.
For the given hash function, the possible range of hash values is [0, 6].
So, draw an empty hash table consisting of 7 buckets as-
2. Step-02:
Insert the given keys in the hash table one by one.
The first key to be inserted in the hash table = 50.
Bucket of the hash table to which key 50 maps = 50 mod 7 = 1.
So, key 50 will be inserted in bucket-1 of the hash table as-
3. Step-03:
The next key to be inserted in the hash table = 700.
Bucket of the hash table to which key 700 maps = 700 mod 7 = 0.
So, key 700 will be inserted in bucket-0 of the hash table as-
4. Step-04:
The next key to be inserted in the hash table = 76.
Bucket of the hash table to which key 76 maps = 76 mod 7 = 6.
So, key 76 will be inserted in bucket-6 of the hash table as-
5. Step-05:
The next key to be inserted in the hash table = 85.
Bucket of the hash table to which key 85 maps = 85 mod 7 = 1.
Since bucket-1 is already occupied, so collision occurs.
To handle the collision, linear probing technique keeps probing linearly until an empty bucket is
found.
The first empty bucket is bucket-2.
So, key 85 will be inserted in bucket-2 of the hash table as-
6. Step-06:
The next key to be inserted in the hash table = 92.
Bucket of the hash table to which key 92 maps = 92 mod 7 = 1.
Since bucket-1 is already occupied, so collision occurs.
To handle the collision, linear probing technique keeps probing linearly until an empty bucket is
found.
The first empty bucket is bucket-3.
So, key 92 will be inserted in bucket-3 of the hash table as-
7. Step-07:
The next key to be inserted in the hash table = 73.
Bucket of the hash table to which key 73 maps = 73 mod 7 = 3.
Since bucket-3 is already occupied, so collision occurs.
To handle the collision, linear probing technique keeps probing linearly until an empty bucket is
found.
The first empty bucket is bucket-4.
So, key 73 will be inserted in bucket-4 of the hash table as-
8. Step-08:
The next key to be inserted in the hash table = 101.
Bucket of the hash table to which key 101 maps = 101 mod 7 = 3.
Since bucket-3 is already occupied, so collision occurs.
To handle the collision, linear probing technique keeps probing linearly until an empty bucket is
found.
The first empty bucket is bucket-5.
So, key 101 will be inserted in bucket-5 of the hash table as-
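The same insertions can be reproduced with a small linear probing sketch (the class and method names are assumptions; -1 marks an empty bucket, and the sketch assumes the table never becomes completely full before an insertion):

import java.util.Arrays;

// Open addressing with linear probing.
public class LinearProbingTable
{
    private static final int EMPTY = -1;
    private final int[] table;

    public LinearProbingTable(int size)
    {
        table = new int[size];
        Arrays.fill(table, EMPTY);
    }

    public void insert(int key)
    {
        int index = key % table.length;
        while (table[index] != EMPTY)             // collision: keep probing linearly
        {
            index = (index + 1) % table.length;   // wrap around at the end of the table
        }
        table[index] = key;
    }

    public static void main(String[] args)
    {
        LinearProbingTable t = new LinearProbingTable(7);
        for (int key : new int[]{50, 700, 76, 85, 92, 73, 101}) t.insert(key);
        // Final table (matching the steps above):
        // index:  0    1    2    3    4    5    6
        // key  : 700   50   85   92   73  101   76
    }
}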
Data structure for symbol table
A compiler contains two types of symbol tables: a global symbol table and scope symbol tables.
The global symbol table can be accessed by all the procedures, whereas a scope symbol table can be
accessed only within the scope it belongs to and that scope's children.
The scopes of names and their symbol tables are arranged in a hierarchical structure, as shown below:
int value=10;

void sum_num()
{
    int num_1;
    int num_2;

    {
        int num_3;
        int num_4;
    }

    int num_5;

    {
        int num_6;
        int num_7;
    }
}

void sum_id()
{
    int id_1;
    int id_2;

    {
        int id_3;
        int id_4;
    }

    int num_5;
}
The above program can be represented in a hierarchical data structure of symbol tables:
The global symbol table contains one global variable and two procedure names. The name mentioned in
the sum_num table is not available for sum_id and its child tables.
The data structure hierarchy of symbol tables is maintained by the semantic analyzer. To search for a name
in the symbol tables, the following algorithm is used (a minimal sketch follows the list):
▫ First, the symbol is searched in the current symbol table.
▫ If the name is found, the search is complete; otherwise, the name is searched in the parent's symbol
table, and so on, until
▫ the name is found or the global symbol table has been searched.
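This lookup rule can be sketched with a small hierarchical symbol table in which each scope table keeps a reference to its parent (the class and method names are assumptions for illustration):

import java.util.HashMap;
import java.util.Map;

// Hierarchical symbol table: lookup searches the current scope, then its ancestors.
public class SymbolTable
{
    private final Map<String, String> symbols = new HashMap<>(); // name -> attribute information
    private final SymbolTable parent;                            // null for the global symbol table

    public SymbolTable(SymbolTable parent) { this.parent = parent; }

    public void define(String name, String info)
    {
        symbols.put(name, info);
    }

    // Search the current table first, then the parent tables, up to the global table.
    public String lookup(String name)
    {
        for (SymbolTable t = this; t != null; t = t.parent)
        {
            if (t.symbols.containsKey(name)) return t.symbols.get(name);
        }
        return null;   // not found anywhere, including the global symbol table
    }
}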
15. What is the maximum height of an AVL tree with p nodes?
a) p
b) log(p)
c) log(p)/2
d) p⁄2
SAQ:
1. Construct a binary search tree from the below information.
The preorder traversal of the binary search tree is 10, 4, 3, 5, 11, 12.
2. Write a code to find the maximum element in a binary search tree?
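One possible answer, as a minimal sketch (the class and method names are assumptions): in a BST the maximum key is stored in the right-most node, so keep following right children.

// Maximum element of a BST: walk down the right spine until there is no right child.
public class BstMaxDemo
{
    static class Node
    {
        int key; Node left, right;
        Node(int key) { this.key = key; }
    }

    static int findMax(Node root)
    {
        if (root == null)
        {
            throw new IllegalArgumentException("empty tree has no maximum");
        }
        while (root.right != null)
        {
            root = root.right;   // the maximum node has no right child
        }
        return root.key;
    }
}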
// Fragment of a recursive avl() height check: a return value of -1 signals
// that the corresponding subtree is not height-balanced.
left_tree_height = avl(left_of_root)
if(left_tree_height == -1)
    return left_tree_height
right_tree_height = avl(right_of_root)
if(right_tree_height == -1)
    return right_tree_height
5. Write a code which uses recursion for linear search.
// Recursive linear search: 'first' is the index currently being examined and 'last'
// is one past the final index, e.g. call linSearch(arr, 0, arr.length, key).
// Prints the index of the key, or -1 if the key is not present.
public void linSearch(int[] arr, int first, int last, int key)
{
    if(first == last)
    {
        System.out.print("-1");              // whole range examined, key not found
    }
    else
    {
        if(arr[first] == key)
        {
            System.out.print(first);         // key found at the current index
        }
        else
        {
            linSearch(arr, first + 1, last, key);   // examine the next index
        }
    }
}
6. Write a code that does binary search using recursion.
// Recursive binary search on a sorted (ascending) array.
// Returns the index of the key, or -1 if the key is not present.
public static int recursive(int arr[], int low, int high, int key)
{
    if(low > high)
    {
        return -1;                    // empty range: key is not present
    }
    int mid = low + (high - low)/2;   // middle index (written to avoid overflow)
    if(arr[mid] == key)
    {
        return mid;
    }
    else if(arr[mid] < key)
    {
        return recursive(arr, mid+1, high, key);   // search the right half
    }
    else
    {
        return recursive(arr, low, mid-1, key);    // search the left half
    }
}
Binary search is faster than linear search, although it is slightly more complex to implement. In binary
search, the array to be searched is repeatedly divided into two parts, one of which is ignored because it
cannot contain the required element.
One essential condition for binary search is that the array to be searched must be arranged in sorted order.
9. Define hashing?
Hashing is a technique that maps keys to indexes of a hash table using a hash function, so that elements
can be stored and searched quickly. The collection of data is stored in a hash table whose size is usually
fixed and larger than the number of elements to be stored.
The load factor is defined as the ratio of the number of elements stored to the size of the hash table.
Static hashing- In static hashing, the number of buckets in the hash table remains fixed, and the process is
carried out without the use of an index structure.
Dynamic hashing- It allows dynamic allocation of buckets, i.e. buckets can be allocated according to the
demand of the database, making this approach more efficient.
Rehashing, also called double hashing, is a technique used in hash tables to resolve hash collisions, i.e.
cases when two different keys produce the same hash value.
It is a popular collision-resolution technique in open-addressed hash tables.
BROAD:
1. Search for the given key using both linear search and binary search in the following data:
a. 13, 4, 6, 10, 8, 9, 7, 5, 1, 3 – search(10)
b. 50, 40, 10, 30, 38, 92, 46, 32, 87, 24, 50 – search(32)
c. 17, 30, 27, 1, 7, 54, 67, 45, 72, 83 – search(83)
2. Create binary search tree for the following values:
a. 13, 4, 6, 10, 8, 9, 7, 5, 1, 3
b. 50, 40, 10, 30, 38, 92, 46, 32, 87, 24, 50
c. 17, 30, 27, 1, 7, 54, 67, 45, 72, 83
3. Create an AVL tree for the following values and if the tree is unbalanced then balance it:
a. 13, 4, 6, 10, 8, 9, 7, 5, 1, 3. Then delete (7), delete(3), insert(2)
b. 50, 40, 10, 30, 38, 92, 46, 32, 87, 24. Then delete(24), insert(11)
c. 17, 30, 27, 1, 7, 54, 67, 45, 72, 83. Then delete(1), delete(72), insert(17)
Answer:
1. Different Types of Searching Techniques in C
In C, several searching algorithms can be implemented, each with its specific use cases. The most common
ones are:
Linear Search: Sequentially checks each element of the list until the desired element is found or the list ends.
Binary Search: Efficiently searches a sorted list by repeatedly dividing the search interval in half.
Jump Search: A combination of Linear and Binary Search, where the array is divided into blocks, and Linear
Search is performed within a block.
Interpolation Search: Similar to Binary Search, but instead of dividing the list in half, it estimates the position
of the target value based on the values of the endpoints.
Exponential Search: Useful for unbounded or infinite lists, it first finds a range where the element could be
and then applies Binary Search within that range.
Fibonacci Search: Similar to Binary Search but uses Fibonacci numbers to divide the list.
Hashing: Searches for an element by transforming the search key into an index in an array using a hash
function.
2. Which is the Best Searching Algorithm and Why?
For Unsorted Data: Linear Search is the simplest option but not the most efficient for large datasets.
For Sorted Data: Binary Search is typically the best general-purpose algorithm due to its O(log n) time
complexity, making it highly efficient for large sorted datasets.
For Constant Time Search: Hashing is the most efficient, with an average time complexity of O(1) for search
operations, assuming a good hash function and proper collision handling.
Binary Search is often considered the best for sorted datasets because of its logarithmic time complexity,
which is significantly faster than Linear Search for large datasets.
// If the target is smaller than the mid, it must be in the left subarray
if (arr[mid] > x)
    return binarySearch(arr, left, mid - 1, x);
// Otherwise the target is larger, so it must be in the right subarray
return binarySearch(arr, mid + 1, right, x);
Explanation:
The function binarySearch is called recursively with updated left and right pointers until the target
element x is found or the subarray has no more elements to check.
The middle index mid is calculated, and if the middle element is the target, the index is returned.
If the target is smaller than the middle element, the search continues in the left subarray.
If the target is larger, the search continues in the right subarray.
Unsorted Data: When the list is not sorted, Linear Search is the simplest and most straightforward approach.
Small Datasets: For small datasets, the overhead of more complex algorithms like Binary Search might not be
justified, making Linear Search a practical choice.
Single Pass Requirement: When only a single pass through the data is possible or when the data is being
streamed, Linear Search is often the only viable option.
Data Structures with Non-Contiguous Memory: In data structures like linked lists, where random access is not
possible, Linear Search is typically used.
1. Initialize: Set left = 0 and right = n - 1 where n is the number of elements in the array.
2. Repeat:
o Calculate the middle index: mid = left + (right - left) / 2.
o If arr[mid] == x, return mid.
o If arr[mid] > x, update right = mid - 1.
o If arr[mid] < x, update left = mid + 1.
3. End Condition: If left > right, the element is not in the array, return -1.
O(log2 n): At each step, Binary Search reduces the search space by half. If you start with n elements, the
number of elements left to search after each step is n/2, then n/4, then n/8, and so on. The process
continues until the search space is reduced to 1 element.
o After k steps the remaining search space is n / 2^k; setting n / 2^k = 1 gives k = log2(n). Hence the
time complexity is O(log n).
Hashing is a process in computer science used to map data (usually in the form of keys) to a fixed-size array
or hash table using a hash function. The purpose of hashing is to enable fast data retrieval, making search,
insert, and delete operations efficient, typically achieving O(1) average time complexity.
1. Hash Function: A hash function takes an input (the key) and returns an index in the array where the value
associated with that key should be stored.
o Example: index = hash(key) % table_size
2. Hash Table: The array where the data is stored based on the index generated by the hash function.
3. Collisions: A collision occurs when two keys produce the same index in the hash table.
Since collisions are inevitable due to the limited size of the hash table and the possibility of different keys
producing the same hash value, various techniques are used to resolve them:
1. Chaining:
Description: In chaining, each slot of the hash table points to a linked list (or another data structure) of
elements that hash to the same index. When a collision occurs, the new key-value pair is added to the
linked list at the appropriate index.
Advantages:
o Simple to implement.
o Efficient if the hash function distributes keys uniformly.
o Handles a large number of collisions gracefully.
Disadvantages:
o Requires additional memory for the pointers in the linked lists.
o Performance degrades if too many elements hash to the same index, leading to longer linked lists.
Example:
o For example, with a hash table of size 10, the keys 15, 25 and 35 all hash to index 5, so they are
stored in a linked list at that index.
2. Open Addressing:
Description: In open addressing, all elements are stored directly in the hash table itself. When a
collision occurs, the algorithm searches for the next available slot in the table to place the element.
There are several methods to find this next slot:
o Linear Probing:
If a collision occurs at index i, the algorithm checks i+1, i+2, ... until an empty slot is
found.
Advantages: Simple to implement.
Disadvantages: Clustering can occur, where consecutive elements lead to more collisions,
reducing efficiency.
o Quadratic Probing:
Instead of checking the next consecutive slot, quadratic probing checks i+1², i+2², i+3², ...
(i.e. i+1, i+4, i+9, ...) until an empty slot is found.
Advantages: Reduces clustering compared to linear probing.
Disadvantages: Can still lead to secondary clustering.
o Double Hashing:
Uses a second hash function to calculate the step size for probing after a collision. For
example, if a collision occurs at index i, the next index to check would be i + step_size,
where step_size = hash2(key).
Advantages: Reduces clustering more effectively.
Disadvantages: Requires careful choice of the second hash function to ensure uniform
distribution.
3. Rehashing:
Description: When a collision occurs and the hash table becomes too full, a new, larger hash table is created,
and all existing elements are hashed again into the new table.
Advantages: Helps in maintaining efficient search operations as the load factor (number of elements/table
size) increases.
Disadvantages: Rehashing can be time-consuming and computationally expensive.