Unit-5 - Data Structures Using C
(Sequential Search)
What is Search?
Search is the process of finding a given value in a list of values. In other words, searching is the process of locating the position of a given value in a list of values.
Example
Consider the following list of elements and a search element...
#include <stdio.h>
#include <conio.h>
void main()
{
    int list[20], size, i, sElement;
    clrscr();
    printf("Enter size of the list: ");
    scanf("%d", &size);
    printf("Enter %d integers: ", size);
    for (i = 0; i < size; i++)
        scanf("%d", &list[i]);
    printf("Enter element to search: ");
    scanf("%d", &sElement);
    // Linear Search Logic
    for (i = 0; i < size; i++)
    {
        if (sElement == list[i])
        {
            printf("Element is found at %d index", i);
            break;
        }
    }
    if (i == size)
        printf("Given element is not found in the list!!!");
    getch();
}
Example
Consider the following sorted list of elements and search element 4... (binary search requires the list to be sorted)
Binary Search Program in C Programming Language
#include<stdio.h>
#include<conio.h>
void main()
{
    int first, last, middle, size, i, sElement, list[100];
    clrscr();
    printf("Enter size of the sorted list: ");
    scanf("%d", &size);
    printf("Enter %d sorted integers: ", size);
    for (i = 0; i < size; i++)
        scanf("%d", &list[i]);
    printf("Enter element to search: ");
    scanf("%d", &sElement);
    first = 0;
    last = size - 1;
    // Binary Search Logic: repeatedly halve the search range
    while (first <= last)
    {
        middle = (first + last) / 2;
        if (list[middle] == sElement)
        {
            printf("Element found at index %d", middle);
            break;
        }
        else if (list[middle] < sElement)
            first = middle + 1;
        else
            last = middle - 1;
    }
    if (first > last)
        printf("Given element is not found in the list!!!");
    getch();
}
Sorting Techniques
Sorting is the process of arranging a list of elements in a particular order (Ascending or Descending).
Selection Sort
The Selection Sort algorithm arranges a list of elements in a particular order (ascending or descending). In selection sort, the first element of the list is selected and compared repeatedly with all the remaining elements. If any element is smaller than the selected element (for ascending order), the two are swapped. Then the element in the second position is selected and compared with all the elements after it, again swapping whenever a smaller element is found. This procedure is repeated until the entire list is sorted.
#include <stdio.h>
#include <conio.h>
int main()
{
    int array[100], n, c, d, swap;
    clrscr();
    printf("\nProgram For Selection Sort ");
    printf("\nEnter number of elements : ");
    scanf("%d", &n);
    printf("Enter %d integers : ", n);
    for (c = 0; c < n; c++)
        scanf("%d", &array[c]);
    // Selection Sort Logic
    for (c = 0; c < n - 1; c++)
    {
        for (d = c + 1; d < n; d++)
        {
            if (array[d] < array[c])   /* swap whenever a smaller element is found */
            {
                swap = array[c];
                array[c] = array[d];
                array[d] = swap;
            }
        }
    }
    printf("\nSorted list in ascending order :\n");
    for (c = 0; c < n; c++)
        printf("%d ", array[c]);
    printf("\n");
    getch();
    return 0;
}
Bubble Sorting
Bubble Sort is an algorithm used to sort N elements given in memory, for example an array with N elements. Bubble sort compares the elements one by one and sorts them based on their values.
It is called bubble sort because, with each iteration, the smallest element of the remaining list bubbles up towards the first place, just like a water bubble rises to the water surface.
Sorting takes place by stepping through all the data items one by one in pairs, comparing adjacent data items and swapping each pair that is out of order.
#include <stdio.h>
#include <conio.h>
void main( )
{
    int arr[5] = { 25, 17, 31, 13, 2 } ;
    int i, j, temp ;
    clrscr( ) ;
    // Bubble Sort Logic
    for (i = 0; i < 4; i++)            /* passes */
    {
        for (j = 0; j < 4 - i; j++)    /* compare adjacent pairs */
        {
            if (arr[j] > arr[j + 1])
            {
                temp = arr[j];
                arr[j] = arr[j + 1];
                arr[j + 1] = temp;
            }
        }
    }
    printf("Sorted array : ");
    for (i = 0; i < 5; i++)
        printf("%d ", arr[i]);
    getch( ) ;
}
As the above example shows, merge sort first breaks the unsorted list into sorted sublists, and then keeps merging these sublists to finally get the complete sorted list.
while(i <= q)
{
b[k++] = a[i++];
}
while(j <= r)
{
b[k++] = a[j++];
}
Radix sort
Radix sort was developed for sorting large integers, but it treats an integer as a string of digits, so it is
really a string sorting algorithm.
Radix sort is a non-comparative sorting algorithm that sorts data with keys by grouping keys by the
individual digits which share the same significant position and value.
Radix sort arranges the elements in order by examining the digits of the numbers, one significant position at a time.
#include <stdio.h>

void radixSort(int arr[], int n)
{
    int bucket[10][100], buck[10];
    int i, j, k, l, num = 0, div = 1, passes;
    int large = arr[0];

    for (i = 1; i < n; i++)          /* find the largest element */
        if (arr[i] > large)
            large = arr[i];

    while (large > 0)                /* count its digits */
    {
        num++;
        large = large / 10;
    }

    for (passes = 0; passes < num; passes++)
    {
        for (k = 0; k < 10; k++)
            buck[k] = 0;             /* reset bucket counts */

        for (i = 0; i < n; i++)      /* distribute by current digit */
        {
            l = (arr[i] / div) % 10;
            bucket[l][buck[l]++] = arr[i];
        }

        i = 0;                       /* collect back in order */
        for (k = 0; k < 10; k++)
            for (j = 0; j < buck[k]; j++)
                arr[i++] = bucket[k][j];

        div *= 10;
    }
}

int main()
{
    int arr[] = { 170, 45, 75, 90, 802, 24, 2, 66 };
    int i;

    radixSort(arr, 8);
    for (i = 0; i < 8; i++)
        printf("%d ", arr[i]);
    return 0;
}
Shell sort
Shell sort is a highly efficient sorting algorithm based on the insertion sort algorithm. It avoids the large shifts that insertion sort performs when a small value is far to the right and has to be moved far to the left.
This algorithm first uses insertion sort on widely spaced elements, then sorts the less widely spaced elements. This spacing is termed the interval. The interval is calculated based on Knuth's formula as −
Knuth's Formula
h = h * 3 + 1
where −
h is interval with initial value 1
This algorithm is quite efficient for medium-sized data sets; with Knuth's increments its worst-case complexity is about O(n^(3/2)), where n is the number of items.
How Shell Sort Works?
Let us consider the following example to get an idea of how shell sort works. We take the array {35, 33, 42, 10, 14, 19, 27, 44} used in our previous examples. For ease of understanding, we take an interval of 4. Make a virtual sub-list of all values located at an interval of 4 positions. Here these values are {35, 14}, {33, 19}, {42, 27} and {10, 44}.
We compare the values in each sub-list and swap them (if necessary) in the original array. After this step, the array becomes {14, 19, 27, 10, 35, 33, 42, 44}.
Then, we take an interval of 2, and this gap generates two sub-lists: {14, 27, 35, 42} and {19, 10, 33, 44}.
We compare and swap the values, if required, in the original array. After this step, the array becomes {14, 10, 27, 19, 35, 33, 42, 44}.
Finally, we sort the rest of the array using an interval of value 1. Shell sort uses insertion sort to sort the array.
Algorithm
Following is the algorithm for shell sort.
#include <stdio.h>

int main()
{
    int numbers[] = { 35, 33, 42, 10, 14, 19, 27, 44 };
    int array_size = 8;
    int i, j, temp, increment = 3;

    while (increment > 0)
    {
        /* gapped insertion sort with the current increment */
        for (i = 0; i < array_size; i++)
        {
            j = i;
            temp = numbers[i];
            while ((j >= increment) && (numbers[j - increment] > temp))
            {
                numbers[j] = numbers[j - increment];
                j = j - increment;
            }
            numbers[j] = temp;
        }
        /* shrink the increment, ending with a final gap-1 pass */
        if (increment / 2 != 0)
            increment = increment / 2;
        else if (increment == 1)
            increment = 0;
        else
            increment = 1;
    }

    for (i = 0; i < array_size; i++)
        printf("%d ", numbers[i]);
    return 0;
}
Hash Table
Direct-address table
If the keys are drawn from a reasonably small universe U = {0, 1, . . ., m-1} of keys, a solution is to use a table T[0 . . m-1], indexed by keys. To represent the dynamic set, we use an array, or direct-address table, denoted by T[0 . . m-1], in which each slot corresponds to a key in the universe. The following figure illustrates the approach.
Each key in the universe U (i.e., the collection) corresponds to an index in the table T[0 . . m-1]. Using this approach, all three basic dictionary operations (insert, delete, search) take θ(1) time in the worst case.
Hash Tables
When the size of the universe is much larger the same approach (direct address table) could still work
in principle, but the size of the table would make it impractical. A solution is to map the keys onto a
small range, using a function called a hash function. The resulting data structure is called hash table.
With direct addressing, an element with key k is stored in slot k. With hashing, this same element is stored in slot h(k); that is, we use a hash function h to compute the slot from the key. The hash function maps the universe U of keys into the slots of a hash table T[0 . . m-1].
h: U → {0, 1, . . ., m-1}
More formally, suppose we want to store a set of size n in a table of size m. The ratio α = n/m is called the load factor, that is, the average number of elements stored per slot of the table. Assume we have a hash function h that maps each key k ∈ U to an integer h(k) ∈ [0 . . m-1]. The basic idea is to store key k in location T[h(k)].
Typically, hash functions generate "random looking" values. For example, the following function usually works well:
h(k) = k mod m where m is a prime number.
What is the point of the hash function? It reduces the range of array indices that need to be handled.
Collision
As keys are inserted in the table, it is possible that two keys hash to the same table slot. If the hash function distributes the elements uniformly over the table, the number of collisions cannot be too large on average, but the birthday paradox makes it very likely that there will be at least one collision, even for a lightly loaded table.
A hash function h map the keys k and j to the same slot, so they collide.
There are two basic methods for handling collisions in a hash table: chaining and open addressing.
In chaining, each slot T[j] contains a linked list of all the keys whose hash value is j. For example, h(k1) = h(kn) and h(k5) = h(k2) = h(k7).
The worst case running time for insertion is O(1).
Deletion of an element x can be accomplished in O(1) time if the lists are doubly linked.
In the worst-case behavior of chained hashing, all n keys hash to the same slot, creating a list of length n. The worst-case time for search is thus θ(n) plus the time to compute the hash function.
A good hash function satisfies the assumption of simple uniform hashing: each element is equally likely to hash into any of the m slots, independently of where any other element has hashed to. Usually, however, it is not possible to check this condition, because one rarely knows the probability distribution according to which the keys are drawn.
In practice, we use heuristic techniques to create a hash function that performs well. One good approach is to derive the hash value in a way that is expected to be independent of any patterns that might exist in the data (the division method).
Most hash functions assume that the universe of keys is the set of natural numbers. Thus, if the keys are not natural numbers, a way must be found to interpret them as natural numbers.
2. The Multiplication Method
This is a two-step process:
Step 1: Multiply the key k by a constant 0 < A < 1 and extract the fractional part of kA.
Step 2: Multiply this fractional part by m and take the floor of the result.
The hash function using multiplication method is:
h(k) = ⌊m(kA mod 1)⌋
where "kA mod 1" means the fractional part of kA, that is, kA − ⌊kA⌋.
Advantage of this method is that the value of m is not critical and can be implemented on most
computers.
A reasonable value of the constant is
A ≈ (√5 − 1)/2 = 0.6180339887 . . .
as suggested by Knuth in The Art of Computer Programming.
3. Universal Hashing
Open Addressing
This is another way to deal with collisions.
In this technique all elements are stored in the hash table itself. That is, each table entry contains either an element or NIL. When searching for an element (or an empty slot), we systematically examine slots until we find it. There are no lists and no elements stored outside the table, which implies that the table can completely "fill up": the load factor α can never exceed 1.
The advantage of this technique is that it avoids pointers (pointers need space too). Instead of chasing pointers, we compute the sequence of slots to be examined. To perform insertion, we successively examine, or probe, the hash table until we find an empty slot. The sequence of slots probed depends upon the key being inserted. To determine which slots to probe, the hash function takes the probe number as a second input. Thus, the hash function becomes
h : U × {0, 1, . . ., m-1} → {0, 1, . . ., m-1}