DS Unit-1
……………………………………………………………………………………………………………….
Introduction :
A data structure is an arrangement of data either in the computer's memory or on disk storage.
Some common examples of data structures are arrays, linked lists, queues, stacks, binary trees, graphs.
Data structures are widely applied in areas like:
Compiler design
Operating system
Statistical analysis package
DBMS
Numerical analysis
Primitive data structures are the fundamental data types which are supported by a programming
language. Some basic data types are integer, real, and boolean. The terms ‘data type’, ‘basic data type’,
and ‘primitive data type’ are often used interchangeably.
Non-primitive data structures are those data structures which are created using primitive data structures.
Examples of such data structures include linked lists, stacks, trees, and graphs.
Non-primitive data structures can further be classified into two categories: linear and non-linear data
structures.
If the elements of a data structure are stored in a linear or sequential order, then it is a linear data
structure. Examples are arrays, linked lists, stacks, and queues.
If the elements of a data structure are not stored in a sequential order, then it is a non-linear data
structure. Examples are trees and graphs.
ARRAYS:
An array is a collection of similar data elements stored at adjacent memory locations, any of which can be accessed directly using an index.
LINKED LISTS:
A linked list is a collection of elements called nodes, where each node stores the data together with the address of the next node in the list.
STACK:
A stack is a linear data structure in which insertion and deletion of elements are done at only one
end, which is known as the top of the stack.
Stack is called a last-in, first-out (LIFO) structure because the last element which is added to the stack is the first element which is deleted from the stack.
Stacks can be implemented using arrays or linked lists.
Every stack has a variable top associated with it. Top is used to store the address of the topmost element of the stack. It is this position from where elements are added or deleted.
There is another variable MAX, which is used to store the maximum number of elements that the stack can hold.
If top = NULL, then it indicates that the stack is empty, and if top = MAX–1, then the stack is full.
A stack supports three basic operations: push, pop, and peep. The push operation adds an element
to the top of the stack. The pop operation removes the element from the top of the stack. And the
peep operation returns the value of the topmost element of the stack (without deleting it).
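These three operations can be sketched in C with a simple array implementation (a minimal sketch: MAX, the names, and the error messages are illustrative, and top = -1 plays the role of top = NULL for the empty stack):

#include <stdio.h>
#define MAX 10                 /* illustrative capacity */

int stack[MAX];
int top = -1;                  /* -1 marks an empty stack */

void push(int x) {
    if (top == MAX - 1) { printf("Overflow\n"); return; }  /* stack full */
    stack[++top] = x;          /* add the element at the top */
}

int pop(void) {
    if (top == -1) { printf("Underflow\n"); return -1; }   /* stack empty */
    return stack[top--];       /* remove and return the topmost element */
}

int peep(void) {
    return stack[top];         /* return the topmost element without deleting it */
}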
Queues:
A queue is a linear data structure in which insertion is done at the rear end and deletion of elements is done at the front end.
A queue is a first-in, first-out (FIFO) data structure in which the element that is inserted first is the first one to be taken out.
Like stacks, queues can be implemented by using either arrays or linked lists.
A queue is full when rear = MAX – 1. An underflow condition occurs when we try to delete an element from a queue that is already empty; if front = NULL and rear = NULL, then there is no element in the queue.
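A matching array-based sketch in C (again illustrative; a simple linear, non-circular queue is assumed):

#include <stdio.h>
#define MAX 10                 /* illustrative capacity */

int queue[MAX];
int front = -1, rear = -1;     /* -1 marks an empty queue */

void enqueue(int x) {
    if (rear == MAX - 1) { printf("Overflow\n"); return; } /* queue full */
    if (front == -1) front = 0;                            /* first insertion */
    queue[++rear] = x;         /* insert at the rear end */
}

int dequeue(void) {
    if (front == -1 || front > rear) { printf("Underflow\n"); return -1; }
    return queue[front++];     /* delete from the front end */
}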
Trees:
A tree is a non-linear data structure which consists of a collection of nodes arranged in a
hierarchical order.
One of the nodes is designated as the root node, and the remaining nodes can be partitioned into disjoint sets such that each set is a sub-tree of the root.
The simplest form of a tree is a binary tree. A binary tree consists of a root node and left and right sub-trees, where both sub-trees are also binary trees.
Each node contains a data element, a left pointer which points to the left sub-tree, and a right pointer which points to the right sub-tree.
The root element is the topmost node and is pointed to by a 'root' pointer. If root = NULL, then the tree is empty.
Here R is the root node and T1 and T2 are the left and right subtrees of R. If T1 is non-empty,
then T1 is said to be the left successor of R. Likewise, if T2 is non-empty, then it is called the
right successor of R.
Advantage: Provides quick search, insert, and delete operations
Disadvantage: Complicated deletion algorithm
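The node described above can be sketched in C as follows (the struct and field names are illustrative):

#include <stddef.h>            /* for NULL */

struct node {
    int data;                  /* data element */
    struct node *left;         /* pointer to the left sub-tree */
    struct node *right;        /* pointer to the right sub-tree */
};

struct node *root = NULL;      /* root = NULL means the tree is empty */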
Graphs:
A graph is a non-linear data structure which is a collection of vertices (also called nodes) and
edges that connect these vertices.
A node in the graph may represent a city and the edges connecting
the nodes can represent roads.
A graph can also be used to represent a computer network where
the nodes are workstations and the edges are the network
connections.
Graphs do not have any root node. Rather, any node in the graph can be connected to any other node in the graph.
Advantage: Best models real-world situations
Disadvantage: Some algorithms are slow and very complex
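One common way to store such a graph is an adjacency matrix; a small C sketch (the vertex count V and the names are illustrative):

#define V 4

int adj[V][V];                 /* adj[i][j] = 1 if an edge connects vertices i and j */

void add_edge(int i, int j) {
    adj[i][j] = 1;
    adj[j][i] = 1;             /* undirected graph: the edge runs both ways */
}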
This section discusses the different operations that can be performed on the various data
structures previously mentioned.
Traversing It means to access each data item exactly once so that it can be processed. For example,
to print the names of all the students in a class.
Searching It is used to find the location of one or more data items that satisfy the given constraint.
Such a data item may or may not be present in the given collection of data items. For example, to
find the names of all the students who secured 100 marks in mathematics.
Inserting It is used to add new data items to the given list of data items. For example, to add the
details of a new student who has recently joined the course.
Deleting It means to remove (delete) a particular data item from the given collection of data items.
For example, to delete the name of a student who has left the course.
Sorting Data items can be arranged in some order like ascending order or descending order
depending on the type of application. For example, arranging the names of students in a class in
an alphabetical order, or calculating the top three winners by arranging the participants’ scores in
descending order and then extracting the top three.
Merging Two sorted lists of data items can be combined to form a single list of sorted data items.
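As a small illustration of the first two operations in C (the marks array and the constraint are hypothetical examples):

#include <stdio.h>

int main(void) {
    int marks[] = {70, 100, 85, 100};
    int n = sizeof(marks) / sizeof(marks[0]);

    /* Traversing: access each data item exactly once */
    for (int i = 0; i < n; i++)
        printf("%d ", marks[i]);
    printf("\n");

    /* Searching: find the locations of the items satisfying a constraint */
    for (int i = 0; i < n; i++)
        if (marks[i] == 100)
            printf("student at index %d scored 100\n", i);
    return 0;
}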
Advantages of ADT
• Encapsulation: ADTs provide a way to encapsulate data and operations into a
single unit, making it easier to manage and modify the data structure.
• Abstraction: ADTs allow users to work with data structures without having to
know the implementation details, which can simplify programming and reduce
errors.
• Data Structure Independence: ADTs can be implemented using different data
structures, which can make it easier to adapt to changing needs and requirements.
• Information Hiding: ADTs can protect the integrity of data by controlling
access and preventing unauthorized modifications.
• Modularity: ADTs can be combined with other ADTs to form more complex
data structures, which can increase flexibility and modularity in programming.
Disadvantages of ADT
• Overhead: Implementing ADTs can add overhead in terms of memory and
processing, which can affect performance.
• Complexity: ADTs can be complex to implement, especially for large and
complex data structures.
• Learning Curve: Using ADTs requires knowledge of their implementation and
usage, which can take time and effort to learn.
• Limited Flexibility: Some ADTs may be limited in their functionality or may
not be suitable for all types of data structures.
• Cost: Implementing ADTs may require additional resources and investment,
which can increase the cost of development
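As a small illustration of abstraction and information hiding, a stack ADT can expose only its operations while hiding the representation (a hypothetical interface sketch; the names are not from any particular library):

/* stack.h - users see only the operations, not whether an
   array or a linked list is used underneath. */
typedef struct Stack Stack;    /* opaque type: representation hidden */

Stack *stack_create(void);
void   stack_push(Stack *s, int x);
int    stack_pop(Stack *s);
int    stack_peep(const Stack *s);
void   stack_destroy(Stack *s);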
OVERVIEW OF TIME AND SPACE COMPLEXITY:
Time Complexity:
The amount of time required for an algorithm to complete its execution is its time complexity.
An algorithm is said to be efficient if it takes the minimum (reasonable) amount of time to
complete its execution.
The number of steps assigned to any program statement depends on the kind of statement. For example, comments count as 0 steps.
1. In the first method, we introduce a variable count into the program, with initial value 0, and insert statements that increment count by the appropriate amount. This is done so that each time a statement in the original program executes, count is incremented by the step count of that statement.
Algorithm:
Algorithm sum(a, n)
{
    s = 0.0;
    count = count + 1;
    for i = 1 to n do
    {
        count = count + 1;
        s = s + a[i];
        count = count + 1;
    }
    count = count + 1;
    count = count + 1;
    return s;
}
If count is zero to start with, then it will be 2n+3 on termination. So each invocation of sum executes a total of 2n+3 steps.
2. The second method to determine the step count of an algorithm is to build a table in which we list the total number of steps contributed by each statement.
First determine the number of steps per execution (s/e) of each statement and the total number of times (i.e., the frequency) each statement is executed.
By combining these two quantities, the total contribution of each statement, and hence the step count for the entire algorithm, is obtained.
Statement               s/e   Frequency   Total
1. Algorithm sum(a,n)    0        -          0
2. {                     0        -          0
3. s = 0.0;              1        1          1
4. for i = 1 to n do     1       n+1        n+1
5. s = s + a[i];         1        n          n
6. return s;             1        1          1
7. }                     0        -          0
Total                                      2n+3
Space Complexity:
The amount of space occupied by an algorithm is known as its space complexity. An algorithm is said to be efficient if it occupies less space and requires the minimum amount of time to complete its execution.
Fixed part:
It is fixed for a given program and independent of the instance being solved. It includes the space needed for storing instructions, constants, and simple variables.
Variable part:
It varies from instance to instance of the problem. It includes the space needed for the recursion stack, and for structured variables (like arrays and structures) that are allocated space dynamically during the runtime of a program.
The space requirement S(P) of any algorithm P may therefore be written as
S(P) = c + Sp(instance characteristics),
where c is a constant denoting the fixed part and Sp the variable part.
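As a worked example, for the sum algorithm given earlier (assuming one word per simple variable and counting the array), the array a needs n words and the variables s, i, and n need one word each, so Sp(n) = n + 3 and S(sum) = c + (n + 3).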
Internal Sorting: When all the data to be sorted can be accommodated at one time in the main memory (usually RAM). Internal sorting has five different classifications: insertion, selection, exchanging, merging, and distribution sorts.
External Sorting: When all the data to be sorted cannot be accommodated in the main memory (usually RAM) at the same time, and some of it has to be kept in auxiliary memory such as a hard disk, floppy disk, or magnetic tape.
Ex: Natural, Balanced, and Polyphase merge sorts.
BUBBLE SORT:
Bubble sort works by repeatedly swapping adjacent elements until they are in the intended order. It is called bubble sort because the movement of array elements is just like the movement of air bubbles in water. Bubbles in water rise up to the surface; similarly, the larger array elements in bubble sort move to the end in each iteration.
Although it is simple to use, it is primarily used as an educational tool because the performance of bubble sort is poor in the real world. It is not suitable for large data sets. The average and worst-case complexity of bubble sort is O(n²), where n is the number of items.
Algorithm
In the algorithm given below, suppose arr is an array of n elements. The
assumed swap function in the algorithm will swap the values of given array elements.
begin BubbleSort(arr, n)
    for i = 0 to n-2
        for j = 0 to n-2-i
            if arr[j] > arr[j+1]
                swap(arr[j], arr[j+1])
            end if
        end for
    end for
    return arr
end BubbleSort
To understand the working of the bubble sort algorithm, let's take the unsorted array {13, 32, 26, 35, 10}. We keep the array short because, with its O(n²) complexity, bubble sort is tedious to trace on large inputs.
First Pass
Sorting starts with the first two elements; we compare them to check which is greater.
Here, 32 is greater than 13 (32 > 13), so the pair is already in order. Now, compare 32 with 26.
Here, 26 is smaller than 32, so swapping is required. After swapping, the array becomes {13, 26, 32, 35, 10}.
Here, 35 is greater than 32, so no swapping is required as they are already in order.
Here, 10 is smaller than 35, so they are not in order and swapping is required. We have now reached the end of the array. After the first pass, the array is {13, 26, 32, 10, 35}.
Now, move to the second iteration.
Second Pass
The same process is followed for the second iteration.
Here, 10 is smaller than 32, so swapping is required. After swapping, the array is {13, 26, 10, 32, 35}.
Third Pass
The same process is followed for the third iteration.
Here, 10 is smaller than 26, so swapping is required. After swapping, the array is {13, 10, 26, 32, 35}.
Fourth Pass
Similarly, in the fourth iteration 10 is swapped with 13, and the array becomes {10, 13, 26, 32, 35}.
In the next pass no swapping is required, so the array is completely sorted.
o Best Case Complexity - It occurs when no sorting is required, i.e. the array is already sorted. The best-case time complexity of bubble sort is O(n).
o Average Case Complexity - It occurs when the array elements are in a jumbled order that is neither properly ascending nor properly descending. The average-case time complexity of bubble sort is O(n²).
o Worst Case Complexity - It occurs when the array elements have to be sorted in reverse order. That means, suppose you have to sort the array elements in ascending order, but the elements are in descending order. The worst-case time complexity of bubble sort is O(n²).
2. Space Complexity

Space Complexity   O(1)
Stable             YES

o The space complexity of bubble sort is O(1), because only a single extra variable is required for swapping.
o The space complexity of optimized bubble sort is also O(1): it uses just one additional flag variable to detect whether a pass made any swap, and a constant number of extra variables is still O(1).
Example 2:
We take the unsorted array {14, 33, 27, 35, 10} for our example.
Bubble sort starts with the very first two elements, comparing them to check which one is greater.
In this case, value 33 is greater than 14, so it is already in the sorted location. Next, we compare 33 with 27. We find that 27 is smaller than 33, and these two values must be swapped, giving {14, 27, 33, 35, 10}.
Next we compare 33 and 35. We find that both are already in sorted positions.
Then we move to the next two values, 35 and 10. We know that 10 is smaller than 35, so we swap these values. We find that we have reached the end of the array. After one iteration, the array looks like this: {14, 27, 33, 10, 35}.
To be precise, we are now showing how the array should look after each iteration. After the second iteration, it looks like this: {14, 27, 10, 33, 35}.
Notice that after each iteration, at least one value moves to the end.
And when no swap is required, bubble sort learns that the array is completely sorted.
#include <stdio.h>

/* Sort arr[0..n-1] in ascending order by repeatedly
   swapping adjacent elements that are out of order. */
void bubble_sort(int arr[], int n) {
    for (int i = 0; i < n - 1; i++)           /* one pass per iteration */
        for (int j = 0; j < n - 1 - i; j++)   /* compare adjacent pairs */
            if (arr[j] > arr[j + 1]) {
                int temp = arr[j];
                arr[j] = arr[j + 1];
                arr[j + 1] = temp;
            }
}

int main() {
    int arr[] = {5, 3, 8, 4, 2};
    int n = sizeof(arr) / sizeof(arr[0]);
    bubble_sort(arr, n);
    for (int i = 0; i < n; i++)
        printf(i ? ",%d" : "%d", arr[i]);     /* print the sorted array */
    return 0;
}
Output:
2,3,4,5,8
INSERTION SORT:
Insertion Sort is a simple sorting algorithm that builds the final sorted array one item at a time. It
works much like the way you might sort playing cards in your hands—by inserting each card into
its correct position relative to the cards already sorted.
In insertion sort the list is divided into two parts: a sorted list and an unsorted list. In each pass, the first element of the unsorted list is transferred to the sorted list by inserting it at the appropriate position.
The similarity can be understood from the way we arrange a deck of cards. This sort works on the principle of inserting an element at a particular position, hence the name insertion sort.
Following are the steps involved in insertion sort:
1. We start by taking the second element of the given array, i.e. the element at index 1, as the key. The key element here is the new card that we need to add to our existing sorted set of cards.
2. We compare the key element with the element(s) before it, in this case, element at index
0:
o If the key element is less than the first element, we insert the key element
before the first element.
o If the key element is greater than the first element, then we insert it after the first
element.
3. Then, we make the third element of the array the key and compare it with the elements to its left, inserting it at the proper position.
4. And we go on repeating this, until the array is sorted.
How It Works:
1. The algorithm starts with the second element (since a single-element array is trivially sorted).
2. It compares the current element with the elements in the sorted part of the array (to the left).
3. It shifts the elements of the sorted sublist to the right to make space for the current element.
4. It inserts the current element into its correct position.
5. The process repeats until the entire array is sorted.
Algorithm Steps:
insertionSort(arr):
    for i = 1 to length(arr) - 1:
        key = arr[i]            // element to be inserted into the sorted part
        j = i - 1
        // Shift elements of the sorted sublist that are greater than the key
        while j >= 0 and arr[j] > key:
            arr[j + 1] = arr[j] // Shift element to the right
            j = j - 1
        // Insert the key into the correct position
        arr[j + 1] = key
Complexity of the Insertion Sort Algorithm
To sort an unsorted list with 'n' elements, we need to make (1+2+3+......+(n-1)) = (n(n-1))/2 comparisons in the worst case. If the list is already sorted, then only (n-1) comparisons are required.
Worst Case: O(n²)
Best Case: Ω(n)
Average Case: Θ(n²)
#include <stdio.h>

/* Insert each element into its correct position
   among the already-sorted elements to its left. */
void insertionSort(int arr[], int n) {
    for (int i = 1; i < n; i++) {
        int key = arr[i];          /* element to insert */
        int j = i - 1;
        while (j >= 0 && arr[j] > key) {
            arr[j + 1] = arr[j];   /* shift element to the right */
            j--;
        }
        arr[j + 1] = key;          /* insert the key */
    }
}

int main() {
    int arr[] = {12, 11, 13, 5, 6};   /* input reconstructed to match the output below */
    int n = sizeof(arr) / sizeof(arr[0]);
    insertionSort(arr, n);
    for (int i = 0; i < n; i++)
        printf("%d ", arr[i]);
    return 0;
}
Output:
5 6 11 12 13
SELECTION SORT:
Given a list of data to be sorted, we simply select the smallest item and place it in a sorted list. These steps are then repeated until we have sorted all of the data.
In the first step, the smallest element is searched for in the list; once it is found, it is exchanged with the element in the first position.
Now the list is divided into two parts: one is the sorted list, the other is the unsorted list. Find the smallest element in the unsorted list and exchange it with the element at the starting position of the unsorted list; after that, it is added to the sorted list.
This process is repeated until all the elements are sorted, much like sorting a list by hand on paper.
Step 1 - Select the first element of the list (i.e., the element at the first position in the list).
Step 2 - Compare the selected element with all the other elements in the list.
Step 3 - In every comparison, if any element is found smaller than the selected element (for ascending order), then both are swapped.
Step 4 - Repeat the same procedure with the element in the next position in the list till the entire list is sorted.
Algorithm:
Selection Sort Algorithm (With SMALLEST subroutine)
This algorithm works by repeatedly finding the smallest element from the unsorted part of the
array and swapping it with the first unsorted element.
1. Initialization: You begin by iterating from K = 1 to N-1. This means you start at the first
element and move towards the last element.
2. CALL SMALLEST: For each K, you call the SMALLEST function to find the smallest element
from index K to N.
3. SWAP: After finding the smallest element (at Loc), you swap it with the element at index K.
4. Exit: Once all the elements are processed, the array is sorted.
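Written in the same step notation as the SMALLEST subroutine below, the main routine can be sketched as follows (a reconstruction from the description above):

SELECTIONSORT(ARR, N)
Step 1: Repeat Steps 2 and 3 for K = 1 to N-1
Step 2:     CALL SMALLEST(ARR, K, N, Loc)
Step 3:     SWAP ARR[K] and ARR[Loc]
        [END OF LOOP]
Step 4: EXIT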
SMALLEST Algorithm
This subroutine finds the smallest element in the unsorted part of the array, starting from index K.
SMALLEST(ARR, K, N, Loc)
Step 1: [INITIALIZE] SET Min = ARR[K]
Step 2: [INITIALIZE] SET Loc = K
Step 3: Repeat for J = K+1 to N
IF Min > ARR[J]
SET Min = ARR[J]
SET Loc = J
[END OF IF]
[END OF LOOP]
Step 4: RETURN Loc
Steps in Detail:
1. Initialization: Set Min to the value at ARR[K] and Loc to K (the initial position of the smallest
element).
2. Find Minimum: Iterate from J = K+1 to N and compare each element with Min. If an element is
smaller than Min, update Min and set Loc to the new index J.
3. Return: Once the loop completes, return the location Loc of the smallest element found.
Final Flow:
1. The main algorithm (SELECTION SORT) iterates through the array and calls SMALLEST to find the
minimum element in the unsorted part.
2. Once the smallest element is found, it's swapped with the element at position K (i.e., the first
unsorted element).
3. The process repeats until the entire array is sorted.
Example: Consider the elements 23, 78, 45, 8, 32, 56.
Pass 1: the smallest element is 8; swap it with 23 → 8, 78, 45, 23, 32, 56
Pass 2: the smallest of the remaining elements is 23; swap it with 78 → 8, 23, 45, 78, 32, 56
Pass 3: the smallest of the remaining elements is 32; swap it with 45 → 8, 23, 32, 78, 45, 56
Pass 4: the smallest of the remaining elements is 45; swap it with 78 → 8, 23, 32, 45, 78, 56
Pass 5: the smallest of the remaining elements is 56; swap it with 78 → 8, 23, 32, 45, 56, 78
The list is now sorted.
……………………………………………………………………………………………….
EXAMPLE PROGRAM FOR Selection Sort:
#include <stdio.h>

/* Repeatedly select the smallest element from the
   unsorted part and swap it into place. */
void selectionSort(int arr[], int n) {
    for (int i = 0; i < n - 1; i++) {
        int min = i;                      /* index of current minimum */
        for (int j = i + 1; j < n; j++)
            if (arr[j] < arr[min])
                min = j;
        int temp = arr[i];                /* swap minimum into position i */
        arr[i] = arr[min];
        arr[min] = temp;
    }
}

int main() {
    int arr[] = {64, 25, 12, 22, 11};
    int n = sizeof(arr) / sizeof(arr[0]);
    selectionSort(arr, n);
    for (int i = 0; i < n; i++)
        printf("%d ", arr[i]);
    return 0;
}
Output:
11 12 22 25 64
MERGE SORT:
Merge sort is a divide-and-conquer sorting technique: it divides the list into two halves, recursively sorts each half, and then merges the two sorted halves into a single sorted list.
Merge Algorithm (merging two sorted arrays A and B into array C):
Step 1: Set i, j, k = 0
Step 2: Repeat while both A and B have elements left:
            if A[i] < B[j] then
                copy A[i] to C[k] and increment i and k
            else
                copy B[j] to C[k] and increment j and k
Step 3: Copy the remaining elements of either A or B into array C.
MergeSort Algorithm:
MergeSort(A, lb, ub)
{
    if lb < ub
    {
        mid = floor((lb + ub) / 2);
        MergeSort(A, lb, mid);
        MergeSort(A, mid + 1, ub);
        Merge(A, lb, mid, ub);
    }
}
Program :
#include <stdio.h>
// Function to merge two subarrays
void merge(int arr[], int left, int mid, int right) {
int n1 = mid - left + 1;
int n2 = right - mid;
// Temporary arrays
int leftArr[n1], rightArr[n2];
// Copy data into temporary arrays
for (int i = 0; i < n1; i++) leftArr[i] = arr[left + i];
for (int i = 0; i < n2; i++) rightArr[i] = arr[mid + 1 + i];
// Merge the temporary arrays back into the original array
int i = 0, j = 0, k = left;
while (i < n1 && j < n2) {
if (leftArr[i] <= rightArr[j]) arr[k++] = leftArr[i++];
else arr[k++] = rightArr[j++];
}
// Copy remaining elements of leftArr, if any
while (i < n1) arr[k++] = leftArr[i++];
// Copy remaining elements of rightArr, if any
while (j < n2) arr[k++] = rightArr[j++];
}
// Function to implement merge sort
void mergeSort(int arr[], int left, int right) {
    if (left < right) {
        int mid = left + (right - left) / 2;
        mergeSort(arr, left, mid);        // sort the left half
        mergeSort(arr, mid + 1, right);   // sort the right half
        merge(arr, left, mid, right);     // merge the two sorted halves
    }
}

int main() {
    int arr[] = {12, 11, 13, 5, 6, 7};    /* illustrative input */
    int n = sizeof(arr) / sizeof(arr[0]);
    mergeSort(arr, 0, n - 1);
    for (int i = 0; i < n; i++)
        printf("%d ", arr[i]);            /* prints 5 6 7 11 12 13 */
    return 0;
}
Quick Sort:
Quick sort is a fast sorting algorithm used to sort a list of elements. The quick sort algorithm was invented by C. A. R. Hoare.
The quick sort algorithm attempts to separate the list of elements into two parts and then sort each part recursively. That means it uses a divide and conquer strategy. In quick sort, the partition of the list is performed based on an element called the pivot. Here the pivot element is one of the elements in the list.
The list is divided into two partitions such that "all elements to the left of pivot are smaller than the
pivot and all elements to the right of pivot are greater than or equal to the pivot".
Step 1 - Consider the first element of the list as pivot (i.e., Element at first
position in the list).
Step 2 - Define two variables i and j. Set i and j to first and last elements of the
list respectively.
Step 3 - Increment i until list[i] > pivot then stop.
Step 4 - Decrement j until list[j] < pivot then stop.
Step 5 - If i < j then exchange list[i] and list[j].
Step 6 - Repeat steps 3,4 & 5 until i > j.
Step 7 - Exchange the pivot element with list[j] element.
Following is the sample code for Quick sort...
#include<stdio.h>
void quick_sort(int a[],int low,int high);
int main()
{
int i,n,a[30];
printf("enter the size of array:");
scanf("%d",&n);
printf("enter the values of array:");
for(i=0;i<n;i++)
{
scanf("%d",&a[i]);
}
quick_sort(a,0,n-1);
for(i=0;i<n;i++)
{
printf("\n%d",a[i]);
}
}
void quick_sort(int a[30],int low,int high)
{
int i,j,pivot,temp;
if(low<high)
{
pivot=low;
i=low;
j=high;
while(i<j)
{
while(a[i]<=a[pivot] && i<high)
i++;
while(a[j]>a[pivot])
j--;
if(i<j)
{
temp=a[i];
a[i]=a[j];
a[j]=temp;
}
}
temp=a[pivot];
a[pivot]=a[j];
a[j]=temp;
quick_sort(a,low,j-1);
quick_sort(a,j+1,high);
}
}
OUTPUT:
enter the size of array: 6
enter the values of array: 23 32 54 76 98 65
23
32
54
65
76
98
Complexity of the Quick Sort Algorithm
To sort an unsorted list with 'n' elements, we need to make ((n-1)+(n-2)+(n-3)+......+1) = (n(n-1))/2 comparisons in the worst case. With the first element chosen as the pivot, this worst case occurs when the list is already sorted.
Worst Case: O(n²)
Best Case: O(n log n)
Average Case: O(n log n)
Time Complexities of All the Searching & Sorting Techniques:

Technique          Best Case     Average Case   Worst Case    Space
Linear Search      O(1)          O(n)           O(n)          O(1)
Binary Search      O(1)          O(log n)       O(log n)      O(1)
Bubble Sort        O(n)          O(n²)          O(n²)         O(1)
Insertion Sort     O(n)          O(n²)          O(n²)         O(1)
Selection Sort     O(n²)         O(n²)          O(n²)         O(1)
Merge Sort         O(n log n)    O(n log n)     O(n log n)    O(n)
Quick Sort         O(n log n)    O(n log n)     O(n²)         O(log n)