Dsa-Module 1 Lecture Notes
Data type: Data type specifies the type of data stored in a variable.
Example: Common data types are integers, floats, strings and characters.
Built-in data type: Every programming language provides a set of data types called
built-in data types.
Example: Pascal: integer, real, char, etc.
C: int, float
Abstract Data type (ADT): An abstract data type is defined as a mathematical model of
the data objects that make up a data type as well as the functions that operate on these
objects.
Example: Lists, stacks and graphs are examples of ADT along with their operations.
Data structures are classified into two types: They are Primitive and non-primitive Data
structures.
1.2.1 Primitive Data structure:
The primitive data types are the basic data types that are available in most of the programming
languages. The primitive data types are used to represent single values.
Integer: This is used to represent a number without decimal point.
Eg: 12, 90
Float and Double: This is used to represent a number with decimal point.
Eg: 45.1, 67.3
Character : This is used to represent single character
Eg: ‘C’, ‘a’
String: This is used to represent group of characters.
Eg: "M.S.P.V.L Polytechnic College"
Boolean: This is used to represent logical values, either true or false.
b) Linked list: A linked list is a way to store a collection of elements. Each element in a
linked list is stored in the form of a node. A data part stores the element and a next part
stores the link to the next node.
Linked List:
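The node described above, with a data part and a next part, can be sketched in C as follows. This is a minimal sketch, not a definitive implementation: the names node, push_front and list_length are assumptions chosen for illustration, and the element type is assumed to be int.

```c
#include <stdlib.h>

/* One node of a singly linked list: a data part and a next part. */
struct node {
    int data;           /* the stored element (int is an assumption) */
    struct node *next;  /* link to the next node, NULL at the end */
};

/* Allocate a node holding `value` and place it in front of `head`.
   Returns the new head, or the old head unchanged if allocation fails. */
struct node *push_front(struct node *head, int value)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return head;
    n->data = value;
    n->next = head;
    return n;
}

/* Count the nodes by following the next links until NULL. */
int list_length(const struct node *head)
{
    int length = 0;
    while (head != NULL) {
        length++;
        head = head->next;
    }
    return length;
}
```

Starting from an empty list (head = NULL), each call to push_front adds one node at the front, so the last element pushed is the first one reached.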
c) Stack: A stack is a linear data structure which follows a particular order in which the operations are
performed. The order is LIFO (Last In First Out), which is also described as FILO (First In Last Out).
f) Graphs: Graphs are used to represent networks. Graph is a data structure that consists of
following two components:
1. A finite set of vertices, also called nodes.
2. A finite set of ordered pairs of the form (u, v), called edges. The pair (u, v)
indicates that there is an edge from vertex u to vertex v.
IMPLEMENTATION OF DATA STRUCTURES
ARRAY IMPLEMENTATION
ALGORITHM SPECIFICATIONS
ALGORITHM:
Definiteness: Each step of an algorithm must be precisely defined, meaning the step should
perform a clearly defined task without ambiguity.
Effectiveness: The efficiency of the steps and the accuracy of the output determine the
effectiveness of the algorithm.
An algorithm is generally expressed as a flow chart or as an informal high-level description called
pseudocode. An algorithm can be defined as “a sequence of steps to be performed for getting the
desired output for a given input.”
In general, the steps in an algorithm can be divided into three basic categories as:
a) Sequence algorithm
b) Selection algorithm
c) Iteration algorithm
a) Sequence algorithm:
A sequence algorithm is a series of steps in sequential order without any break. Here,
instructions are executed from top to bottom without any disturbances.
b) Selection algorithm:
An algorithm whose steps are chosen by checking a condition is called a selection
algorithm. Selection algorithms are designed using selection control statements
such as IF, IF-ELSE, Nested IF-ELSE, ELSE-IF Ladder and SWITCH statements.
c) Iteration algorithm:
An algorithm that repeatedly processes the same statements until a specified condition
becomes false is called an iteration algorithm.
Iteration algorithms are designed using iterative control statements such as WHILE, DO-WHILE
and FOR statements.
Example: Algorithm for reverse of a given number
Step 1: START
Step 2: READ n value
Step 3: rev ← 0
Step 4: Repeat WHILE n > 0
    k ← n MOD 10
    rev ← rev * 10 + k
    n ← n / 10 (integer division)
EndRepeat
Step 5: WRITE rev
Step 6: STOP
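The steps of the reverse algorithm can be sketched in C as below. This is a minimal sketch under some assumptions: the function name reverse_number is chosen for illustration, the input is a non-negative int, the last digit is taken with n % 10, and overflow of the result is not checked.

```c
/* Reverse the decimal digits of a non-negative integer,
   for example 1234 becomes 4321. Overflow is not checked. */
int reverse_number(int n)
{
    int rev = 0;
    while (n > 0) {
        int k = n % 10;      /* k is the last digit of n */
        rev = rev * 10 + k;  /* append that digit to the result */
        n = n / 10;          /* integer division drops the last digit */
    }
    return rev;
}
```

Note that trailing zeros of the input disappear in the result, since the reversed value is stored as a number and not as text.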
ANALYSIS OF AN ALGORITHM
Analysis of algorithms (or performance analysis) refers to the task of determining how much
computing time (time complexity) and storage (space complexity) an algorithm requires.
Algorithm efficiency describes the properties of an algorithm which relates to the amount
of resources used. An algorithm must be analyzed to determine its resources usage.
The time complexity of an algorithm is the amount of computer time it needs to run for
its completion. The space complexity of an algorithm is the amount of memory it needs to run
for its completion.
These complexities are calculated based on the size of the input. With this, analysis can
be divided into three cases as:
Best case analysis: In best case analysis, the algorithm takes the minimum
number of computations for inputs of the given size.
Worst case analysis: In worst case analysis, the algorithm takes the maximum
number of computations for inputs of the given size.
Average case analysis: In average case analysis, the algorithm takes the average
number of computations over inputs of the given size.
Complexities vary with the size of the input. Hence, an exact
representation of time and space complexities is usually not practical. Instead, they are expressed
approximately using mathematical notations known as asymptotic notations.
SPACE COMPLEXITY
Space complexity is an estimate of the amount of memory space an algorithm needs to run
to completion.
Space complexity S(P) of any problem P is sum of fixed space requirements and variable
space requirements as:
1. Fixed space that is independent of the characteristics (Ex: number, size) of the input and
outputs. It includes the instruction space, space for simple variables and fixed-size
component variables, space for constants and so on.
2. Variable space that consists of the space needed by component variables whose size is
dependent on the particular problem instance being solved, the space needed by the
referenced variables and the recursion stack space.
When analyzing the space complexity of any problem statement, concentrate solely on
estimating the variable space requirements. First determine which instance characteristics to use
to measure the space requirements. Hence, the total space requirement S(P) of any program can
be represented as:
S(P) = C + S_P(I)
Where,
C is a constant representing the fixed space requirements, S_P(I) is the variable space
requirement, and I refers to the instance characteristics.
TIME COMPLEXITY
Time complexity is an estimate of the amount of computing time an algorithm needs to run
to completion.
The time T(P) taken by a program P is the sum of its compile time and its run time.
Here,
Compile time is a fixed component and does not depend on the instance characteristics.
Hence,
T(P) = C + T_P(I)
Where, C is a fixed constant value (the compile time), T_P(I) is the run time, and I refers
to the instance characteristics. Since C is never negative,
T(P) ≥ T_P(I)
Here, the count variable is incremented twice: once for the addition operation and once for
the return statement.
Therefore T_sum = 2.
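A minimal sketch of this kind of step counting is given below. It assumes a hypothetical function sum that adds two integers and a global counter named count; both names, and the choice of which statements count as steps, are assumptions for illustration.

```c
int count = 0;  /* global step counter (assumed name) */

/* Add two integers while counting the executed steps. */
int sum(int a, int b)
{
    int c = a + b;
    count++;        /* one step for the addition operation */
    count++;        /* one step for the return statement */
    return c;
}
```

After one call to sum with count initially 0, count holds 2, which matches T_sum = 2.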
ASYMPTOTIC NOTATIONS
Asymptotic Notations are languages that allow us to analyze an algorithm’s running time
by identifying its behavior as the input size for the algorithm increases.
For example, the running time of one operation may be computed as f(n), and that of
another operation as g(n). Usually, the time required by an algorithm falls under
three types −
Best Case − Minimum time required for program execution.
Average Case − Average time required for program execution.
Worst Case − Maximum time required for program execution.
Following are the commonly used asymptotic notations to calculate the running time
complexity of an algorithm.
Ο Notation
Ω Notation
θ Notation
Big Oh Notation, Ο
The notation Ο(n) is the formal way to express the upper bound of an algorithm's
running time. It measures the worst case time complexity or the longest amount of time
an algorithm can possibly take to complete.
If f(n) <= C g(n) for all n >= n0, where C > 0 and n0 are constants, then we can represent f(n) as O(g(n)):
f(n) = O(g(n))
Omega Notation, Ω
The notation Ω(n) is the formal way to express the lower bound of an algorithm's
running time. It measures the best case time complexity or the least amount of time an
algorithm can possibly take to complete.
If f(n) >= C g(n) for all n >= n0, where C > 0 and n0 are constants, then we can represent f(n) as Ω(g(n)):
f(n) = Ω(g(n))
Theta Notation, θ
The notation θ(n) is the formal way to express both the lower bound and the upper bound
of an algorithm's running time. It is represented as follows −
If C1 g(n) <= f(n) <= C2 g(n) for all n >= n0, where C1, C2 > 0 and n0 are constants, then we can represent f(n) as Θ(g(n)):
f(n) = Θ(g(n))
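As a worked example of the three notations, consider a running time f(n) = 3n + 2 compared with g(n) = n. The function and the constants below are chosen only for illustration; other valid choices of C and n0 exist.

```latex
\begin{align*}
f(n) &= 3n + 2, \qquad g(n) = n \\
3n + 2 &\le 4n \quad \text{for all } n \ge 2
  &&\Rightarrow\ f(n) = O(n) \quad (C = 4,\ n_0 = 2) \\
3n + 2 &\ge 3n \quad \text{for all } n \ge 1
  &&\Rightarrow\ f(n) = \Omega(n) \quad (C = 3,\ n_0 = 1) \\
3n \le 3n + 2 &\le 4n \quad \text{for all } n \ge 2
  &&\Rightarrow\ f(n) = \Theta(n) \quad (C_1 = 3,\ C_2 = 4,\ n_0 = 2)
\end{align*}
```

Since f(n) is bounded above and below by constant multiples of n, it is both O(n) and Ω(n), and therefore Θ(n).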
A space–time tradeoff can be applied to the problem of data storage. If data is stored
uncompressed, it takes more space but access takes less time than if the data were stored
compressed (since compressing the data reduces the amount of space it takes, but it takes time to
run the decompression algorithm).
A tradeoff is a situation where one thing increases and another thing decreases. Here, a
problem can be solved:
Either in less time by using more space, or
In very little space by spending a long amount of time.
The best algorithm is one that solves a problem using less space in memory
and also takes less time to generate the output. But in general, it is not always possible to
achieve both of these conditions at the same time. The most common example is
an algorithm using a lookup table. This means that the answers to some question for every
possible value can be written down. One way of solving the problem is to write down the
entire lookup table, which will let you find answers very quickly but will use a lot of space.
Another way is to calculate the answers without writing down anything, which uses very little
space, but might take a long time. In general, the more time-efficient an algorithm is, the
less space-efficient it tends to be.
Types of Space-Time Trade-off
Compressed or Uncompressed data
Re-Rendering or Stored images
Smaller code or loop unrolling
Lookup tables or Recalculation
Compressed or Uncompressed data: A space-time trade-off can be applied to the problem
of data storage. If data is stored uncompressed, it takes more space but less time. But if the
data is stored compressed, it takes less space but more time to run the decompression
algorithm. There are also instances where it is possible to work directly with compressed
data, as in the case of compressed bitmap indices, where it is faster to work with compression
than without compression.
Re-Rendering or Stored images: In this case, storing only the source and rendering it as an
image each time takes less space but more time, whereas storing the rendered image takes more
space but less time, i.e., reading an image from the cache is faster than re-rendering it but
requires more space in memory.
Smaller code or Loop Unrolling: Smaller code occupies less space in memory but requires
more computation time, because the program must jump back to the beginning of the loop at the end
of each iteration. Loop unrolling optimizes execution speed at the cost of increased binary
size: it occupies more space in memory but requires less computation time.
Lookup tables or Recalculation: With a lookup table, an implementation can include the entire
table, which reduces computing time but increases the amount of memory needed. Alternatively, it can
recalculate, i.e., compute table entries as needed, increasing computing time but reducing
memory requirements.
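The lookup table versus recalculation trade-off can be illustrated with a small sketch. The task chosen here, counting the 1 bits in a byte, and all function names are assumptions made for the example; any function over a small range of inputs would serve equally well.

```c
/* Recalculation: count the 1 bits of a byte with a loop.
   Uses no extra memory, but takes up to 8 iterations per call. */
int bits_by_loop(unsigned char b)
{
    int ones = 0;
    while (b != 0) {
        ones += b & 1;  /* add the lowest bit */
        b >>= 1;        /* move on to the next bit */
    }
    return ones;
}

/* Lookup table: 256 bytes of extra space, one entry per byte value. */
static unsigned char bit_table[256];

/* Fill the table once by computing every answer in advance. */
void build_bit_table(void)
{
    for (int i = 0; i < 256; i++)
        bit_table[i] = (unsigned char)bits_by_loop((unsigned char)i);
}

/* After build_bit_table() has run, each answer is a single array access. */
int bits_by_table(unsigned char b)
{
    return bit_table[b];
}
```

Both functions return the same answers; the table version spends memory to save time on every later call, while the loop version spends time on every call to save memory.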
ARRAYS
An array is a collection of data that holds a fixed number of values of the same type.
Example: int mark[5] = {40, 60, 80, 70, 90};
Declaring Arrays
Syntax: datatype array_name[size];
datatype: It denotes the type of the elements in the array.
array_name: it is the name given to an array.
size: It is the number of elements an array can hold.
1.4.1 Types of arrays:
There are two types of arrays. They are a) Single dimensional array b) Multi-dimensional array.
Single or One Dimensional array is used to represent and store data in a linear form.
Array having only one subscript variable is called One-Dimensional array
It is also called as Single Dimensional Array or Linear Array
Syntax: datatype array_name[size];
Example: int mark[5] = {40, 60, 80, 70, 90};
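Operations on Arrays:
1. Traversal: It is used to visit each data item of the array exactly once, for example to display it.
Algorithm: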
Step 1: START
Step 2: Take an array A
Step 3: Define its values
Step 4: Loop for each value of A
Step 5: Display A[i] where i is the value of current iteration
Step 6: STOP
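The loop in the steps above can be sketched in C as follows. The function name traverse and its return value (the number of elements visited) are assumptions added for illustration; C arrays are indexed from 0.

```c
#include <stdio.h>

/* Visit every element of a[0..n-1] once, print it,
   and return how many elements were visited. */
int traverse(const int a[], int n)
{
    int visited = 0;
    for (int i = 0; i < n; i++) {
        printf("A[%d] = %d\n", i, a[i]);  /* display the current element */
        visited++;
    }
    return visited;
}
```

For the array int mark[5] = {40, 60, 80, 70, 90}; the call traverse(mark, 5) prints the five marks in order.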
2. Insertion: It is used to add a new data item in the given collection of data items.
Example: Consider linear array A as below:
1 2 3 4 5
10 20 50 30 15
The new element to be inserted is 100 and the location for insertion is 3. So shift the elements from the 5th
location to the 3rd location downwards by 1 place, and then insert 100 at the 3rd location. It is shown
below:
1 2 3 4 5 6
10 20 100 50 30 15
Algorithm:
Let A be a linear array (unordered) with N elements and let K be a positive integer such
that K<=N. Following is the algorithm where ITEM is inserted into the Kth position of A.
1. Start
2. Set J = N
3. Set N = N+1
4. Repeat steps 5 and 6 while J >= K
5. Set A[J+1] = A[J]
6. Set J = J-1
7. Set A[K] = ITEM
8. Stop
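The insertion steps can be sketched in C as below. This is a sketch under stated assumptions: the function name insert_at is hypothetical, the position k is 1-based as in the notes while the C array is 0-based, and the caller must make sure the array has room for one more element.

```c
/* Insert item at 1-based position k (1 <= k <= *n) of a[], which
   currently holds *n elements. The caller must ensure that a[] has
   room for at least *n + 1 elements. */
void insert_at(int a[], int *n, int k, int item)
{
    /* Shift positions k..n one place to the right, starting from the
       end. Position p in the notes is index p - 1 in C. */
    for (int j = *n; j >= k; j--)
        a[j] = a[j - 1];
    a[k - 1] = item;  /* place the new item at position k */
    *n = *n + 1;      /* the array now holds one more element */
}
```

With a = {10, 20, 50, 30, 15}, n = 5, k = 3 and item = 100, the array becomes {10, 20, 100, 50, 30, 15} and n becomes 6.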
3. Deletion: It is used to delete an existing data item from the given collection of data items.
Example:
1 2 3 4 5 6
10 20 50 40 25 60
The element to be deleted is 50, which is at the 3rd location. So shift the elements from the 4th to the 6th
location upwards by 1 place. It is shown below:
1 2 3 4 5
10 20 40 25 60
Algorithm:
Consider A is a linear array with N elements and K is a positive integer such that K<=N.
Following is the algorithm to delete the element at the Kth position of A.
1. Start
2. Set J = K
3. Repeat steps 4 and 5 while J < N
4. Set A[J] = A[J + 1]
5. Set J = J+1
6. Set N = N-1
7. Stop
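The deletion steps can be sketched in C in the same way. The function name delete_at and the returned value (the deleted element) are assumptions for illustration; k is the 1-based position used in the notes, stored at index k - 1 in C.

```c
/* Delete the element at 1-based position k (1 <= k <= *n) from a[]
   and return the deleted value. */
int delete_at(int a[], int *n, int k)
{
    int removed = a[k - 1];
    /* Shift positions k+1..n one place to the left. */
    for (int j = k; j < *n; j++)
        a[j - 1] = a[j];
    *n = *n - 1;  /* the array now holds one element fewer */
    return removed;
}
```

With a = {10, 20, 50, 40, 25, 60}, n = 6 and k = 3, the call returns 50, the first five elements become {10, 20, 40, 25, 60} and n becomes 5.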
4. Searching: It is used to find the location of a given data item in the collection of data items.
Suppose the item to be searched is 20. We start from the beginning and compare 20 with each
element. This process continues until the element is found or the end of the array is reached. Here:
1) Compare 20 with 15
20 ≠ 15, go to next element.
2) Compare 20 with 50
20 ≠ 50, go to next element.
3) Compare 20 with 35
20 ≠ 35, go to next element.
4) Compare 20 with 20
20 = 20, so 20 is found and its location is 4.
Algorithm
Consider A is a linear array with N elements, indexed from 0 to N-1.
Following is the algorithm to find an element with a value of ITEM using sequential search.
1. Start
2. Set J = 0
3. Repeat steps 4 and 5 while J < N
4. IF A[J] is equal to ITEM THEN GOTO STEP 6
5. Set J = J + 1
6. IF J < N THEN PRINT J, ITEM ELSE PRINT "ITEM not found"
7. Stop
5. Sorting: It is used to arrange the data items in some order i.e. in ascending or descending order
in case of numerical data and in dictionary order in case of alphanumeric data.
E.g.
1 2 3 4 5
10 50 40 20 30
After arranging the elements in increasing order by using a sorting technique, the array will be:
1 2 3 4 5
10 20 30 40 50
6. Merging: It is used to combine the data items of two sorted files into a single file in sorted
form.
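Merging can be sketched in C as below. The notes speak of two sorted files; this sketch assumes the data is held in two sorted int arrays instead, and the function name merge_sorted is hypothetical.

```c
/* Merge sorted a[0..na-1] and sorted b[0..nb-1] into out[], which must
   have room for na + nb elements. Returns the number of elements written. */
int merge_sorted(const int a[], int na, const int b[], int nb, int out[])
{
    int i = 0, j = 0, k = 0;
    /* Repeatedly copy the smaller of the two front elements. */
    while (i < na && j < nb)
        out[k++] = (a[i] <= b[j]) ? a[i++] : b[j++];
    /* Copy whatever is left over in either input. */
    while (i < na)
        out[k++] = a[i++];
    while (j < nb)
        out[k++] = b[j++];
    return k;
}
```

Each element of the two inputs is copied exactly once, so the merged result is produced in a single pass over both arrays.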
Multi-Dimensional Arrays
A multi-dimensional array is an array that has more than one dimension. It is an array of
arrays; an array that has multiple levels. 2-Dimensional array and 3-Dimensional array are
examples of Multi-Dimensional arrays.
Syntax: datatype array_name[size1][size2]...[sizeN];
a) Two-Dimensional Array:
A 2D array is also called a matrix, or a table of rows and columns. The simplest form of
multidimensional array is the two-dimensional array. Syntax of two-dimensional array:
datatype array_name[size1][size2];
Row-major order: In row-major order, elements of a matrix are stored on a row-by-row basis.
Column-major order: In column-major order, elements of a matrix are stored on a column-by-column basis.
Example: A two dimensional array ‘a’ of type int with 2 rows and 3 columns can be defined as:
int a[2][3];
This array will contain 2 x 3 = 6 elements and they can be represented as:
a[0][0] a[0][1] a[0][2]
a[1][0] a[1][1] a[1][2]
A 2D array with 3 rows, where each row has 4 columns, can be declared and initialized as:
int a[3][4] = {
{0,1,2,3},
{4,5,6,7},
{8,9,10,11}
};
Difference between one-dimensional and multi-dimensional arrays:
One-dimensional | Multi-dimensional
1) 1D stores a single list of elements of similar data type. | 1) Usually, 2D and 3D are used in multi-dimensional arrays. 2D stores an 'array of arrays'; 3D stores an 'array of arrays of arrays'.
2) Syntax: datatype arrayname[size]; | 2) Syntax: datatype arrayname[size1][size2];
3) Representation of 1D: a single row of elements. | 3) Representation of 2D: a table of rows and columns.
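The row-major and column-major orders described in this section can be expressed as index calculations. The sketch below assumes 0-based indices and a matrix stored in one flat block of memory; the function names are hypothetical.

```c
/* Offset of element (i, j) when a matrix with `cols` columns is
   stored row by row (row-major order, the layout C itself uses). */
int row_major_offset(int i, int j, int cols)
{
    return i * cols + j;
}

/* Offset of element (i, j) when a matrix with `rows` rows is
   stored column by column (column-major order). */
int column_major_offset(int i, int j, int rows)
{
    return j * rows + i;
}
```

For a matrix with 3 rows and 4 columns, element (1, 2) lies at offset 1 * 4 + 2 = 6 in row-major order and at offset 2 * 3 + 1 = 7 in column-major order.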
MODULE 1 PART II
List search
There are two types of list search. They are a) sequential (linear) search b) Binary search
5.1 Sequential search (linear search):
Linear search is a very simple search algorithm.
In this, searching starts from beginning of an array and compares each element with the
given element and continues until the desired element is found or end of the array is
reached.
Linear search is used for small and unsorted arrays.
Example:
Algorithm:
Step 1: Linear Search (Array A of n elements, Value x)
Step 2: Set i = 1
Step 3: if (A[i] == x) then
    Print “search is successful and x is found at index i”
    stop
Step 4: else
    i = i + 1
    if (i ≤ n) then go to step 3
Step 5: else
    Print “unsuccessful”
    stop
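The linear search steps can be sketched in C as follows. The function name linear_search and the convention of returning -1 for an unsuccessful search are assumptions for illustration; unlike the steps above, the C version uses 0-based indices.

```c
/* Return the 0-based index of the first element equal to x
   in a[0..n-1], or -1 if x is not present. */
int linear_search(const int a[], int n, int x)
{
    for (int i = 0; i < n; i++)
        if (a[i] == x)
            return i;   /* search is successful */
    return -1;          /* end of the array reached: unsuccessful */
}
```

For the array {15, 50, 35, 20} and x = 20, the function returns index 3, which is location 4 when counting from 1.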
5.2 Binary search:
Pseudocode/ Algorithm:
Step 1: Binarysearch(list, key, low, high)
Step 2: if (low ≤ high) then
    mid = (low + high)/2
Step 3: if (list[mid] = key) then
    return mid // search success
Step 4: else if (key < list[mid]) then
    return Binarysearch(list, key, low, mid-1) // search left sub-list
Step 5: else return Binarysearch(list, key, mid+1, high) // search right sub-list
Step 6: end if
Step 7: else return -1 // search failed: key is not in the list
Step 8: end if
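The recursive pseudocode can be sketched in C as below. This sketch assumes the list is an int array sorted in ascending order with 0-based indices, and it returns -1 when the key is absent, a convention added here for illustration.

```c
/* Recursive binary search in sorted list[low..high].
   Returns the index of key, or -1 if key is not in the range. */
int binary_search(const int list[], int key, int low, int high)
{
    if (low > high)
        return -1;                     /* empty range: key is absent */
    int mid = low + (high - low) / 2;  /* equals (low + high) / 2, avoids overflow */
    if (list[mid] == key)
        return mid;                    /* search success */
    if (key < list[mid])
        return binary_search(list, key, low, mid - 1);   /* left sub-list */
    return binary_search(list, key, mid + 1, high);      /* right sub-list */
}
```

For a sorted array of n elements the first call is binary_search(list, key, 0, n - 1).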
Binary search: In binary search there is no need to search the entire list. If the target element is
greater than the middle value, only the right half of the list is searched; if the target is less than
the middle value, only the left half of the list is searched.
From the above comparison, the binary search algorithm needs fewer comparisons and
is more efficient than the linear search algorithm.
Compare Linear and binary search techniques
Linear search and binary search are two common algorithms used to search for a specific element in a list or array.
They differ in terms of their approach, efficiency, and suitability for different scenarios. Here's a comparison of the
two:
Search Algorithm:
Linear Search: Linear search, also known as sequential search, involves checking each element in the list one by one
until the target element is found or the end of the list is reached.
It starts from the beginning of the list and proceeds sequentially.
Binary Search: Binary search is a divide-and-conquer algorithm that is applicable only to sorted lists or arrays.
It repeatedly divides the search interval in half by comparing the target element with the middle element, eliminating
half of the remaining elements at each step.
Time Complexity:
Linear Search: In the worst-case scenario, when the target element is at the end of the list or not present at all, linear
search has a time complexity of O(n), where n is the number of elements in the list.
In the average case, the time complexity is also O(n).
Binary Search: Binary search has a much more efficient time complexity. In the worst-case scenario, it has a time
complexity of O(log n), where n is the number of elements in the list.
This means that binary search is significantly faster for large sorted lists compared to linear search.
Space Complexity:
Linear Search: Linear search has a space complexity of O(1) since it only requires a few variables to keep track of the
current element and the target element.
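Binary Search: The iterative form of binary search has a space complexity of O(1) as well, as it does not require
additional data structures other than a few variables for indices and values. The recursive form shown earlier
additionally uses O(log n) stack space for its recursive calls.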
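The constant space claim can be seen in an iterative version of binary search, sketched below. The function name binary_search_iter and the return value -1 for a missing key are assumptions for illustration; the array is assumed to be sorted in ascending order.

```c
/* Iterative binary search in sorted list[0..n-1].
   Uses only three index variables, so the extra space is O(1).
   Returns the index of key, or -1 if key is not present. */
int binary_search_iter(const int list[], int n, int key)
{
    int low = 0, high = n - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2;
        if (list[mid] == key)
            return mid;
        if (key < list[mid])
            high = mid - 1;  /* continue in the left half */
        else
            low = mid + 1;   /* continue in the right half */
    }
    return -1;
}
```

Each pass of the loop halves the remaining range, which gives the O(log n) worst-case time, while no memory beyond low, high and mid is needed.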
Applicability: