DSU (22317) - Chapter 1 Notes
❖ Introduction
A data structure can be defined as a group of data elements organised in a way that
provides an efficient means of storing and accessing data in the computer.
Some examples of data structures are arrays, linked lists, stacks, queues, etc. Data
structures are widely used in almost every area of computer science: operating systems,
compiler design, artificial intelligence, graphics and many more.
Data structures are the main part of many computer science algorithms, as they enable
programmers to handle data efficiently. They play a vital role in enhancing the
performance of software, since the main function of software is to store and retrieve the
user's data as fast as possible.
Data structures are essential for managing large amounts of data, such as information kept in
databases or indexing services, efficiently. Proper maintenance of data systems requires the
identification of memory allocation, data interrelationships and data processes, all of which
data structures help with.
As applications get more complex and the amount of data grows day by day, the
following problems may arise:
Processor speed: Handling very large amounts of data requires high-speed processing, but
as data grows to billions of records per entity, the processor may fail to cope with that
much data.
Data Search: Consider an inventory of 10^6 items in a store. If our application needs to
search for a particular item, it has to traverse all 10^6 items every time, which slows down
the search process.
Multiple requests: If thousands of users search the data simultaneously on a web server,
even a very large server can fail during the process.
Data structures are used to solve the above problems. Data is organized to form a data
structure in such a way that not all items need to be examined and the required data can be
found almost instantly.
Efficiency: The efficiency of a program depends on the choice of data structures. For
example, suppose we have some data and we need to search for a particular record. If we
organize our data in an unsorted array, we have to search sequentially, element by
element; hence, an array may not be very efficient here. There are better data structures
that make the search process efficient, such as an ordered array, a binary search tree or a
hash table.
Reusability: Data structures are reusable, i.e. once we have implemented a particular data
structure, we can use it at any other place. Implementation of data structures can be compiled
into libraries which can be used by different clients.
Abstraction: Data structure is specified by the ADT which provides a level of abstraction.
The client program uses the data structure through interface only, without getting into the
implementation details.
❖ Operations on Data Structures
1) Traversing: Every data structure contains the set of data elements. Traversing the data
structure means visiting each element of the data structure in order to perform some specific
operation like searching or sorting.
2) Insertion: Insertion can be defined as the process of adding the elements to the data
structure at any location.
If the size of the data structure is n, then at most n data elements can be stored in it;
trying to insert into a full data structure causes overflow.
3) Deletion: The process of removing an element from the data structure is called
deletion. We can delete an element from any location in the data structure.
If we try to delete an element from an empty data structure, underflow occurs.
4) Searching: The process of finding the location of an element within the data structure is
called searching. There are two common algorithms for searching, linear search and binary
search. We will discuss each of them later in these notes.
5) Sorting: The process of arranging the data structure in a specific order is known as
Sorting. There are many algorithms that can be used to perform sorting, for example,
insertion sort, selection sort, bubble sort, etc.
6) Merging: When two lists, List A and List B, of sizes M and N respectively and of a
similar element type, are joined to produce a third list, List C, of size (M+N), the process
is called merging.
❖ Classification of Data Structures
Primitive Data Structures: Primitive data structures are the basic data structures that
operate directly upon machine instructions. They have different representations on
different computers. Examples: integers, floating point numbers, character constants,
string constants and pointers.
Non-primitive Data Structures: Non-primitive data structures are more complicated data
structures and are derived from primitive data structures. They emphasize grouping the
same or different data items with a relationship between each data item.
Linear Data Structures: "A data structure is a linear data structure if its data items
(elements) form a sequence or a linear list." Data elements are organized sequentially,
attached one after another. These data structures are easy to implement because computer
memory is also organized in a linear fashion. Examples: arrays, stacks, queues and
linked lists.
Non-linear Data Structures: The data structure where data items are not organized
sequentially is called a non-linear data structure. Since the elements do not form a
sequence, they can be traversed in any desired, non-sequential order. Examples: trees,
graphs.
❖ What is an Algorithm ?
An algorithm is not the complete code or program; it is just the core logic (solution) of a
problem, which can be expressed either as an informal high-level description, as
pseudocode, or using a flowchart.
Example: Algorithm to add two numbers.
Step 1: Start
Step 2: Declare variables num1, num2 and sum.
Step 3: Read values num1 and num2.
Step 4: Add num1 and num2 and assign the result to sum.
        sum ← num1 + num2
Step 5: Display sum
Step 6: Stop
❖ Analysis of Algorithms
In computer science, analysis of algorithms is a very crucial part. It is important to find the
most efficient algorithm for solving a problem. It is possible to have many algorithms to
solve a problem, but the challenge here is to choose the most efficient one.
Now the point is, how can we recognize the most efficient algorithm if we have a set of
different algorithms? Here, the concepts of the space and time complexity of algorithms
come in. Space and time complexity act as a measurement scale for algorithms: we
compare algorithms on the basis of their space complexity (amount of memory) and time
complexity (number of operations).
An algorithm is said to be efficient and fast, if it takes less time to execute and consumes less
memory space.
Sometimes there is more than one way to solve a problem. We need to learn how to
compare the performance of different algorithms and choose the best one to solve a
particular problem. While analyzing an algorithm, we mostly consider:
1. Time Complexity
2. Space Complexity
Time Complexity
Time complexity is a way to represent the amount of time required by the program to run
to completion. It is generally good practice to keep the required time to a minimum, so
that our algorithm completes its execution in the minimum time possible.
Time complexity of an algorithm signifies the total time required by the program to run till its
completion.
The time complexity of algorithms is most commonly expressed using the big O notation.
It's an asymptotic notation to represent the time complexity.
Asymptotic Notations
Asymptotic Notations are the expressions that are used to represent the complexity of an
algorithm.
Best Case: We analyse the performance of an algorithm for the input for which the
algorithm takes the least time or space.
Worst Case: We analyse the performance of an algorithm for the input for which the
algorithm takes the longest time or the most space.
Average Case: We analyse the performance of an algorithm for inputs for which the time
or space taken lies between the best and worst cases.
1. Big-O Notation (Ο) – Big O notation specifically describes worst case scenario.
2. Omega Notation (Ω) – Omega(Ω) notation specifically describes best case scenario.
3. Theta Notation (θ) – This notation represents the average complexity of an algorithm.
O(expression) is the set of functions that grow slower than or at the same rate as
expression. It indicates the maximum time required by an algorithm over all input values.
It represents the worst case of an algorithm's time complexity.
Omega(expression) is the set of functions that grow faster than or at the same rate as
expression. It indicates the minimum time required by an algorithm for all input values. It
represents the best case of an algorithm's time complexity.
Theta(expression) consist of all the functions that lie in both O(expression) and
Omega(expression). It indicates the average bound of an algorithm. It represents the average
case of an algorithm's time complexity.
Constant time, O(1), describes algorithms that take the same amount of time to compute
regardless of the input size.
For instance, if a function takes the same time to process ten elements as it does to
process 1 million items, we say that it has a constant growth rate, O(1).
Linear time complexity, O(n), means that as the input grows, the algorithm takes
proportionally longer to complete.
A function with a quadratic time complexity has a growth rate of n². If the input is size 2,
it will do four operations. If the input is size 8, it will do 64, and so on.
The time complexity of nested loops is equal to the number of times the innermost
statement is executed. For example, the following sample loops have O(n²) time complexity:
• Sorting items in a collection using bubble sort, insertion sort, or selection sort.
An algorithm is said to have an exponential time complexity when the growth doubles with
each addition to the input data set. This kind of time complexity is usually seen in brute-force
algorithms.
n – input size
Example 1:
int i=1
loop(i<=n)
    print i
    i=i+1
Statement      Frequency count
int i=1        1
loop(i<=n)     n+1
print i        n
i=i+1          n

Thus the frequency count is 1 + (n+1) + n + n = 3n+2. Neglecting the constants and
considering only the order of magnitude, we get O(n) as the run-time complexity.
Space Complexity
It is the amount of memory space required by the algorithm during the course of its execution.
Space complexity must be taken seriously for multi-user systems and in situations where
limited memory is available.
❖ Space needed by an algorithm is equal to the sum of the following two components
A fixed part is the space required to store certain data and variables (i.e. simple variables
and constants, program size, etc.) that are independent of the size of the problem.
A variable part is a space required by variables, whose size is totally dependent on the size
of the problem. For example, recursion stack space, dynamic memory allocation etc.
The space complexity S(P) of any algorithm P is S(P) = A + Sp(I), where A is the fixed
part and Sp(I) is the variable part of the algorithm, which depends on an instance
characteristic I. The following simple example illustrates the concept.
Algorithm
SUM(P, Q)
Step 1 - START
Step 2 - R ← P + Q + 10
Step 3 - Stop
Here we have three variables (P, Q and R) and one constant (10). Hence S(P) = 3 + 1 = 4.
The actual space depends on the data types of the given variables and constants and is
multiplied accordingly.
Example:
int i=1
loop(i<=n)
    print i
    i=i+1
Output (for n = 5): 1 2 3 4 5
This algorithm uses only a fixed number of variables (i and n), so its space requirement
does not grow with n.
Abstract Data Type(ADT) is a data type, where only behaviour is defined but not
implementation.
In computer science, an abstract data type (ADT) is a mathematical model for data types. An
abstract data type is defined by its behavior (semantics) from the point of view of a user, of
the data, specifically in terms of possible values, possible operations on data of this type, and
the behavior of these operations.
• An ADT is composed of:
  - A collection of data
  - A set of operations on that data