DS CHP1 Notes

The document provides an overview of data structures, defining them as organized ways to store and access data efficiently. It categorizes data structures into linear (e.g., Stack, Queue, Array) and non-linear (e.g., Tree, Graph), explaining their characteristics and operations. Additionally, it discusses algorithm complexity, including time and space complexity, and introduces asymptotic notation for analyzing algorithm performance.

Uploaded by

KADAM NIRAJ
Copyright © All Rights Reserved

DATA STRUCTURES

Introduction to Data Structure:-


Data Structure:- A data structure is a particular way of organizing and storing data in a computer so that it can be accessed and modified efficiently. A data structure consists of a collection of data together with the functions, or operations, that can be applied to that data.

Formal Definition of Data Structure:-


A data structure is a set of domains D, a designated domain d ∈ D, a set of functions F, and a set of axioms A. The triple (D, F, A) denotes the data structure.
This definition is also called the Abstract Data Type (ADT).
D :- denotes the data objects.
F :- denotes the set of operations that can be carried out on the data objects.
A :- describes the properties and rules of the operations.
Data Object:- A data object refers to a set of elements (D), which may be finite or infinite. If the set is infinite, we must devise some mechanism to represent it in memory, since available memory is limited.
Ex: The set of integers is infinite:
D = {0, ±1, ±2, …}
The set of alphabets is finite:
D = {'A', 'B', …, 'Z'}
Data Type:- A data type is a term used to describe the kind of information that can be processed by the computer and that is supported by the programming language. It can also be defined as the kinds of data that variables may "hold" in a programming language.
Ex: int, char, float, etc.
Some languages also allow users to combine built-in data types, e.g. records in Pascal, and structures and unions in C.
What is a linear data structure?

A linear data structure is a type of data structure that stores data linearly, or sequentially. In a linear data structure, data is arranged so that each element is adjacent to its previous and next elements. All the data sits at a single level, so we can traverse every element in a single run.

The implementation of a linear data structure is always easy, as it stores the data linearly. Common examples of linear data structures are the Stack, Queue, Array, and Linked List.
1) Stack
Users can push/pop data from a single end of the stack: data is inserted into the stack via the push operation and removed via the pop operation. The stack follows the LIFO (last in, first out) rule. All stack data is accessed from the top of the stack, in a linear manner.

The stack data structure is used in many real-life applications. For example, web browsers use a stack to implement the backward/forward navigation operations.
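The push/pop behaviour described above can be sketched with a fixed-size array. This is a minimal illustration, not code from the notes; the names Stack, stack_push and stack_pop are assumptions:

```c
#include <assert.h>

#define STACK_MAX 100

/* Fixed-size stack of ints. Elements enter and leave at one end (the top). */
typedef struct {
    int data[STACK_MAX];
    int top;                 /* index of the next free slot */
} Stack;

void stack_init(Stack *s) { s->top = 0; }

/* push: insert at the top; returns 0 on overflow, 1 on success. */
int stack_push(Stack *s, int value) {
    if (s->top == STACK_MAX) return 0;
    s->data[s->top++] = value;
    return 1;
}

/* pop: remove from the top (LIFO); returns 0 on underflow, 1 on success. */
int stack_pop(Stack *s, int *out) {
    if (s->top == 0) return 0;
    *out = s->data[--s->top];
    return 1;
}
```

Pushing 1, 2, 3 and then popping yields 3, 2, 1 — the last element in is the first out.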
2. Queue
The queue data structure stores data in a linear sequence and follows the FIFO rule, which means first in, first out. It is similar to the stack data structure, but it has two ends: in a queue, insertions are performed at the rear using the enqueue operation, and deletions at the front using the dequeue operation.

A line of people stepping onto an escalator is one of the best real-life examples of a queue.

3. Array
The array is the most used linear data type. An array stores objects of the same data type in a linear fashion, and arrays can be used to construct other linear and non-linear data structures. For example, car-management software might store the car names in an array of strings.

We can access an array element by its index. In an array, the index always starts at 0. To prevent memory wastage, users can create dynamically sized arrays.
4. Linked List
The linked list data structure stores data in the form of nodes. Every linked list node contains an element value and an address pointer. The address pointer of a node holds the address of the next node. A linked list can store unique or duplicate elements.
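The node layout described above can be sketched as a struct holding a value plus a pointer to the next node. A minimal sketch; Node and push_front are assumed names, not from the notes:

```c
#include <assert.h>
#include <stdlib.h>

/* A linked list node: an element value plus the address of the next node. */
typedef struct Node {
    int value;
    struct Node *next;   /* NULL marks the end of the list */
} Node;

/* Insert a new node at the front of the list; returns the new head. */
Node *push_front(Node *head, int value) {
    Node *n = malloc(sizeof *n);
    n->value = value;
    n->next = head;      /* new node points at the old head */
    return n;
}
```

Starting from an empty list (NULL) and pushing 1 then 2 gives the list 2 → 1.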
What is a non-linear data structure?

In a non-linear data structure, data is connected to its previous, next, and possibly further elements, forming a more complex structure. In simple terms, data is not organized sequentially in this type of data structure. It is one type of non-primitive data structure.

In non-linear data structures, data is not stored in a linear manner; there are multiple levels, so a non-linear data structure is also called a multilevel data structure. It is important to note that non-linear data structures cannot be implemented as easily as linear data structures.
1. Tree
A tree is an example of a non-linear data structure. It is a collection of nodes, where each node holds a data element, and every tree contains a single root node. In a tree, every child node is connected to a parent node; every such pair has a parent-child relationship. Nodes at the last level are called leaf nodes, and the other nodes are called internal nodes.
Types of trees:
1) Binary tree
2) Binary search tree
3) AVL tree
4) Red-black tree

The heap is also a non-linear, tree-based data structure; it uses a complete binary tree to store its data.
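The parent-child structure of a binary tree, and the leaf/internal distinction above, can be sketched as follows. This is an illustrative fragment; TreeNode and is_leaf are assumed names:

```c
#include <assert.h>
#include <stddef.h>

/* A binary tree node: a value plus pointers to up to two children. */
typedef struct TreeNode {
    int value;
    struct TreeNode *left, *right;   /* NULL when the child is absent */
} TreeNode;

/* A node with no children is a leaf; any other node is internal. */
int is_leaf(const TreeNode *n) {
    return n->left == NULL && n->right == NULL;
}
```

A root with two childless children has two leaf nodes and one internal node (the root itself).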
2. Graph
A graph contains vertices and edges. Every vertex of the graph stores a data element, and edges are used to build relationships between vertices. Social networks are real-life examples of graph data structures: we can consider each user as a vertex, and the relation between one user and another as an edge. Unlike a tree, there is no limitation on building a relationship between any two vertices of a graph. We can implement the graph data structure using arrays or linked lists.
Types of graphs:
1) Simple graph
2) Undirected graph
3) Directed graph
4) Weighted graph
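The array-based implementation mentioned above is usually an adjacency matrix. A minimal sketch under that assumption; Graph and add_edge_undirected are illustrative names:

```c
#include <assert.h>

#define MAX_V 10

/* Adjacency-matrix graph: adj[u][v] == 1 means an edge connects u and v. */
typedef struct {
    int n;                     /* number of vertices in use */
    int adj[MAX_V][MAX_V];
} Graph;

/* An undirected edge is recorded in both directions. */
void add_edge_undirected(Graph *g, int u, int v) {
    g->adj[u][v] = 1;
    g->adj[v][u] = 1;
}
```

With three vertices, adding the edge 0–2 makes both adj[0][2] and adj[2][0] equal to 1, while unrelated pairs stay 0.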
TYPES OF DATA STRUCTURES :-
1. Primitive and Non-Primitive Data Structures
A primitive data structure defines a set of primitive elements that do not involve any other elements as subparts. These are generally the built-in data types of programming languages.
E.g.:- Integers, Characters, etc.
Non-primitive data structures define sets of derived elements, such as Arrays, Structures, and Classes.
2. Linear and Non-Linear Data Structures
A data structure is said to be linear if its elements form a sequence and each element has a unique successor and predecessor.
E.g.:- Stack, Queue, etc.
Non-linear data structures are used to represent data that have a hierarchical relationship among the elements. In a non-linear data structure, an element may have more than one predecessor or successor.
E.g.:- Trees, Graphs
3. Static and Dynamic Data Structures
A data structure is referred to as a static data structure if it is created before program execution, i.e., during compilation. The variables of a static data structure have user-specified names.
E.g.:- Array
Data structures that are created at run time are called dynamic data structures. Variables of this type are not always referred to by a user-defined name; instead, they are accessed through their addresses, using pointers.
E.g.:- Linked List
4. Sequential and Direct Data Structures
This classification is with respect to the access operation associated with the data type.
Sequential access means the locations are accessed in sequential form: to access the nth element, we must first access the preceding n − 1 data elements.
E.g.:- Linked List
Direct access means any element can be accessed directly, without accessing its predecessor or successor; i.e., the nth element can be accessed directly.
E.g.:- Array
OPERATIONS ON DATA STRUCTURE
The basic operations that can be performed on a Data Structure are:
i) Insertion:- storing a new data element in a Data Structure.
ii) Deletion:- removing a data element from a Data Structure.
iii) Traversal:- processing all the data elements present in a Data Structure.
iv) Merging:- combining the elements of two similar Data Structures to form a new Data Structure of the same type.
v) Sorting:- arranging the data elements of a Data Structure in a specified order.
vi) Searching:- looking for a specified data element in a Data Structure.
Linear vs Non-Linear Data Structures, factor by factor:

Data element arrangement: In a linear data structure, data elements are sequentially connected, allowing users to traverse all elements in one run. In a non-linear data structure, data elements are hierarchically connected, appearing on multiple levels.

Implementation complexity: Linear data structures are relatively easy to implement. Non-linear data structures require a higher level of understanding and are more complex to implement.

Levels: All data elements in a linear data structure exist on a single level. Data elements in a non-linear data structure span multiple levels.

Traversal: A linear data structure can be traversed in a single run. Traversing a non-linear data structure is more complex, requiring multiple runs.

Memory utilization: Linear data structures do not utilize memory efficiently. Non-linear data structures are more memory-friendly.

Time complexity: The time complexity of a linear data structure is directly proportional to its size, increasing as the input size increases. The time complexity of a non-linear data structure often remains constant, irrespective of its input size.

Applications: Linear data structures are ideal for application software development. Non-linear data structures are commonly used in image processing and Artificial Intelligence.

Examples: Linear — Linked List, Queue, Stack, Array. Non-linear — Tree, Graph, Hash Map.
COMPLEXITY OF ALGORITHMS
Generally, algorithms are measured in terms of time complexity and space complexity.
1. Time Complexity
Time Complexity:- the total time taken to perform the task. The time complexity of an algorithm is a measure of how much time is required to execute the algorithm for a given number of inputs, and it is measured by its rate of growth relative to a standard function. Time complexity can be calculated by adding compilation time and execution time, or by counting the number of steps in the algorithm.
2. Space Complexity
Space Complexity:- the amount of memory needed to perform the task. The space complexity of an algorithm is a measure of how much storage the algorithm requires. Thus, space complexity is the amount of computer memory required during program execution, as a function of the input elements. The space requirement of an algorithm can be assessed at compilation time and at run time.
Definition:
The word Algorithm means "a set of finite rules or instructions to be followed in calculations or other problem-solving operations", or "a procedure for solving a mathematical problem in a finite number of steps that frequently involves recursive operations".
Characteristics of an Algorithm
1) Input and output: Algorithms take inputs — the initial data provided to the algorithm — and produce outputs — the results generated after processing those inputs. The relation between the inputs and outputs is determined by the algorithm's logic.
2) Finiteness: Algorithms must have a well-defined termination condition. This means they eventually reach an endpoint after a finite number of steps. If an algorithm runs indefinitely without terminating, it is considered incorrect or incomplete.
3) Determinism: Algorithms are deterministic, meaning that given the same inputs and executed under the same conditions, they will always produce the same outputs. The behaviour of an algorithm should be predictable and consistent.
4) Efficiency: Algorithms strive to be efficient in terms of time and resources. They aim to solve problems or perform tasks in a reasonable amount of time and with economical use of computational resources such as memory, processing power, or storage.
5) Correctness: Algorithms must be designed to produce correct results for all legitimate inputs in their domain. They must accurately solve the problem they are designed for, and their outputs must match the expected results.
Time and space complexity depend on many things — hardware, operating system, processor, etc. However, we do not consider any of these factors while analyzing an algorithm; we consider only the execution time of the algorithm.
Let's start with a simple example. Suppose you are given an array A and an integer x, and you have to find whether x exists in array A. A simple solution is to traverse the whole array A and check whether any element is equal to x.

for i : 1 to length of A
    if A[i] is equal to x
        return TRUE
return FALSE
Each operation on a computer takes approximately constant time; let each operation take time c. The number of lines of code executed actually depends on the value of x. During analysis of an algorithm we mostly consider the worst-case scenario, i.e., when x is not present in array A. In the worst case, the if condition runs N times, where N is the length of array A. So in the worst case, the total execution time is (N∗c + c): N∗c for the if condition and c for the return statement (ignoring some operations, such as the assignment of i).
As we can see, the total time depends on the length of array A: if the length of the array increases, the execution time also increases.
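The pseudocode above translates directly into C. A minimal sketch; the name contains is an assumption:

```c
#include <assert.h>

/* Linear search. Worst case: x is absent, so the loop body runs N times -> O(N). */
int contains(const int *A, int N, int x) {
    for (int i = 0; i < N; i++)
        if (A[i] == x)
            return 1;   /* TRUE */
    return 0;           /* FALSE: checked every element */
}
```

Searching {4, 7, 9} for 7 succeeds after two comparisons; searching for 5 fails only after all three.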
Order of growth
The order of growth is how the execution time depends on the length of the input. Here we can clearly see that the execution time depends linearly on the length of the array. The order of growth helps us compute the running time with ease: we ignore the lower-order terms, since they are relatively insignificant for large inputs. We use different notations to describe the limiting behaviour of a function.
Asymptotic Notation
It is used to describe the running time of an algorithm - how much
time an algorithm takes with a given input, n.
There are three different notations: 1) Big O (O), 2) Big Theta (Θ), and 3) Big Omega (Ω).
1) Time complexity notations
While analysing an algorithm, we mostly consider O-notation, because it gives us an upper limit on the execution time, i.e., the execution time in the worst case. To compute the O-notation we ignore the lower-order terms, since they are relatively insignificant for large inputs.
Let f(N) = 2∗N² + 3∗N + 5.
Then O(f(N)) = O(2∗N² + 3∗N + 5) = O(N²).
Let's consider some examples:

int count = 0;
for (int i = 0; i < N; i++)
    for (int j = 0; j < i; j++)
        count++;

Let's see how many times count++ will run.
When i = 0, it runs 0 times.
When i = 1, it runs 1 time.
When i = 2, it runs 2 times, and so on.
The total number of times count++ runs is 0 + 1 + 2 + ... + (N−1) = N∗(N−1)/2. So the time complexity is O(N²).
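The N∗(N−1)/2 count can be checked by running the loop and comparing against the formula. A small verification sketch; count_inner_steps is an assumed name:

```c
#include <assert.h>

/* Runs the nested loop from the example and returns how often count++ fired. */
int count_inner_steps(int N) {
    int count = 0;
    for (int i = 0; i < N; i++)
        for (int j = 0; j < i; j++)
            count++;
    return count;      /* equals N*(N-1)/2 */
}
```

For N = 10 the inner statement runs 0 + 1 + ... + 9 = 45 = 10∗9/2 times, matching the closed form.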
int count = 0;
for (int i = N; i > 0; i /= 2)
    for (int j = 0; j < i; j++)
        count++;

This is a tricky case. At first glance, it seems like the complexity is O(N∗log N): N for the j loop and log N for the i loop. But that's wrong. Let's see why. Think about how many times count++ will run.
When i = N, it runs N times.
When i = N/2, it runs N/2 times.
When i = N/4, it runs N/4 times, and so on.
The total number of times count++ runs is N + N/2 + N/4 + ... + 1 ≈ 2∗N. So the time complexity is O(N).
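The geometric-sum bound of 2∗N can likewise be verified empirically. A small check, with count_halving_steps as an assumed name:

```c
#include <assert.h>

/* Runs the halving loop from the example and returns the total count. */
int count_halving_steps(int N) {
    int count = 0;
    for (int i = N; i > 0; i /= 2)
        for (int j = 0; j < i; j++)
            count++;
    return count;      /* N + N/2 + N/4 + ... + 1, which is below 2*N */
}
```

For N = 16 the total is 16 + 8 + 4 + 2 + 1 = 31, and for any power of two it stays strictly under 2∗N — linear, not N log N.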
A table of the growth of several common time complexities can help you judge whether your algorithm is fast enough to be accepted (assuming the algorithm is correct).
Method for Calculating Space and Time Complexity

Methods for Calculating Time Complexity

To calculate time complexity, you must consider each line of code in the program. Consider a multiplication function as an example, and calculate its time complexity:

1. mul <- 1
2. i <- 1
3. while i <= n do
4.     mul = mul * i
5.     i = i + 1
6. end while
Let T(n) be the algorithm's time complexity.
Lines 1 and 2 have a time complexity of O(1).
Line 3 represents a loop; as a result, lines 4 and 5 are repeated n times.
So the time complexity of lines 4 and 5 is O(n).
Finally, adding the time complexity of all the lines yields the overall time complexity of the multiply function: T(n) = O(n).
The iterative method gets its name because it calculates an iterative algorithm's time complexity by parsing it line by line and adding up the complexities. Aside from the iterative method, several other techniques are used in various cases. The recursive method, for example, is an excellent way to calculate the time complexity of recursive solutions, using recursion trees or substitution. The Master theorem is another popular method for calculating time complexity.
Methods for Calculating Space Complexity
This section goes over how to calculate space complexity, with an example. Here is an example that computes the multiplication of array elements:

1. int mul, i
2. while i <= n do
3.     mul <- mul * array[i]
4.     i <- i + 1
5. end while
6. return mul
Space Complexity Explanation:
Let S(n) denote the algorithm's space complexity. In most systems, an integer occupies 4 bytes of memory, so the number of allocated bytes gives the space complexity.
Line 1 allocates memory space for two integers, so S(n) = 4 bytes × 2 = 8 bytes.
Line 2 represents a loop. Lines 3 and 4 assign values to already existing variables, so there is no need to allocate extra space for them. The return statement in line 6 allocates one more memory cell. As a result, S(n) = 4 × 2 + 4 = 12 bytes.
Because the algorithm uses an array of n integers, the final space complexity is S(n) = 4∗n + 12 bytes = O(n).
As you progress through this tutorial, you will see some differences between space and time complexity.
Q. Imagine a classroom of 100 students in which you gave your pen to one person. You have to find that pen without knowing to whom you gave it.

Here are some ways to find the pen, and the corresponding O order:

• O(n²): You go and ask the first person in the class whether they have the pen. You also ask this person about each of the other 99 people in the classroom — whether they have the pen — and so on. This is what we call O(n²).

• O(n): Going and asking each student individually is O(n).

• O(log n): Now you divide the class into two groups and ask: "Is it on the left side, or the right side of the classroom?" Then you take that group, divide it into two, and ask again, and so on, repeating the process until you are left with one student who has your pen. This is what is meant by O(log n).

You might need to do:
• the O(n²) search if only one student knows on which student the pen is hidden;
• the O(n) search if one student had the pen and only they knew it;
• the O(log n) search if all the students knew, but would only tell you if you guessed the right side.
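The O(log n) halving strategy above is exactly binary search on a sorted array: halve the search range each step until one candidate remains. A minimal sketch; binary_search is an assumed name:

```c
#include <assert.h>

/* Binary search on a sorted array; returns the index of x, or -1 if absent.
   Each iteration halves the remaining range, so at most ~log2(n) steps run. */
int binary_search(const int *a, int n, int x) {
    int lo = 0, hi = n - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;   /* midpoint without overflow */
        if (a[mid] == x) return mid;
        if (a[mid] < x) lo = mid + 1;   /* discard the left half */
        else            hi = mid - 1;   /* discard the right half */
    }
    return -1;
}
```

On {1, 3, 5, 7, 9}, finding 7 takes two comparisons instead of four — the gap between O(log n) and O(n) grows quickly with n.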
Common Asymptotic Notations
Following is a list of some common asymptotic notations:
constant − O(1)
logarithmic − O(log n)
linear − O(n)
n log n − O(n log n)
quadratic − O(n²)
cubic − O(n³)
polynomial − n^O(1)
exponential − 2^O(n)
Asymptotic analysis of an algorithm refers to defining the mathematical foundation/framing of its run-time performance. Using asymptotic analysis, we can conclude the best-case, average-case, and worst-case scenarios of an algorithm.

Asymptotic analysis is input-bound: if there is no input to the algorithm, it is concluded to work in constant time. Other than the input, all other factors are considered constant.

Asymptotic analysis refers to computing the running time of an operation in mathematical units of computation. For example, the running time of one operation may be computed as f(n), and for another operation as g(n²). This means the first operation's running time increases linearly with n, while the running time of the second increases quadratically as n increases. Similarly, the running times of both operations will be nearly the same if n is significantly small.
1) O-notation (Big O notation):-
To denote an asymptotic upper bound, we use O-notation. For a given function g(n), we denote by O(g(n)) (pronounced "big-oh of g of n") the set of functions:
O(g(n)) = { f(n) : there exist positive constants c and n₀ such that 0 ≤ f(n) ≤ c∗g(n) for all n ≥ n₀ }

Here g(n) denotes the upper-bound value: if a function is O(n), it is also O(n²) and O(n³).
Big O is the most widely used notation for asymptotic analysis. It specifies the upper bound of a function, i.e., the maximum time required by an algorithm, or the worst-case time complexity. In other words, it gives the highest possible output value (big-O) for a given input.
2) Big-Omega (Ω) notation
To denote an asymptotic lower bound, we use Ω-notation. For a given function g(n), we denote by Ω(g(n)) (pronounced "big-omega of g of n") the set of functions:
Ω(g(n)) = { f(n) : there exist positive constants c and n₀ such that 0 ≤ c∗g(n) ≤ f(n) for all n ≥ n₀ }

Big-Omega is an asymptotic notation for the best case, or a floor on the growth rate, of a given function. It gives an asymptotic lower bound on the growth rate of an algorithm's runtime.
From the definition: the function f(n) is Ω(g(n)) if there exist positive numbers c and N such that f(n) ≥ c∗g(n) for all n ≥ N.
3) Θ-notation (Theta notation)
To denote an asymptotically tight bound, we use Θ-notation. For a given function g(n), we denote by Θ(g(n)) (pronounced "big-theta of g of n") the set of functions:
Θ(g(n)) = { f(n) : there exist positive constants c₁, c₂ and n₀ such that 0 ≤ c₁∗g(n) ≤ f(n) ≤ c₂∗g(n) for all n ≥ n₀ }

Big Theta defines both the lower and the upper bound of a function, i.e., it serves as both the greatest and the least boundary for a given input.
From the definition: f(n) is Θ(g(n)) if there exist positive numbers c₁, c₂ and N such that c₁∗g(n) ≤ f(n) ≤ c₂∗g(n) for all n ≥ N.
• Best Case − the minimum time required for program execution. It is defined as the condition that allows an algorithm to complete statement execution in the shortest amount of time. In this case, the execution time serves as a lower bound on the algorithm's time complexity.

• Average Case − the average time required for program execution. You add the running times for each possible input combination and take the average. Here, the execution time serves as both a lower and an upper bound on the algorithm's time complexity.

• Worst Case − the maximum time required for program execution. It is defined as the condition under which an algorithm takes the longest amount of time to complete statement execution. In this case, the execution time serves as an upper bound on the algorithm's time complexity.
Example 1: Find the time requirement of the following statements:

1) for (i=0; i<n; i++)
       printf("%d\n", i);

The output is the numbers from 0 to n−1. The condition (i<n) is checked n+1 times: n times when it is true and once when it becomes false. The printf statement is executed n times.
Hence the total number of statements executed = n+1+n = 2n+1.

2) for (i=0; i<n; i++)
       for (j=0; j<n; j++)
           x = x*y;

Here, the statement x = x*y executes n∗n times. Hence the frequency count is n².
3) int a=0;
   for (i=0; i<N; i++)
       for (j=N; j>i; j--)
           a = a+i+j;

The above code runs a total number of times
= N + (N−1) + (N−2) + … + 1 + 0
= N∗(N+1)/2
= (1/2)N² + (1/2)N
i.e., O(N²) times.
Example: Find the time complexity of computing n^m

Algorithm             Frequency Count
1) Read n, m          1
2) Let x = 1          1
3) For i = 1 to m     m+1
4)     x = x*n        m
5) Print x            1
6) Stop

Total frequency count = 1+1+(m+1)+m+1 = 2m+4, which has order of magnitude m.
Time Complexity – O(m)
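The frequency-count table above corresponds to this straightforward C implementation; pow_int is an assumed name. The loop body runs exactly m times, which dominates the count:

```c
#include <assert.h>

/* Computes n^m by repeated multiplication: m iterations -> O(m) time. */
long pow_int(long n, int m) {
    long x = 1;                    /* Let x = 1 */
    for (int i = 1; i <= m; i++)   /* condition checked m+1 times */
        x = x * n;                 /* executed m times */
    return x;
}
```

For example, pow_int(2, 10) performs 10 multiplications; doubling m doubles the work, independent of n.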
Example: Find the time complexity of nested for loops

Algorithm                        Frequency Count
1) For i = 1 to n do             n
2)     For j = i+1 to n do       n∗(n−1)
3)         For k = j+1 to n do   n∗(n−1)∗(n−2)
4)             x = x+1           n∗(n−1)∗(n−2)

Time Complexity – O(n³)


Example: Find the time complexity of a do…while loop

Algorithm            Frequency Count
i = 1                1
do {
    j++              20
    if (i == 20)     20
        break;       1
    i++              20
} while (i < n)      20

The loop always stops after at most 20 iterations, regardless of n.
Time Complexity – O(1)


Example: Find the time complexity of a for loop

Algorithm               Frequency Count
x = n/2                 1
for (i=0; i+x<n; i++)   n/2+1
{
    k++;                n/2
    j++;                n/2
}

Time Complexity – O(n/2) = O(n)


What is time complexity? Compute the time complexity of the following algorithm:

if (x>y)
{
    x = x+1;
}
else
{
    for (i=1; i<=N; i++)
    {
        x = x+1;
    }
}

Frequency count (case x>y vs case x<=y):
if (x>y)              1      1
x = x+1;              1      -
for (i=1; i<=N; i++)  -      N+1
x = x+1;              -      N
Total                 2      2N+2
Simple Algorithms and their Complexity as Examples

Example 1: Find the space requirement of the following declarations:

a) double x[3]
Ans: 24 bytes (3 × 8 bytes per double)

b) int max[10][10] (assume int takes 4 bytes)
Ans: 400 bytes (10 × 10 × 4)

c) int a[2][3][4]
Ans: 96 bytes (2 × 3 × 4 × 4)

You might also like