
Hindustan Institute of Management and Computer Studies

Data Structures & Analysis of Algorithms (BMC205)


Unit -1
(Introduction to Data Structures)

Data and Information


Data is a raw and unorganized fact that must be processed to make it meaningful. It
can be considered as facts and statistics collected together for reference or analysis. Data by
itself is meaningless; it contains numbers, statements, and characters in a raw form. When data is
processed, organized, structured or presented in a given context so as to make it useful, it is
called information.

Entity
An entity is a distinguishable real-world object that exists. An attribute describes an
elementary feature of an entity. In the relational data model, an entity is represented as a record
in an entity set, and a field represents an attribute. An entity can be of two types:
(1) Tangible entities are those entities which exist physically in the real world.
Examples: a person, a car, etc. (2) Intangible entities are those entities which exist only
logically and have no physical existence. Example: a bank account, etc.

Difference between Information and Data:


S.NO | DATA | INFORMATION
1 | Data are the variables that help to develop ideas/conclusions. | Information is meaningful data.
2 | Data are text and numerical values. | Information is a refined form of actual data.
3 | Data does not rely on information. | Information relies on data.
4 | Data is measured in bits and bytes. | Information is measured in meaningful units like time, quantity, etc.
5 | Data can easily be structured as: 1. tabular data 2. graphs 3. data trees. | Information can also be structured as: 1. language 2. ideas 3. thoughts.
6 | Data does not have any specific purpose. | Information carries a meaning that has been assigned by interpreting data.
7 | It is low-level knowledge. | It is the second level of knowledge.
8 | Data does not directly help in decision making. | Information directly helps in decision making.
9 | Data is a collection of facts, which by themselves have no meaning. | Information puts those facts into context.
10 | Example of data: a student's test score. | Example of information: the average score of a class, derived from the given data.

Data Type
A data type is a classification of data which tells the compiler or interpreter how the programmer
intends to use the data. Most programming languages support various types of data, including
integer, real, character or string and Boolean. Our program can deal with predefined data types or
user-defined types. It is of two types:
 Primitive Data Type / Built-In Data Type
Built-in data types are those data types that are pre-defined by the programming
language. These data types can be used directly in the program without the hassle of
creating them. Every programming language has its own specific set of built-in data
types. They are also called primary or primitive data types. int, char, float, double,
bool and void are the most basic and common built-in data types in almost every
programming language.
 Non-primitive Data Type
A user-defined data type that allows storing values of the same or different data types within one
entity. Examples of non-primitive data structures are the array, linked list and stack.

Abstract Data Type (ADT)


Abstract data types are entities that are defined by their data and operations but do not have
implementation details. In this case, we know the data that we are storing and the operations that
can be performed on the data, but we don't know about the implementation details.
An ADT does not specify how the data will be organized in memory or which algorithms will be used
to implement the operations. It is called "abstract" because it gives an implementation-
independent view. The process of providing only the essentials and hiding the details is known as
abstraction. Think of an ADT as a black box which hides the inner structure and design of the data
type. An ADT is built from primitive data types, but its operation logic is hidden. Some
examples of ADTs are the Stack, Queue, Linked List etc.
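As a minimal sketch (the class and method names are illustrative, not from the notes), a stack ADT exposes its operations while hiding the underlying storage:

```python
class Stack:
    """A stack ADT: callers see push/pop/peek, never the storage."""

    def __init__(self):
        self._items = []          # hidden implementation detail

    def push(self, value):
        self._items.append(value)

    def pop(self):
        if not self._items:
            raise IndexError("pop from empty stack")
        return self._items.pop()

    def peek(self):
        if not self._items:
            raise IndexError("peek at empty stack")
        return self._items[-1]

    def is_empty(self):
        return len(self._items) == 0


s = Stack()
s.push(10)
s.push(20)
print(s.pop())   # 20 (last in, first out)
print(s.peek())  # 10
```

The list inside could be swapped for a linked list without changing any caller, which is exactly the implementation independence an ADT promises.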

Data Structures
A data structure is a way of storing and organizing data on a computer so that it can be
accessed and updated efficiently. A data structure is not only used for organizing the data; it is
also used for processing, retrieving, and storing data. Formally, a data structure is a logical or
mathematical model of a particular organization of data.
The choice of a particular data structure depends on the requirements. It must be
able to represent the inherent relationships of the data in the real world, and it must be simple
enough to process efficiently as and when necessary.
There are several basic and advanced types of data structures, all designed to arrange
data to suit a specific purpose. It is not only important to use data structures, but it is also
important to choose the proper data structure for each task. Choosing an ill-suited data structure
could result in slow runtimes or unresponsive code.
The different models used to organize data in the main memory are collectively referred
to as data structures, whereas the different models used to organize data in the secondary
memory are collectively referred to as file structure.

Types of Data Structures- (based on traversing)


 Linear data structure: Data structure, in which data elements are arranged sequentially or
linearly and where each element is attached to its previous and next adjacent elements, is
called a linear data structure.
Examples: array, stack, queue, linked list etc.
 Non-linear data structure: Data structures, where data elements are not placed
sequentially or linearly, are called non-linear data structures. In a non-linear data structure,
we can’t traverse all the elements in a single run only.
Examples: trees and graphs.

Types of Data Structures (based on size)


 Static data structure: Static data structure has a fixed memory size. It is easier to access
the elements in a static data structure.
Example: array.
 Dynamic data structure: In dynamic data structure, the size is not fixed. It can be
randomly updated during the runtime which may be considered efficient concerning the
memory (space) complexity of the code.
Examples: queue, stack, linked list etc.

Basic Operations of Data Structure


 Creating
 Traversing
 Inserting
 Deleting
 Searching
 Sorting
 Updating
 Counting
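These basic operations can be illustrated on a Python list (an illustrative sketch only; a data-structures course would implement them on arrays, linked lists, and so on):

```python
# Basic operations sketched on a Python list (illustrative only)
data = [3, 1, 4]            # creating

for x in data:              # traversing
    print(x)

data.insert(1, 9)           # inserting  -> [3, 9, 1, 4]
data.remove(1)              # deleting the value 1 -> [3, 9, 4]
found = 4 in data           # searching  -> True
data.sort()                 # sorting    -> [3, 4, 9]
data[0] = 2                 # updating   -> [2, 4, 9]
count = len(data)           # counting   -> 3
```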

Special Operations of Data Structure


 Merging – Combining two or more sorted lists into a single sorted list
 Copying – Creating a duplicate of a list
 Concatenation – Combining two or more lists into a single larger list in a sequential
order.
 Splitting – Dividing a single list into two or more smaller lists
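The merging operation can be sketched as follows (a minimal illustration; the function name merge_sorted is ours, not from the notes):

```python
def merge_sorted(a, b):
    """Merge two sorted lists into a single sorted list."""
    result = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:            # take the smaller front element
            result.append(a[i])
            i += 1
        else:
            result.append(b[j])
            j += 1
    result.extend(a[i:])            # append whatever remains
    result.extend(b[j:])
    return result


print(merge_sorted([1, 3, 5], [2, 4, 6]))  # [1, 2, 3, 4, 5, 6]
```

For comparison, copying is `list(a)` and concatenation is `a + b`; splitting can be done with slicing, e.g. `a[:k]` and `a[k:]`.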

Intro and Definition of Algorithm


An algorithm is a finite set of instructions carried out in a specific order to perform a particular
task. Algorithms are generally developed independently of underlying languages, which means
that an algorithm can be implemented in more than one programming language. The scalability
and performance of an algorithm are the primary factors that contribute to its importance. When
you have a sizable real-world problem, you must break it down into small steps to analyze it
quickly. Real-world problems are often challenging to break down into smaller steps; if a problem
can be easily divided into smaller steps, that indicates the problem is feasible.

Difference between algorithm and program


Algorithm - We use an algorithm to generate a solution to any given problem in the form of
steps. When we use a computer to solve a given problem (in well-defined multiple steps),
we need to communicate these solution steps to the computer properly.
When we execute any algorithm on a computer, we need to combine various operations like
subtraction and addition for performing various complex operations of mathematics. We can
express algorithms using flowcharts, natural language, and many more.

Pseudocode - Pseudocode refers to a method using which we represent any algorithm for any
program. Pseudocode does not consist of any specific syntax like a programming language.
Thus, one cannot execute it on a computer. We can use several formats for writing pseudocode.
A majority of them take down the structures from the available languages, like FORTRAN, Lisp,
C, etc.
Often, we present an algorithm using pseudocode because any programmer familiar with
programming languages can easily read and understand it. Pseudocode can include the
usual control structures, such as repeat-until, if-then-else, while, for and case,
which are present in most high-level languages.
Note– A pseudocode is not equivalent to a real programming language.

Program - It refers to a set of instructions that a computer follows. No machine can
read a program directly, because it only understands machine code. Instead, you can
write a program in a computer language and then use a compiler or interpreter (for
compiling or interpreting) so that it becomes understandable to the computer system.

Factors of an Algorithm during designing an algorithm


 Modularity: An algorithm exhibits modularity if the given problem can be broken down into
small modules or small steps, which follows directly from the basic definition of an
algorithm.

 Correctness: An algorithm is correct when the given inputs produce the desired output,
indicating that the algorithm was designed correctly.

 Maintainability: It means that the algorithm should be designed in a straightforward,


structured way so that when you redefine the algorithm, no significant changes are made to
the algorithm.

 Functionality: It takes into account various logical steps to solve a real-world problem.
 Robustness: Robustness refers to an algorithm's ability to address the stated problem clearly.

 User-friendly: If the algorithm is difficult to understand, the designer will struggle to
explain it to the programmer.

 Simplicity: If an algorithm is simple, it is easy to understand.

 Extensibility: Your algorithm should be extensible if another algorithm designer or


programmer wants to use it.

Properties of algorithm
 Input: An algorithm requires some input values. An algorithm can take zero or more
inputs.

 Output: At the end of an algorithm, you will have one or more outcomes.

 Unambiguity: A perfect algorithm is defined as unambiguous, which means that its


instructions should be clear and straightforward.

 Finiteness: An algorithm must be finite. Finiteness in this context means that the algorithm
should have a limited number of instructions, i.e., the instructions should be countable.

 Effectiveness: Each instruction in an algorithm should be effective, i.e., basic enough to
be carried out, because each one affects the overall process.

 Language independence: An algorithm must be language-independent, which means that its


instructions can be implemented in any language and produce the same results.

Algorithm Design Techniques


Brute Force
Divide and conquer
Greedy
Dynamic Programming
Branch & Bound
Backtracking
Linear Programming

 Brute Force Algorithm


It is based on the problem’s statement and definitions of the concepts involved. This algorithm
uses the general logic structure to design an algorithm.
It is also called an exhaustive search algorithm because it exhausts all possibilities to provide the
required solution.
There are two kinds of such algorithms:
1. Optimizing: Finding all possible solutions to a problem and then selecting the best one; it
may terminate early if the value of the best possible solution is already known.
2. Satisficing: It stops as soon as a satisfactory solution is found.
Examples: Linear search, Selection sort, Bubble sort
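Linear search, the first brute-force example listed above, simply exhausts the possibilities one by one (a minimal sketch; the function name is illustrative):

```python
def linear_search(items, target):
    """Brute force: check every element in order until target is found."""
    for index, value in enumerate(items):
        if value == target:
            return index        # found: report the position
    return -1                   # exhausted all possibilities


print(linear_search([7, 2, 9, 4], 9))  # 2
```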
 Divide and Conquer
In the divide and conquer approach, the problem is divided into several small sub-problems.
Then the sub-problems are solved recursively and combined to get the solution of the original
problem.
The divide and conquer approach involves the following steps at each level:
Divide − The original problem is divided into subproblems.
Conquer − The sub-problems are solved recursively.
Combine − The solutions of the sub-problems are combined together to get the solution of the
original problem.
Examples: - Binary search, Quick sort, Merge sort, Tree traversal, Strassen’s Matrix
Multiplication etc.
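The three steps are visible in a minimal merge sort sketch (merge sort is one of the listed examples; this particular implementation is illustrative):

```python
def merge_sort(items):
    if len(items) <= 1:                 # base case: already sorted
        return items
    mid = len(items) // 2
    left = merge_sort(items[:mid])      # Divide + Conquer (recursive)
    right = merge_sort(items[mid:])
    merged = []                         # Combine: merge the two halves
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


print(merge_sort([5, 2, 8, 1, 4]))  # [1, 2, 4, 5, 8]
```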

 Greedy Algorithm
In the greedy approach, at each step, a decision is made to choose the local optimum, without
thinking about the future consequences. In each phase the currently best decision is made.
A greedy algorithm is very easy to apply to complex problems. At each step it decides which
choice looks most promising for the next step.
The algorithm is called greedy because, once the optimal solution to the smaller instance is
chosen, it does not reconsider the problem as a whole.
It is simple to set up and has a shorter execution time. However, there are very few cases where
it is the best solution.
Examples: - Dijkstra's algorithm (shortest path in weighted graphs), Prim's algorithm, Kruskal's
algorithm (minimal spanning tree in weighted graphs), Huffman Trees.
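A standard classroom illustration of the greedy approach (not among the examples listed above, and the coin set is an assumption) is making change by always taking the largest coin that fits:

```python
def greedy_change(amount, coins=(25, 10, 5, 1)):
    """Greedy change-making: always take the largest coin that fits.

    Optimal for canonical coin systems such as (25, 10, 5, 1), but
    greedy choices are NOT optimal for every possible coin set.
    """
    result = []
    for coin in coins:          # coins assumed sorted in descending order
        while amount >= coin:   # locally best choice, no lookahead
            amount -= coin
            result.append(coin)
    return result


print(greedy_change(63))  # [25, 25, 10, 1, 1, 1]
```

This shows both the strength (simple, fast) and the weakness (no reconsideration of earlier choices) mentioned above.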

 Dynamic Programming
A bottom-up technique in which the smallest sub-instances are explicitly solved first, and their
results are used to construct solutions to progressively larger sub-instances.
It improves the efficiency of the algorithm by storing intermediate results. It goes through five
steps to find the best solution to the problem:
1. It divides the problem into subproblems to find the best solution.
2. After breaking down the problem into subproblems, it finds the best solution from these
subproblems.
3. Memoization is the process of storing the results of subproblems.
4. Reuse the result to prevent it from being recomputed for the same subproblems.
5. Finally, it computes the complex program's output.
Unlike divide and conquer method, dynamic programming reuses the solution to the sub-
problems many times.
Example: Fibonacci Series computed by iteration.

Note: - The difference is that whenever we have recursive function calls with the same result,
instead of calling them again we try to store the result in a data structure in the form of a table
and retrieve the results from the table. Thus, the overall time complexity is reduced. “Dynamic”
means we dynamically decide whether to call a function or retrieve values from the table.
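The Fibonacci example can be sketched bottom-up, with a table storing each subproblem's result so it is computed only once (an illustrative implementation):

```python
def fib(n):
    """Bottom-up DP: solve the smallest subproblems first,
    store each result in a table, and reuse it."""
    if n <= 1:
        return n
    table = [0, 1]                              # smallest sub-instances
    for i in range(2, n + 1):
        table.append(table[i - 1] + table[i - 2])  # reuse stored results
    return table[n]


print(fib(10))  # 55
```

Without the table, a naive recursive version recomputes the same subproblems exponentially many times; the table reduces the work to O(n).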
 Branch and Bound Algorithm
The branch and bound algorithm is typically used for integer programming and similar
combinatorial optimization problems. This method divides the set of all feasible solutions into
smaller subsets, which are then evaluated further to find the best solution.
Example: Job sequencing, Travelling salesman problem.

 Backtracking
A backtracking algorithm tries each possibility until it finds the right one. It is a depth-first
search of the set of possible solutions. During the search, if an alternative doesn't work, the
algorithm backtracks to the choice point (the place which presented different alternatives) and
tries the next alternative.
Example: N-queen problem
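A compact sketch of the N-queen example follows (the set-based conflict checks are an implementation choice, not from the notes); returning from a failed branch is the "backtrack":

```python
def solve_n_queens(n):
    """Count the N-queens placements found by backtracking (DFS)."""
    solutions = 0

    def place(row, cols, diag1, diag2):
        nonlocal solutions
        if row == n:                    # all queens placed: a solution
            solutions += 1
            return
        for col in range(n):
            if col in cols or (row - col) in diag1 or (row + col) in diag2:
                continue                # conflict: try the next column
            # try this choice, then explore the next row
            place(row + 1, cols | {col},
                  diag1 | {row - col}, diag2 | {row + col})
            # returning here backtracks to this choice point

    place(0, set(), set(), set())
    return solutions


print(solve_n_queens(4))  # 2
```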

 Linear Programming
In linear programming, the goal is to maximize or minimize some linear function of the inputs,
subject to linear inequality constraints on those inputs.

The Analysis / Complexity of an Algorithm


Each of our algorithms will involve a particular data structure, depending on the type of data and
the frequency with which various data operations are applied. The complexity of an algorithm is a
function of the size of the input of a given problem instance, which determines how much running
time and memory space the algorithm needs in order to run to completion. An algorithm's
performance can be measured in two ways:

Time Complexity
The amount of time required to complete an algorithm's execution is called its time complexity.
Time complexity is expressed using big O notation, an asymptotic notation. It is calculated
primarily by counting the number of key steps (basic operations or comparisons) required to
complete the execution. Let us look at an example of time complexity.

# Suppose you have to calculate the product of the numbers 1 to n.
def multiply_upto(n):
    mul = 1
    for i in range(1, n + 1):   # the loop body runs n times
        mul = mul * i
    # when the loop ends, mul holds the product of the n numbers
    return mul

The time complexity of the loop in the preceding code is at least n, and as the value of
n grows, so does the time complexity. The return statement, by contrast, has constant
complexity, because it does not depend on the value of n and completes in a single step.
The worst-case time complexity is generally considered because it is the maximum time
required for any given input size.
There are three cases of time complexity:
Worst Case
Average Case
Best Case

Example – Linear Search


Worst Case – C(n) = n

where C(n) is the number of comparisons & n is the input size

Average Case – C(n) = 1·(1/n) + 2·(1/n) + … + n·(1/n)

= (1 + 2 + … + n)·(1/n)

= [n(n+1)/2]·(1/n) = (n + 1)/2, i.e. approximately n/2

where each of the positions 1 to n occurs with probability p = 1/n

Best Case – C(n) = 1
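The three cases can be observed by counting comparisons directly (an illustrative sketch; the data values are arbitrary):

```python
def linear_search_count(items, target):
    """Linear search that also reports how many comparisons it made."""
    comparisons = 0
    for index, value in enumerate(items):
        comparisons += 1
        if value == target:
            return index, comparisons
    return -1, comparisons


data = [4, 8, 15, 16, 23]
print(linear_search_count(data, 4))    # best case:  (0, 1)  -> C(n) = 1
print(linear_search_count(data, 23))   # worst case: (4, 5)  -> C(n) = n
```

A target absent from the list also costs n comparisons, which is why the worst case is the usual measure.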

Space Complexity
The amount of space an algorithm requires to solve a problem and produce an output is called its
space complexity. Space complexity, like time complexity, is expressed in big O notation.
The space is required by an algorithm for the following reasons:
1. To store program instructions.
2. To keep track of constant values.
3. To keep track of variable values.
4. To keep track of function calls, jumping statements, and so on.
Space Complexity = Auxiliary Space + Input Size

Time Space Trade-off


The best algorithm to solve a given problem is one that requires less memory space and less
time to run to completion. But in practice, it is not always possible to obtain both of these
objectives.
One algorithm may require less memory space but may take more time to complete its
execution. On the other hand, the other algorithm may require more memory space but may take
less time to run to completion.
Thus, we have to sacrifice one at the cost of the other. In other words, there is a space-time
trade-off between algorithms, i.e. by increasing the space, one can reduce the time, or vice-versa.
Order of Growth
An order of growth is a set of functions whose asymptotic growth behavior is considered
equivalent.
Example: 2n, 100n and n+1 belong to the same order of growth, which is written O(n) in Big-
Oh notation and often called linear, because every function in the set grows linearly with n.
All functions with the leading term n^2 belong to O(n^2); they are quadratic, which is a fancy
word for functions with the leading term n^2.
The following table shows some of the orders of growth that appear most commonly in
algorithmic analysis, in increasing order of badness.
The following list explains some of the most common big Oh notations:

O(1) constant: the operation doesn't depend on the size of its input, e.g. adding a node to the tail
of a linked list where we always maintain a pointer to the tail node.

O(log n) logarithmic: normally associated with algorithms that break the problem into smaller
chunks per each invocation, e.g. searching a binary search tree.

O(n) linear: the run time complexity is proportionate to the size of n.

O(n log n) just "n log n": usually associated with an algorithm that breaks the problem into smaller
chunks per each invocation, and then takes the results of these smaller chunks and stitches them
back together, e.g. merge sort.

O(n^2) quadratic: e.g. bubble sort.

O(n^3) cubic: very rare, e.g. naive matrix multiplication.

O(2^n) exponential: incredibly rare, e.g. generating all subsets of a set.


Note: - An algorithm with a quadratic run time grows faster than one with a logarithmic run
time.

Type        | Notation   | Example Algorithms
Logarithmic | O(log n)   | Binary Search
Linear      | O(n)       | Linear Search
Superlinear | O(n log n) | Heap Sort, Merge Sort
Polynomial  | O(n^c)     | Strassen's Matrix Multiplication, Bubble Sort, Selection Sort, Insertion Sort, Bucket Sort
Exponential | O(c^n)     | Tower of Hanoi
Factorial   | O(n!)      | Traveling Salesman Problem, Permutations of an array
The study of the variations in the performance of the algorithm with the change in the order of
the input size is called Asymptotic Analysis.
Asymptotic notations are mathematical notations to describe the running time of an algorithm
when the input tends towards a particular value.

There are mainly three asymptotic notations for the complexity analysis of algorithms.
Big-Oh Notation (O)
Big-Omega Notation (Ω)
Big-Theta Notation (Θ)

Big-Oh Notation (O)


The Big-Oh Notation is the formal way to express the upper bound of an algorithm's running
time.
It measures the worst case time complexity or the longest amount of time an algorithm can
possibly take to complete.
Let f be a non-negative function. We say that f(n) is Big-O of g(n), written as:
f(n) = O(g(n)), iff there are positive constants c and n0 such that
0 ≤ f(n) ≤ c*g(n) for all n ≥ n0
If f(n) = O(g(n)), we say that g(n) is an upper bound on f(n)
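As a numeric sanity check of this definition (the constants c = 5 and n0 = 1 are chosen for illustration), take f(n) = 2n + 3 and g(n) = n:

```python
# Verify the Big-O definition for f(n) = 2n + 3 and g(n) = n:
# with c = 5 and n0 = 1, the inequality 0 <= 2n + 3 <= 5n
# holds for all n >= 1, so f(n) = O(n).
def f(n):
    return 2 * n + 3

def g(n):
    return n

c, n0 = 5, 1
assert all(0 <= f(n) <= c * g(n) for n in range(n0, 1000))
print("f(n) = 2n + 3 is O(n) with c = 5, n0 = 1")
```

Algebraically, 2n + 3 <= 5n simplifies to 3 <= 3n, which is true exactly when n >= 1, matching the choice n0 = 1.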

Big-Omega Notation (Ω)

The Big-Omega Notation is the formal way to express the lower bound of an algorithm's
running time.
It measures the best case time complexity or the shortest amount of time an algorithm can
possibly take to complete.
Let f be a non-negative function. f(n) is said to be Big-Omega of g(n), written as:
f(n) = Ω(g(n)), iff there are positive constants c and n0 such that
0 ≤ c*g(n) ≤ f(n) for all n ≥ n0
If f(n) = Ω(g(n)), we say that g(n) is a lower bound on f(n).
Big-Theta Notation (Θ)
The Theta Notation is the formal way to express the lower and upper bound of an algorithm's
running time.
Let f be a non-negative function. We say that f(n) is Theta of g(n), written as:
f(n) = Θ(g(n)), iff there are positive constants c1, c2 and n0 such that
0 ≤ c1*g(n) ≤ f(n) ≤ c2*g(n) for all n ≥ n0
Equivalently, f(n) = Θ(g(n)) if and only if f(n) = O(g(n)) and f(n) = Ω(g(n)).

Steps to Determine Big O Notation:


1. Identify the Dominant Term:
 Examine the function and identify the term with the highest order of growth as the input
size increases.
 Ignore any constant factors or lower-order terms.
2. Determine the Order of Growth:
 The order of growth of the dominant term determines the Big O notation.
3. Write the Big O Notation:
 The Big O notation is written as O(f(n)), where f(n) represents the dominant term.
 For example, if the dominant term is n^2, the Big O notation would be O(n^2).
4. Simplify the Notation (Optional):
 In some cases, the Big O notation can be simplified by removing constant factors or by
using a more concise notation.
 For instance, O(2n) can be simplified to O(n).
Example:
Function: f(n) = 3n^3 + 2n^2 + 5n + 1
1. Dominant Term: 3n^3
2. Order of Growth: Cubic (n^3)
3. Big O Notation: O(n^3)
4. Simplified Notation: O(n^3)
