BMC205 DSAA Unit1 Intro Notes
Entity
An entity is a distinguishable real-world object that exists. An attribute describes an
elementary feature of an entity. In the relational data model, an entity is represented as a record
in an entity set, and a field represents an attribute. An entity can be of two types: (1) Tangible
entities, which exist physically in the real world. Example: person, car, etc. (2) Intangible
entities, which exist only logically and have no physical existence. Example: bank account, etc.
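As an illustrative sketch (the names are assumptions, not from the notes), a tangible entity such as Person can be modelled in C as a record whose fields are its attributes:

#include <stdio.h>

/* A "Person" entity: each field is one attribute of the entity. */
struct Person {
    char name[50];   /* attribute: name */
    int  age;        /* attribute: age */
    char city[50];   /* attribute: city */
};

int main(void) {
    /* One record in the entity set "Person". */
    struct Person p = {"Asha", 21, "Bhopal"};
    printf("%s, %d, %s\n", p.name, p.age, p.city);
    return 0;
}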
Data vs. Information
Data are raw text and numerical values, whereas information is the refined form of actual data.
Data are measured in bits and bytes, whereas information is measured in meaningful units like time, quantity, etc.
Data can be structured as tabular data, graphs, or data trees, whereas information can be structured as language, ideas, and thoughts.
Data does not directly help in decision making, whereas information directly helps in decision making.
Data Type
A data type is a classification of data that tells the compiler or interpreter how the programmer
intends to use the data. Most programming languages support various types of data, including
integer, real, character or string, and Boolean. A program can work with predefined data types or
with user-defined types. Data types are of two kinds:
Primitive Data Type / Built-In Data Type
Built-in data types are those data types that are pre-defined by the programming
language. They can be used directly in a program without the hassle of creating them, and
every programming language has its own specific set of them. They are also called primary
or primitive data types. int, char, float, double, Boolean and void are the most basic and
common built-in data types in almost every programming language.
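For illustration, a minimal C snippet that declares the common built-in types named above (the values are arbitrary):

#include <stdio.h>
#include <stdbool.h>   /* Boolean support in C99 and later */

int main(void) {
    int    count = 42;        /* integer */
    char   grade = 'A';       /* character */
    float  ratio = 0.5f;      /* single-precision real */
    double pi    = 3.14159;   /* double-precision real */
    bool   ok    = true;      /* Boolean */
    printf("%d %c %.1f %.5f %d\n", count, grade, ratio, pi, ok);
    return 0;
}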
Non-primitive Data Type
A user-defined data type allows storing values of the same or different data types within one
entity. Examples of non-primitive data structures are arrays, linked lists and stacks.
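A minimal illustrative sketch of two non-primitive types in C: an array (values of the same type in one entity) and a user-defined linked-list node (values of different types in one entity):

#include <stdio.h>

/* User-defined type: a linked-list node grouping an int and a pointer. */
struct Node {
    int data;
    struct Node *next;
};

int main(void) {
    int marks[3] = {70, 85, 90};        /* array: three values, one entity */
    struct Node second = {2, NULL};
    struct Node first  = {1, &second};  /* a two-node linked list */

    printf("%d %d %d\n", marks[0], marks[1], marks[2]);
    for (struct Node *p = &first; p != NULL; p = p->next)
        printf("node: %d\n", p->data);
    return 0;
}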
Data Structures
A data structure is a storage that is used to store and organize data. It is a way of arranging
data on a computer so that it can be accessed and updated efficiently. A data structure is not
only used for organizing the data. It is also used for processing, retrieving, and storing data.
Data structure is a logical or mathematical model of a particular organization of data.
The choice of a particular data structure depends on the requirements of the problem. It must be
able to represent the inherent relationships of the data in the real world, and it must be simple
enough that the data can be processed efficiently as and when necessary.
There are several basic and advanced types of data structures, all designed to arrange
data to suit a specific purpose. It is not only important to use data structures, but it is also
important to choose the proper data structure for each task. Choosing an ill-suited data structure
could result in slow runtimes or unresponsive code.
The different models used to organize data in the main memory are collectively referred
to as data structures, whereas the different models used to organize data in the secondary
memory are collectively referred to as file structure.
Pseudocode - Pseudocode is a method of representing an algorithm for a program. Pseudocode
does not follow the specific syntax of any programming language, so it cannot be executed on a
computer. Several formats can be used for writing pseudocode; most of them borrow their
structures from existing languages, like FORTRAN, Lisp, C, etc.
Algorithms are often presented in pseudocode because any programmer who is familiar with
programming languages can easily read and understand it. Pseudocode can include the usual
high-level control structures, such as repeat-until, if-then-else, while, for and case.
Note - Pseudocode is not equivalent to a real programming language.
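For example, an illustrative pseudocode sketch that finds the largest of n numbers stored in A[1..n] (the keywords are one possible choice, since pseudocode has no fixed syntax):

max = A[1]
for i = 2 to n
    if A[i] > max then
        max = A[i]
return max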
Characteristics of an algorithm
Correctness: An algorithm is correct when the given inputs produce the desired output,
indicating that the algorithm was designed correctly.
Functionality: It takes into account various logical steps to solve a real-world problem.
Robustness: Robustness refers to an algorithm's ability to define the problem clearly.
User-friendly: An algorithm should be easy to understand, so that the designer can explain it
to the programmer.
Properties of algorithm
Input: An algorithm requires some input values; it can take zero or more inputs.
Output: At the end of an algorithm, you will have one or more outcomes.
Finiteness: An algorithm must be finite. Finiteness in this context means that the algorithm
should have a limited, countable number of instructions and should terminate.
Effectiveness: Each instruction should be basic enough to be carried out exactly, because every
instruction in an algorithm affects the overall process.
Greedy Algorithm
In the greedy approach, at each step a decision is made to choose the local optimum, without
thinking about the future consequences; in each phase the currently best decision is made.
A greedy algorithm is very easy to apply to complex problems: at every step it simply picks the
choice that looks best at that moment.
The algorithm is called greedy because it commits to the optimal solution of the smaller instance
at hand and never reconsiders that choice with respect to the problem as a whole.
It is simple to set up and has a short execution time. However, there are very few problems for
which it is guaranteed to produce the best solution.
Examples: - Dijkstra's algorithm (shortest path in weighted graphs), Prim's algorithm, Kruskal's
algorithm (minimal spanning tree in weighted graphs), Huffman Trees.
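As a small illustration (the coin system and names are assumptions, not from the notes), a greedy coin-change routine in C: at each step it takes the largest coin that still fits, which happens to be optimal for canonical coin systems such as {25, 10, 5, 1}.

#include <stdio.h>

/* Greedy coin change: repeatedly take the largest coin that fits.
   The coins array must be sorted in descending order. */
void greedy_change(int amount, const int coins[], int ncoins) {
    for (int i = 0; i < ncoins; i++) {
        int used = amount / coins[i];   /* local optimum: as many as possible */
        amount %= coins[i];
        if (used > 0)
            printf("%d x %d\n", used, coins[i]);
    }
}

int main(void) {
    int coins[] = {25, 10, 5, 1};
    greedy_change(67, coins, 4);   /* prints 2 x 25, 1 x 10, 1 x 5, 2 x 1 */
    return 0;
}

Note that for a non-canonical system such as {4, 3, 1} and amount 6, this greedy choice gives 4 + 1 + 1 (three coins) while the optimum is 3 + 3 (two coins), which is exactly why greedy algorithms are not always the best solution.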
Dynamic Programming
Dynamic programming is a bottom-up technique in which the smallest sub-instances are
explicitly solved first, and their results are used to construct solutions to progressively larger
sub-instances.
It improves the efficiency of the algorithm by storing intermediate results. It goes through five
steps to find the best solution to the problem:
1. It divides the problem into subproblems to find the best solution.
2. After breaking down the problem into subproblems, it finds the best solution from these
subproblems.
3. Memoization is the process of storing the results of subproblems.
4. Reuse the result to prevent it from being recomputed for the same subproblems.
5. Finally, it computes the complex program's output.
Unlike divide and conquer method, dynamic programming reuses the solution to the sub-
problems many times.
Example: Fibonacci Series computed by iteration.
Note: - The difference is that whenever recursive function calls would recompute the same
result, instead of calling them again we store the result in a data structure in the form of a table
and retrieve it from there. Thus, the overall time complexity is reduced. “Dynamic” means we
dynamically decide whether to call a function or retrieve the value from the table.
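Below is a minimal C sketch (illustrative, not from the notes) of the iterative Fibonacci example: the bottom-up loop stores every intermediate result in a table and reuses it, so each subproblem is solved exactly once and the running time is O(n).

#include <stdio.h>

#define MAXN 90   /* fib(90) still fits in a long long */

/* Bottom-up dynamic programming: each table entry is built from the
   two already-solved sub-instances below it. */
long long fib(int n) {
    long long table[MAXN + 1];
    table[0] = 0;
    table[1] = 1;
    for (int i = 2; i <= n; i++)
        table[i] = table[i - 1] + table[i - 2];   /* reuse stored results */
    return table[n];
}

int main(void) {
    printf("fib(10) = %lld\n", fib(10));   /* prints 55 */
    return 0;
}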
Branch and Bound Algorithm
The branch and bound algorithm is used for integer programming problems. The method
divides the set of all feasible solutions into smaller subsets, which are then evaluated further to
find the best solution.
Example: Job sequencing, Travelling salesman problem.
Backtracking
A backtracking algorithm tries each possibility until it finds the right one. It is a depth-first
search of the set of possible solutions. During the search, if an alternative doesn't work, the
algorithm backtracks to the choice point, the place that presented the different alternatives, and
tries the next alternative.
Example: N-queen problem
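A compact, illustrative C sketch of the N-queen problem (the board size and names are assumptions): one queen is placed per row, and when no column in the current row is safe, the recursion returns and the previous row tries its next alternative.

#include <stdio.h>
#include <stdlib.h>

#define N 8

int col[N];   /* col[r] = column of the queen placed in row r */

/* Safe to put a queen at (row, c), given rows 0..row-1 are placed? */
int safe(int row, int c) {
    for (int r = 0; r < row; r++)
        if (col[r] == c || abs(col[r] - c) == row - r)   /* column or diagonal clash */
            return 0;
    return 1;
}

/* Try every column in this row; returning 0 triggers backtracking. */
int solve(int row) {
    if (row == N) return 1;                 /* all queens placed */
    for (int c = 0; c < N; c++)
        if (safe(row, c)) {
            col[row] = c;                   /* choose this alternative */
            if (solve(row + 1)) return 1;   /* explore deeper (depth-first) */
        }                                   /* else try the next column */
    return 0;                               /* no alternative worked: backtrack */
}

int main(void) {
    if (solve(0))
        for (int r = 0; r < N; r++)
            printf("row %d -> column %d\n", r, col[r]);
    return 0;
}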
Linear Programming
In linear programming, the constraints are linear inequalities over the inputs, and the goal is to
maximize or minimize some linear function of the inputs.
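For instance, a small illustrative linear program over two inputs x and y (the numbers are arbitrary):

Maximize 3x + 2y
subject to x + y ≤ 4, x + 3y ≤ 6, x ≥ 0, y ≥ 0.

The optimum of a linear program is attained at a vertex of the feasible region; here the vertices are (0, 0), (4, 0), (3, 1) and (0, 2), and the maximum value 12 is attained at (4, 0).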
Time Complexity
The amount of time required to complete an algorithm's execution is called its time complexity,
and it is represented using big O notation, the asymptotic notation for describing time
complexity. Time complexity is calculated primarily by counting the number of steps (key
operations or comparisons) required to complete the execution. Let us look at an example of
time complexity.
// Suppose you have to calculate the multiplication of n numbers.
long long multiply(int n) {
    long long mul = 1;
    for (int i = 1; i <= n; i++)   // the loop body runs n times
        mul = mul * i;
    // when the loop ends, mul holds the multiplication of the n numbers
    return mul;
}
The loop in the preceding code runs n times, so the time complexity of the loop statement is
proportional to n, and as the value of n escalates, so does the time taken. The return statement,
by contrast, takes constant time, because its cost does not depend on the value of n: it delivers
the result in a single step. The worst-case time complexity is generally the one considered,
because it is the maximum time required for any given input size.
There are three cases of time complexity:
Worst Case: the maximum time taken over all inputs of a given size.
Average Case: the expected time taken over all inputs of a given size.
Best Case: the minimum time taken over all inputs of a given size.
For example, in a linear search over n elements, finding the key at position i costs i comparisons,
so the average number of comparisons is
(1 + 2 + …… + n).(1/n) = (n + 1)/2,
which still grows linearly, i.e. O(n).
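A linear search in C makes these cases concrete (an illustrative sketch, not from the notes): the best case is one comparison, the worst case is n comparisons, and the average case, as computed above, is about (n + 1)/2 comparisons.

#include <stdio.h>

/* Returns the index of key in a[0..n-1], or -1 if absent. */
int linear_search(const int a[], int n, int key) {
    for (int i = 0; i < n; i++)
        if (a[i] == key)
            return i;    /* best case: key at a[0], one comparison */
    return -1;           /* worst case: key absent, n comparisons */
}

int main(void) {
    int a[] = {7, 3, 9, 1, 5};
    printf("%d\n", linear_search(a, 5, 7));   /* best case: index 0 */
    printf("%d\n", linear_search(a, 5, 8));   /* worst case: -1 */
    return 0;
}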
Space Complexity
The amount of space an algorithm requires to solve a problem and produce an output is called its
space complexity. Space complexity, like time complexity, is expressed in big O notation.
The space is required by an algorithm for the following reasons:
1. To store program instructions.
2. To keep track of constant values.
3. To keep track of variable values.
4. To keep track of function calls, jumping statements, and so on.
Space Complexity = Auxiliary Space + Input Size
O(1) constant: the operation doesn't depend on the size of its input, e.g. adding a node to the tail
of a linked list where we always maintain a pointer to the tail node.
O(log n) logarithmic: normally associated with algorithms that discard a large fraction of the
problem on each invocation, e.g. searching a binary search tree.
O(n) linear: the running time grows in direct proportion to the size of the input, e.g. traversing a
linked list.
O(n log n) just n log n: usually associated with an algorithm that breaks the problem into smaller
chunks per each invocation, and then takes the results of these smaller chunks and stitches them
back together, e.g. merge sort (quick sort also achieves this on average).
There are mainly three asymptotic notations for the complexity analysis of algorithms.
Big-Oh Notation (O)
Big-Omega Notation (Ω)
Big-Theta Notation (Θ)
Big-Oh Notation (O)
The Big-Oh Notation is the formal way to express the upper bound of an algorithm's running
time. It measures the worst case time complexity, or the longest amount of time an algorithm
can possibly take to complete.
Let f be a non-negative function. f(n) is said to be Big-Oh of g(n), written as:
f(n) = O(g(n)), iff there are positive constants c and n0 such that
0 ≤ f(n) ≤ c*g(n) for all n ≥ n0
If f(n) = O(g(n)), we say that g(n) is an upper bound on f(n).
Big-Omega Notation (Ω)
The Big-Omega Notation is the formal way to express the lower bound of an algorithm's
running time.
It measures the best case time complexity or the shortest amount of time an algorithm can
possibly take to complete.
Let f be a non-negative function. f(n) is said to be Big-Omega of g(n), written as:
f(n) = Ω(g(n)), iff there are positive constants c and n0 such that
0 ≤ c*g(n) ≤ f(n) for all n ≥ n0
If f(n) = Ω(g(n)), we say that g(n) is a lower bound on f(n).
Theta Notation (Θ)
The Theta Notation is the formal way to express the lower and upper bound of an algorithm's
running time.
Let f be a non-negative function. We say that f(n) is Theta of g(n), written as:
f(n) = Θ(g(n)), iff there are positive constants c1, c2 and n0 such that
0 ≤ c1*g(n) ≤ f(n) ≤ c2*g(n) for all n ≥ n0
Equivalently, f(n) = Θ(g(n)) if and only if f(n) = O(g(n)) and f(n) = Ω(g(n)).
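As a worked check of the definition (an illustrative example, not from the notes), take f(n) = 3n^2 + 5n and g(n) = n^2. Since 5n ≤ 5n^2 for all n ≥ 1,
0 ≤ 3n^2 ≤ 3n^2 + 5n ≤ 8n^2 for all n ≥ 1,
so the constants c1 = 3, c2 = 8 and n0 = 1 satisfy the definition, and therefore f(n) = Θ(n^2).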