Unit III
• Test:
– Ensure the implementation meets the requirements.
• Maintenance:
– Integrate feedback from users, fix bugs, and ensure
compatibility across different versions.
3: Algorithm Analysis
• Examples of primitive operations:
– Evaluating an expression
– Assigning a value to a variable
– Calling a method
– Returning from a method
Pseudocode
High-level description of an algorithm.

Algorithm arrayMax(A, n)
    Input: array A of n integers
    Output: maximum element of A
    Max ← A[0]
    for i ← 1 to n − 1 do
        if A[i] > Max then
            Max ← A[i]
    return Max
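The pseudocode translates directly into runnable Python; a minimal sketch (ours), assuming a non-empty list:

def array_max(A):
    max_val = A[0]              # start with the first element
    for i in range(1, len(A)):  # scan the remaining n - 1 elements
        if A[i] > max_val:
            max_val = A[i]      # remember the largest value seen so far
    return max_val

print(array_max([3, 1, 4, 1, 5, 9, 2, 6]))  # prints 9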
Analysis of insertion sort
The running time of an algorithm on a particular
input is the number of primitive operations or
“steps” executed.
Let us adopt the following view:
A constant amount of time is required to execute
each line of our pseudocode.
The running time of the algorithm is the sum of
running times for each statement executed.
Worst-case: the longest running time for any input of size n.
Best-case: the smallest running time for any input of size n.
Insertion Sort Algorithm
INSERTION-SORT(A)
    for j = 2 to A.length
        key = A[j]
        // Insert A[j] into the sorted sequence A[1…j-1]
        i = j - 1
        while i > 0 and A[i] > key
            A[i + 1] = A[i]
            i = i - 1
        A[i + 1] = key
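A runnable 0-indexed Python translation (ours) of the 1-indexed pseudocode above:

def insertion_sort(A):
    for j in range(1, len(A)):
        key = A[j]
        i = j - 1
        # Shift elements of the sorted prefix A[0..j-1] that exceed key.
        while i >= 0 and A[i] > key:
            A[i + 1] = A[i]
            i -= 1
        A[i + 1] = key          # drop key into its correct slot
    return A

print(insertion_sort([5, 2, 4, 6, 1, 3]))  # [1, 2, 3, 4, 5, 6]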
Problem: Suppose there are 60 students in the class. How will
you calculate the number of absentees in the class?
Algorithmic Approach:
1. count ← 0, absent ← 0, total ← 60
2. REPEAT until all present students are counted:
       count ← count + 1
3. absent ← total − count
4. Print "Number absent is:", absent
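A direct Python rendering (ours) of these steps, with a hypothetical roster standing in for the physical head count:

total = 60
count = 0
present_students = ["s1", "s2", "s3"]   # hypothetical roster of students actually present
for _ in present_students:              # REPEAT until all present students are counted
    count += 1
absent = total - count
print("Number absent is:", absent)      # 57 for this hypothetical roster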
Need for Algorithm Analysis
• To analyze algorithms:
– First, we start to count the number of significant
operations in a particular solution to assess its
efficiency.
– Then, we will express the efficiency of algorithms using
growth functions.
PERFORMANCE ANALYSIS
Performance analysis: an algorithm is said to be efficient and fast
if it takes less time to execute and consumes less memory space at run time.
The study of these two costs is called performance analysis.
1. SPACE COMPLEXITY:
The space complexity of an algorithm is the amount of memory
space required by the algorithm during the course of its execution.
There are three components of space:
a) Instruction space: space for the executable program.
b) Data space: space required to store all constant and variable data.
c) Environment stack space: space required to store the environment
information needed to resume a suspended function.
2. TIME COMPLEXITY:
The time complexity of an algorithm is the total amount of time
required by the algorithm to complete its execution.
Statement                    Steps/execution   Frequency   Total
1  Algorithm Sum(a, n)       0                 –           0
2  {                         0                 –           0
3      s := 0;               1                 1           1
4      for i := 1 to n do    1                 n + 1       n + 1
5          s := s + a[i];    1                 n           n
6      return s;             1                 1           1
7  }                         0                 –           0
   Total                                                   2n + 3
The Execution Time of Algorithms
A sequence of two operations, with costs c1 and c2:
Total Cost = c1 + c2
The Execution Time of Algorithms (cont.)
Example: Nested Loop
                              Cost   Times
i = 1;                        c1     1
sum = 0;                      c2     1
while (i <= n) {              c3     n + 1
    j = 1;                    c4     n
    while (j <= n) {          c5     n * (n + 1)
        sum = sum + i;        c6     n * n
        j = j + 1;            c7     n * n
    }
    i = i + 1;                c8     n
}
Total Cost = c1 + c2 + (n+1)*c3 + n*c4 + n*(n+1)*c5 + n*n*c6 + n*n*c7 + n*c8
The time required for this algorithm is proportional to n².
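A small Python check (ours) that counts executions of the innermost statement, confirming the n*n term that dominates the total cost:

def nested_loop_cost(n):
    steps = 0
    i = 1
    while i <= n:
        j = 1
        while j <= n:
            steps += 1   # corresponds to the c6 line: sum = sum + i
            j += 1
        i += 1
    return steps

for n in (10, 100, 1000):
    print(n, nested_loop_cost(n))   # prints n and n*n: 100, 10000, 1000000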
PROOF METHODS
2. Indirect proof
The implication P → Q is logically equivalent to the
contrapositive implication ¬Q → ¬P.
Example: to prove "if n² is an even integer, then n is an even integer",
prove the contrapositive "if n is odd, then n² is odd".
Example - direct proof
Proof:
Let p: "n is an odd integer" and q: "n² is odd"; we want to show that p → q.

Proof strategy (by contraposition):
(a + b ≥ 15) → (a ≥ 8) ∨ (b ≥ 8)
(Assume ¬q) Suppose (a < 8) ∧ (b < 8).
(Note that the negation of the conclusion is easier to start with here.)
(Show ¬p) Then (a ≤ 7) ∧ (b ≤ 7),
and (a + b) ≤ 14,
and (a + b) < 15.
QED
3. Proof by contradiction
To prove that the statement P → Q is true using this
method, we start by assuming that P is true but Q is
false. If this assumption leads to a contradiction, it means
that our assumption that “Q is false" must be wrong, and
hence Q must follow from P.
Example: Suppose a ∈ ℤ. If a² is even, then a is even.
Proof by Contradiction
• A – We want to prove p.
• We show that:
(1) ¬p → F (i.e., a false statement, say r ∧ ¬r)
(2) We conclude that ¬p is false since (1) is true, and
therefore p is true.
• B – We want to show p → q:
(1) Assume the negation of the conclusion, i.e., ¬q.
(2) Show that (p ∧ ¬q) → F.
(3) Since ((p ∧ ¬q) → F) ≡ (p → q), this proves p → q. (Why?)
4. Proof by counterexample
This method provides quick evidence that a
postulated statement is false.
When faced with a problem that requires proving or
disproving a given assertion, we may start by trying
to disprove the assertion with a counterexample.
For example, the assertion "every prime number is odd"
is disproved by the single counterexample 2.
Big O Notation
• The Big O notation defines an upper bound on an
algorithm: it bounds a function only from above.
• For a given function f(n), we say f(n) = O(g(n)) when
there exist positive constants c and n0
such that 0 ≤ f(n) ≤ c·g(n) for all n ≥ n0.
Big Ω Notation
• Just as Big O notation provides an asymptotic upper bound on a
function, Ω notation provides an asymptotic lower bound. It
can be useful when we have a lower bound on the time complexity of
an algorithm.
• Because the best-case performance of an algorithm is generally not
useful, Ω notation is the least used of the three notations.
• For a given function f(n), we say f(n) = Ω(g(n)) when
there exist positive constants c and n0 such that 0 ≤ c·g(n) ≤
f(n) for all n ≥ n0.
Big Θ Notation
• The Θ notation bounds a function from above
and below, so it defines exact asymptotic behavior.
• For a given function f(n), we say f(n) = Θ(g(n))
when there exist positive constants c1, c2,
and n0 such that 0 ≤ c1·g(n) ≤ f(n) ≤ c2·g(n) for
all n ≥ n0.
• When we use big-Θ notation, we are saying that we
have an asymptotically tight bound on the running
time.
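A worked example (ours, not from the slides), applying all three definitions to f(n) = 3n² + 2n:
3n² + 2n ≤ 5n² for all n ≥ 1, so f(n) = O(n²) with c = 5, n0 = 1;
3n² + 2n ≥ 3n² for all n ≥ 1, so f(n) = Ω(n²) with c = 3, n0 = 1;
combining the two, f(n) = Θ(n²) with c1 = 3, c2 = 5, n0 = 1.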
Common time complexities
(listed from better to worse)
• O(1) constant time
• O(log n) logarithmic time
• O(n) linear time
• O(n log n) log-linear time
• O(n²) quadratic time
• O(n³) cubic time
• O(2ⁿ) exponential time
LITTLE O NOTATION
• For a given function f(n), we say f(n) = o(g(n))
when for any positive constant c there exists
a positive constant n0 such that 0 ≤ f(n) < c·g(n) for all
n ≥ n0.
• Little-o is an upper bound, but it is not
an asymptotically tight bound.
• The main difference between O-notation and o-
notation is that in f(n) = O(g(n)) the bound 0 ≤ f(n) ≤ c·g(n)
holds for some constant c > 0, but in f(n) = o(g(n))
the bound 0 ≤ f(n) < c·g(n) holds for all
constants c > 0.
• Example:
– 2n = o(n²), but 2n² ≠ o(n²).
• Intuitively, in o-notation the function f(n) becomes
insignificant relative to g(n) as n approaches infinity;
that is, lim (n→∞) f(n)/g(n) = 0.
Little-Omega: ω Notation
• By analogy with Ω and little-o, f(n) = ω(g(n)) when for any
positive constant c there exists a positive constant n0 such that
0 ≤ c·g(n) < f(n) for all n ≥ n0.
• ω gives a lower bound that is not asymptotically tight.
Example: n² = ω(n), but n² ≠ ω(n²).
Exercise: Give a big-Oh characterization
Algorithm Ex1(A, n)
    Input: an array A of n integers
    Output: the sum of the elements in A
    s ← A[0]
    for i ← 1 to n − 1 do
        s ← s + A[i]
    return s
Exercise: Give a big-Oh characterization
Algorithm Ex2(A, n)
    Input: an array A of n integers
    Output: the sum of the elements at even cells in A
    s ← A[0]
    for i ← 2 to n − 1 by increments of 2 do
        s ← s + A[i]
    return s
Exercise: Give a big-Oh characterization
Algorithm Ex3(A, n)
    Input: an array A of n integers
    Output: the sum of the prefix sums of A
    s ← 0
    for i ← 0 to n − 1 do
        s ← s + A[0]
        for j ← 1 to i do
            s ← s + A[j]
    return s
Summary
• Time complexity is a measure of algorithm
efficiency.
• The choice of an efficient algorithm plays the major
role in determining the running time.
• Minor tweaks in the code can also cut down the
running time by a constant factor.
• Other factors, such as CPU speed, memory speed,
and device I/O speed, affect running time as well.
• For certain problems, it is possible to allocate
additional space and improve time complexity.
ALGORITHM DESIGN TECHNIQUES
Greedy Algorithm
A greedy algorithm is an algorithmic paradigm that follows the
problem-solving heuristic of making the locally optimal choice at
each stage with the hope of finding a global optimum.
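As an illustration (ours, not from the slides), a minimal greedy sketch in Python: coin change with canonical denominations, where the locally optimal choice, taking the largest coin that fits, happens to yield a global optimum:

def greedy_coin_change(amount, denominations=(25, 10, 5, 1)):
    coins = []
    for coin in denominations:      # try the largest coins first
        while amount >= coin:       # locally optimal choice at each stage
            coins.append(coin)
            amount -= coin
    return coins

print(greedy_coin_change(63))       # [25, 25, 10, 1, 1, 1]

Note that with non-canonical denominations such as (9, 6, 1), the same rule fails for amount 12 (it returns 9+1+1+1 rather than 6+6), which is why greedy correctness must be argued per problem.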
DIVIDE-AND-CONQUER
In this paradigm a problem is solved recursively, applying three steps at each level of the
recursion:
Divide: The problem is divided into a number of subproblems that are smaller instances
of the same problem.
Conquer: The subproblems are conquered by solving them recursively.
Combine: The solutions to the subproblems are combined into the solution for the
original problem.
ANALYSIS OF DIVIDE-AND-CONQUER
The running time of a divide-and-conquer algorithm is calculated from the
three steps of the basic paradigm.
Let T(n) be the running time on a problem of size n. If the problem
size is small enough, say n ≤ c for some constant c, the straightforward
solution takes constant time, which we write as Θ(1).
Suppose that our division of the problem yields a subproblems, each of
which is 1/b the size of the original; then it takes a·T(n/b) time to
solve the a subproblems.
If we take D(n) time to divide the problem into subproblems and C(n)
time to combine the solutions to the subproblems into the solution to
the original problem, then the total time is:

T(n) = Θ(1)                       if n ≤ c
T(n) = a·T(n/b) + D(n) + C(n)     otherwise
Binary Search
Binary search applies the divide-and-conquer paradigm.
It follows a three-step divide-and-conquer process for
searching in a sorted array A[p…r].
Divide: Compare the key with the middle element A[q];
the search reduces to one subarray, either
A[p…q−1] or A[q+1…r], depending on the element
being searched for.
Conquer: Recursively call binary search on the chosen
subarray.
Combine: The result of the recursive call is the final
answer, so no work is needed to combine.
Binary Search Algorithm
BinarySearch(array, low, high, key)   // key is the element to be searched
{
    if (low > high)                   // empty range: key not present
        return -1;
    if (low == high) {
        if (array[low] == key)
            return low;
        else
            return -1;
    }
    else {
        mid = (low + high) / 2;
        if (key == array[mid])
            return mid;
        else if (key > array[mid])
            return BinarySearch(array, mid + 1, high, key);
        else
            return BinarySearch(array, low, mid - 1, key);
    }
}
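For comparison, an iterative Python sketch (ours) of the same search; it avoids recursion but discards half of the range per step in the same way:

def binary_search(array, key):
    low, high = 0, len(array) - 1
    while low <= high:
        mid = (low + high) // 2
        if array[mid] == key:
            return mid
        elif key > array[mid]:
            low = mid + 1          # discard the left half
        else:
            high = mid - 1         # discard the right half
    return -1                      # empty range: key not present

print(binary_search([1, 3, 5, 7, 9, 11], 7))   # 3
print(binary_search([1, 3, 5, 7, 9, 11], 4))   # -1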
Binary Search Algorithm Analysis
Expanding:
T(1) = a                                                          (1)
T(n) = T(n/2) + b                                                 (2)
     = [T(n/2²) + b] + b = T(n/2²) + 2b    by substituting T(n/2) in (2)
     = [T(n/2³) + b] + 2b = T(n/2³) + 3b   by substituting T(n/2²) in (2)
     = ……
     = T(n/2^k) + kb
Setting n = 2^k (so k = log₂ n) gives T(n) = T(1) + b·log₂ n = a + b·log₂ n = O(log n).
Alg.: PARTITION(A, p, r)
    x ← A[p]
    i ← p − 1
    j ← r + 1
    while TRUE
        do repeat j ← j − 1
             until A[j] ≤ x
           repeat i ← i + 1
             until A[i] ≥ x
           if i < j
              then exchange A[i] ↔ A[j]
              else return j
(Example array A: 5 3 2 6 4 1 3 7. On return, j = q and A[p…q] ≤ A[q+1…r].)
Each element is visited once, so the running time is Θ(n), where n = r − p + 1.
Recurrence
Alg.: QUICKSORT(A, p, r)            Initially: p = 1, r = n
    if p < r
        then q ← PARTITION(A, p, r)
             QUICKSORT(A, p, q)
             QUICKSORT(A, q + 1, r)
Recurrence:
T(n) = T(q) + T(n − q) + n
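A runnable Python sketch (ours) of this scheme; it mirrors the Hoare-style PARTITION above and recurses on A[p…q] and A[q+1…r]:

def partition(A, p, r):
    x = A[p]                      # pivot: first element of the subarray
    i, j = p - 1, r + 1
    while True:
        j -= 1
        while A[j] > x:           # scan down to an element <= pivot
            j -= 1
        i += 1
        while A[i] < x:           # scan up to an element >= pivot
            i += 1
        if i < j:
            A[i], A[j] = A[j], A[i]
        else:
            return j

def quicksort(A, p, r):
    if p < r:
        q = partition(A, p, r)
        quicksort(A, p, q)        # Hoare partition recurses on A[p..q]
        quicksort(A, q + 1, r)

data = [5, 3, 2, 6, 4, 1, 3, 7]
quicksort(data, 0, len(data) - 1)
print(data)                       # [1, 2, 3, 3, 4, 5, 6, 7]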
Performance of Quicksort
• Average case
– All permutations of the input numbers are equally likely
– On a random input array, we will have a mix of well balanced and
unbalanced splits
– Good and bad splits are randomly distributed throughout the
tree
(Figure: recursion trees comparing splits. A worst-case split into 1 and
n − 1 followed by a balanced split of the n − 1 elements has combined
partitioning cost n + (n − 1) = 2n − 1 = Θ(n), the same order as the
cost n of a single balanced split into halves of about (n − 1)/2 each.)
MERGE-SORT Running Time
• Divide:
– compute q as the average of p and r: D(n) = Θ(1)
• Conquer:
– recursively solve 2 subproblems, each of size n/2: 2T(n/2)
• Combine:
– MERGE on an n-element subarray takes Θ(n) time:
C(n) = Θ(n)

T(n) = Θ(1)              if n = 1
T(n) = 2T(n/2) + Θ(n)    if n > 1
Solve the Recurrence
T(n) = c                  if n = 1
T(n) = 2T(n/2) + cn       if n > 1
Use the Master Theorem: with a = 2, b = 2, and f(n) = cn,
n^(log_b a) = n = Θ(f(n)), so case 2 applies and T(n) = Θ(n log n).
Merge-Sort Time Complexity
If the time for the merging operation is proportional to n, then the
computing time for merge sort is described by the recurrence relation
T(n) = c1                 n = 1, c1 is a constant
T(n) = 2T(n/2) + c2·n     n > 1, c2 is a constant
Expanding (with n = 2^k):
T(n) = 2T(n/2) + c2·n
     = 2[2T(n/4) + c2·n/2] + c2·n = 4T(n/4) + 2c2·n
     = ……
     = 2^k·T(1) + k·c2·n
     = c1·n + c2·n·log n = O(n log n)
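A runnable Python sketch (ours) of the scheme just analyzed, where merge does the Θ(n) combining work:

def merge_sort(A):
    if len(A) <= 1:                  # base case: T(1) = Theta(1)
        return A
    mid = len(A) // 2                # Divide: split around the midpoint
    left = merge_sort(A[:mid])       # Conquer: two subproblems of size n/2
    right = merge_sort(A[mid:])
    return merge(left, right)        # Combine: Theta(n) merge

def merge(left, right):
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    result.extend(left[i:])          # append whichever side remains
    result.extend(right[j:])
    return result

print(merge_sort([5, 2, 4, 7, 1, 3, 2, 6]))  # [1, 2, 2, 3, 4, 5, 6, 7]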
Summary
• Merge-Sort
– Most of the work is done in combining the
solutions.
– Best case takes Θ(n log n) time
– Average case takes Θ(n log n) time
– Worst case takes Θ(n log n) time
• Advantages of Divide and Conquer Algorithms
• 1. Efficiency: Divide and conquer can lead to efficient algorithms for solving
complex problems. By breaking down the problem into smaller parts, each part can
be solved independently and potentially in parallel, reducing the overall time
complexity.
• 2. Simplicity: The approach simplifies complex problems by breaking them into
smaller, well-defined sub-problems. This can make the problem-solving process
more manageable and easier to understand.
• 3. Modularity: Divide and conquer promotes modular design. Each sub-problem
can be solved independently, making the codebase more organized and easier to
maintain.
• 4. Reusability: The sub-problems created during the division phase can often be
reused in different contexts or for solving similar problems, leading to code
reusability.
• 5. Optimization Opportunities: Optimization techniques can be applied to
individual sub-problems, improving the efficiency of the solution overall.
• Disadvantages of Divide and Conquer Algorithms
• 1. Overhead: The division and combination phases of the approach may introduce some
overhead due to the need for additional calculations and merging of results.
• 2. Complexity of Implementation: In some cases, implementing the divide and conquer
approach might be more complex than using simpler algorithms. This complexity can lead to
errors if not implemented correctly.
• 3. Memory Usage: Divide and conquer algorithms may require additional memory for storing
intermediate results or dividing the problem into sub-problems. This can be a concern for
problems with large input sizes.
• 4. Suboptimal Solutions: In some cases, the division of the problem may not lead to the
most optimal solution. Poorly chosen divisions or a mismatch between sub-problems can lead
to suboptimal results.
• 5. Recursion Overhead: Many divide and conquer algorithms are implemented using
recursion, which can introduce overhead and potentially lead to stack overflow issues for very
deep recursion levels.
BRUTE FORCE
Brute force, the simplest of the design strategies,
is a straightforward approach to solving a problem,
usually directly based on the problem's statement and the
definitions of the concepts involved.
The brute-force strategy is the easiest to apply, and it is
important due to its wide applicability and simplicity.
Its weakness is the subpar efficiency of most brute-force
algorithms.
Important examples:
Selection sort, brute-force string matching, the convex hull
problem
Exhaustive search: the traveling salesman, knapsack, and
assignment problems
Brute Force
A straightforward approach, usually based directly on the
problem's statement and definitions of the concepts
involved.
Examples – based directly on definitions (the first two are sketched below):
1. Computing aⁿ (a > 0, n a nonnegative integer)
2. Computing n!
3. Multiplying two matrices
4. Searching for a key of a given value in a list
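Minimal Python sketches (ours) of the first two definition-based examples:

def power(a, n):                 # computes a**n by repeated multiplication: n multiplications
    result = 1
    for _ in range(n):
        result *= a
    return result

def factorial(n):                # computes n! directly from the definition
    result = 1
    for k in range(1, n + 1):
        result *= k
    return result

print(power(2, 10))   # 1024
print(factorial(5))   # 120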
EXHAUSTIVE SEARCH
Many brute force algorithms use exhaustive search.
Approach:
1. Enumerate and evaluate all solutions, and
2. Choose the solution that meets some criterion (e.g., smallest)
Exhaustive Search – More Detail
A brute force solution to a problem involving search for an element
with a special property, usually among combinatorial objects such
as permutations, combinations, or subsets of a set.
Method:
– generate a list of all potential solutions to the problem in a
systematic manner
– evaluate potential solutions one by one, disqualifying
infeasible ones and, for an optimization problem, keeping track
of the best one found so far
EXHAUSTIVE SEARCH
Examples:
Traveling salesman problem
Finding the shortest tour through a given set of n cities
that visits each city exactly once before returning to the
city where it started.
Knapsack problem
Finding the most valuable subset of the n items that fits
into the knapsack.
Assignment problem
Finding an assignment of n people to execute n jobs
with the smallest total cost.
TRAVELLING SALESMAN PROBLEM
The traveling salesman problem (also known as TSP) is
one of the most interesting and difficult problems in
combinatorial optimization.
It is the problem of finding the shortest route available to make
a tour of a number of cities, visiting each city
exactly once and returning to the original
starting point.
Some of the solution methods of TSP include:
Brute-force method.
Approximations
• Nearest neighbor
• Greedy approach
Branch and bound.
AS A GRAPH PROBLEM
• Given n cities with known distances between each pair, find the shortest
tour that passes through all the cities exactly once before returning to the
starting city
• More formally: Find shortest Hamiltonian circuit in a weighted connected
graph
• Example:
A complete graph on four cities a, b, c, d with edge weights
ab = 2, ac = 8, ad = 5, bc = 3, bd = 4, cd = 7
TSP by Exhaustive Search
Tour              Cost
a→b→c→d→a         2+3+7+5 = 17
a→b→d→c→a         2+4+7+8 = 21
a→c→b→d→a         8+3+4+5 = 20
a→c→d→b→a         8+7+4+2 = 21
a→d→b→c→a         5+4+3+8 = 20
a→d→c→b→a         5+7+3+2 = 17
Have we considered all tours? Yes, up to the starting city: a tour
starting elsewhere (e.g., b→c→d→a→b) is a rotation of one listed above.
Do we need to consider more? No.
Any way to consider fewer? Yes: each tour and its reverse have the
same cost, so half the list suffices.
Efficiency: # tours = O(# permutations of b, c, d) = O(n!)
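A brute-force sketch in Python (ours), using the example graph above; fixing the start city and enumerating the (n − 1)! permutations of the remaining cities matches the O(n!) count:

from itertools import permutations

# Edge weights of the example graph (assumed symmetric).
dist = {('a', 'b'): 2, ('a', 'c'): 8, ('a', 'd'): 5,
        ('b', 'c'): 3, ('b', 'd'): 4, ('c', 'd'): 7}

def d(u, v):
    return dist.get((u, v)) or dist[(v, u)]

def tsp_brute_force(cities, start='a'):
    rest = [c for c in cities if c != start]
    best_tour, best_cost = None, float('inf')
    for perm in permutations(rest):            # (n-1)! candidate tours
        tour = (start,) + perm + (start,)
        cost = sum(d(tour[i], tour[i + 1]) for i in range(len(tour) - 1))
        if cost < best_cost:
            best_tour, best_cost = tour, cost
    return best_tour, best_cost

print(tsp_brute_force(['a', 'b', 'c', 'd']))   # (('a','b','c','d','a'), 17)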
KNAPSACK PROBLEM FORMULATION
Maximize ∑ (i = 1 to n) Bᵢ·Xᵢ
Subject to the constraint
∑ (i = 1 to n) Vᵢ·Xᵢ ≤ V
And
0 ≤ Xᵢ ≤ Qᵢ
(where Bᵢ is the value, Vᵢ the volume, and Qᵢ the available quantity of item i,
and V is the knapsack capacity).
BRUTE FORCE SOLUTION OF KNAPSACK PROBLEM
Example
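A minimal brute-force sketch in Python (ours), for the 0/1 special case Qᵢ = 1, with hypothetical values B, volumes V, and capacity cap: enumerate all 2ⁿ subsets and keep the most valuable one that fits.

from itertools import combinations

def knapsack_brute_force(B, V, cap):
    n = len(B)
    best_value, best_subset = 0, ()
    for r in range(n + 1):
        for subset in combinations(range(n), r):      # all 2**n subsets in total
            volume = sum(V[i] for i in subset)
            value = sum(B[i] for i in subset)
            if volume <= cap and value > best_value:  # feasible and better
                best_value, best_subset = value, subset
    return best_value, best_subset

# Hypothetical data: item values, item volumes, and knapsack capacity.
print(knapsack_brute_force(B=[60, 100, 120], V=[10, 20, 30], cap=50))  # (220, (1, 2))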
• Disadvantages:
• Can be extremely slow for large datasets.
• Exponential time complexity can lead to impractical
runtimes.
• Not suitable for real-time applications.
• Requires large memory.
• May require significant computational resources.