Da Algo Snotes
int fun(int n){
    if(n <= 1) return 1;
    else return fun(n/2) + n;
}
1.1.1) Iteration method or Repeated substitution method :
Example 2 : T(n) = T(n/2) + n for n > 1, and T(n) = 1 for n ≤ 1.
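Unrolling by repeated substitution (a worked pass to illustrate the method): T(n) = n + T(n/2) = n + n/2 + T(n/4) = ⋯ = n + n/2 + n/4 + ⋯ + 1 ≤ 2n, so T(n) = Θ(n).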
Now, where do we stop? In the worst case, at the deepest leaf. Here the sequence (4/5)^k · n produces the longest path, and T(1) should appear at the leaves.
But n·log_{5/4} n is only an upper bound; a lower bound can be found by taking the shortest path of the tree, which is formed by the n/5^k sequence. So n·log_5 n ≤ T(n) ≤ n·log_{5/4} n. Since log_5 n = log_2 n / log_2 5 and log_{5/4} n = log_2 n / log_2 (5/4), both bounds are constant multiples of n·log_2 n. Therefore, T(n) = Θ(n lg n).
If T(n) = T(n/3) + T(2n/3) + n² is given, then
1 + c + c² + ⋯ + c^n = Σ_{i=0}^{n} c^i = Θ(1) if c < 1, Θ(n) if c = 1, Θ(c^n) if c > 1.
NOTE :
1) If you are using the tree method, always find the lower and upper bound and then conclude Θ(f(n)). For T(n) = T(n − 1) + T(n − 2) + 1, here Ω(2^{n/2}) ≤ T(n) ≤ O(2^n), so T(n) ≠ Θ(2^n).
//Lecture 3d
Q : Solve T(n) = aT(n/b) + c·n^k –
T(n) = Θ(n^k · g(n)), where
g(n) = 1 if a < b^k
g(n) = log_b n if a = b^k
g(n) = (a/b^k)^{log_b n} if a > b^k
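As a quick check of this formula on a standard instance: for T(n) = 4T(n/2) + n we have a = 4, b = 2, k = 1, so a > b^k and g(n) = (4/2)^{log_2 n} = n, giving T(n) = Θ(n · n) = Θ(n²), which agrees with the familiar answer Θ(n^{log_2 4}).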
//Lecture 4a
In the above example we got one general formula for the case f(n) = c·n^k, but now we extend this idea to a more general function f(n).
We want to solve T(n) = aT(n/b) + f(n); we will get the following results (we will see the proof later).
To apply this method, we first find n^{log_b a} and then compare it with f(n).
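For instance (a standard sanity check): in T(n) = 2T(n/2) + n we have n^{log_2 2} = n = Θ(f(n)), so the result is T(n) = Θ(n log n).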
//Lecture 4b
One thing to note here: when we compare f(n) with n^{log_b a}, f(n) should be polynomially greater than n^{log_b a} (for case 3 to apply). For example, in T(n) = 2T(n/2) + n log n, the term n log n is greater than n^{log_2 2} = n but not polynomially greater, so the basic master theorem does not apply.
If instead of n log n we had 2^n, then 2^n is also polynomially greater, because you can always find an ε > 0 with 2^n = Ω(n^{1+ε}).
//Lecture 5a
We know the master theorem cannot solve recurrence relations in which f(n) is not polynomially comparable to n^{log_b a}. To resolve this problem, we simply change the 2nd condition of the master theorem as follows :
Meaning T(n) = 2T(n/2) + n log n can now be solved: it matches the new 2nd condition with k = 1, so T(n) = Θ(n log² n).
//Lecture 5b
Now, can we solve T(n) = 2T(n/2) + n/log n, T(2) = 1 ? – Yes, using substitution.
We could not solve it using the master theorem because the log n case we handled had log n in the numerator, not in the denominator.
Similarly, can we solve T(n) = 2T(n/2) + n/(log n)², T(2) = 1 ? – Yes, again we use substitution; this time we will get T(n) = Θ(n).
It looks like we have to extend the master theorem again, for the log n in the denominator case. We call this final version the extended master theorem.
1. If f(n) = O(n^{log_b a − ε}) for some constant ε > 0, then T(n) = Θ(n^{log_b a}).
2. If f(n) = Θ(n^{log_b a} · log^k n) with k ≥ 0, then T(n) = Θ(n^{log_b a} · log^{k+1} n).
• If f(n) = Θ(n^{log_b a} / log n), then T(n) = Θ(n^{log_b a} · log log n).
• If f(n) = Θ(n^{log_b a} / log^p n) with p ≥ 2, then T(n) = Θ(n^{log_b a}).
3. If f(n) = Ω(n^{log_b a + ε}) for some constant ε > 0, then T(n) = Θ(f(n)).
Q : Solve T(n) = 4T(n/2) + n/(log n)² – here we do not directly say T(n) = Θ(n) by the 2nd part of the extended master theorem. We will first check n^{log_b a} = n^{log_2 4} = n² and compare: f(n) = n/(log n)² is polynomially smaller than n², so case 1 applies and T(n) = Θ(n²).
//Lecture 5d
//Lecture 6a
As the name indicates, we replace the existing variable by a new variable to simplify the recurrence relation. For T(n) = T(√n) + √n, solving directly by substitution is difficult, so we assume n = 2^m.
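Carrying the change of variable through (a worked sketch): with n = 2^m the recurrence becomes T(2^m) = T(2^{m/2}) + 2^{m/2}. Writing S(m) = T(2^m) gives S(m) = S(m/2) + 2^{m/2}; the sum 2^{m/2} + 2^{m/4} + ⋯ is dominated by its first term, so S(m) = Θ(2^{m/2}) and hence T(n) = Θ(√n).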
Solve : T(n) = (T(n/161))^{161} · n – when something involving a power of T(n) is given, we take log on both sides.
Solve : T(n) = √n · T(√n) + 100n – when the coefficient is not constant, try to remove it or fold it into another function. For example,
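One way this works out (a worked sketch): divide both sides by n to get T(n)/n = T(√n)/√n + 100. Writing S(n) = T(n)/n gives S(n) = S(√n) + 100; substituting n = 2^m turns this into R(m) = R(m/2) + 100 = Θ(log m), so S(n) = Θ(log log n) and T(n) = Θ(n log log n).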
2. Divide and conquer
Break up a problem into smaller subproblems. Solve those subproblems recursively. Combine the
results of those subproblems to get the overall answer.
Total time to solve the whole problem: T(n) = divide cost + T(n₁) + T(n₂) + ⋯ + T(n_k) + combine cost.
3 recursion rules :
1) Always have at least one case that can be solved without using recursion. (base case)
2) Any recursive call must progress toward a base case. (finite step loop)
3) Always assume that the recursive call works, and use this assumption to design your
algorithms.
//Lecture 7b
1) Maximum of an array :
Maximum(a, l, r){
    if(r == l) return a[r]; //Base case
    m = (l+r)/2;
    return max(Maximum(a, l, m), Maximum(a, m+1, r)); //combine the two halves
}
//Lecture 7c
2) Search in an array :
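The code for this one is not in the notes; a minimal divide-and-conquer sketch in the same style as Maximum above (returns an index, or −1 when the key is absent):

Search(a, l, r, key){
    if(l == r){
        if(a[l] == key) return l;
        else return -1;
    }
    m = (l+r)/2;
    idx = Search(a, l, m, key); //search the left half first
    if(idx != -1) return idx;
    return Search(a, m+1, r, key); //otherwise search the right half
}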
3) Dumb sort :
//Lecture 8a
MergeSort(a, i, j){
    if(i == j) return;
    mid = (i+j)/2;
    MergeSort(a, i, mid);
    MergeSort(a, mid+1, j);
    Merge(a, i, mid, j);
}
//Lecture 8b
The merge procedure should combine two sorted arrays into one sorted array.
As both arrays are sorted, one of the two first indices holds the overall minimum. We compare both values, put the minimum into a new array of size m+n, and then increment the index of the array from which the minimum was selected. We repeat this procedure iteratively till both indices pass the last elements of their arrays.
Merge(a, i, mid, j){
    l1 = i, l2 = mid+1, k = 0; //k represents index of new array b[]
    while(l1 <= mid and l2 <= j){
        if(a[l1] <= a[l2]) b[k] = a[l1], l1++;
        else b[k] = a[l2], l2++;
        k++;
    }
    while(l1 <= mid) b[k] = a[l1], k++, l1++; //copy remaining of 1st half
    while(l2 <= j) b[k] = a[l2], k++, l2++; //copy remaining of 2nd half
    copy b to a...
}
//Lecture 8c
1) At worst we have to compare at every step; obviously the maximum element is not compared with anyone at the end, because it is the only remaining element. So at most m + n − 1 comparisons are required.
2) At minimum we can have the case where all elements of one array are less than all elements of the other. In that case we only do min(n, m) comparisons.
//Lecture 9a
A sorting algorithm falls into the adaptive sort family if it takes advantage of existing order in its input, i.e., it benefits from pre-sorted runs in the input sequence. For example, merge sort is not adaptive but insertion sort is.
//Lecture 9c
Note whether the number of comparisons is asked for merge sort or only for the merge procedure. If merge alone is asked, the answer will be just n − 1.
//Lecture 10a
2.1.4) Iterative merge sort : also known as the straight two-way merge sort algorithm or the bottom-up approach. The idea is to form groups of two, then four, … up to the array size, iteratively sorting and merging the elements of the subarrays.
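A minimal sketch of that bottom-up loop, reusing the Merge procedure above (min is a hypothetical helper that clamps the last, possibly shorter, group):

IterativeMergeSort(a, n){
    for(groupsize = 1; groupsize < n; groupsize = groupsize*2){
        for(i = 0; i < n - groupsize; i = i + 2*groupsize){
            mid = i + groupsize - 1;
            j = min(i + 2*groupsize - 1, n - 1); //last group may be shorter
            Merge(a, i, mid, j);
        }
    }
}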
//Lecture 10c
Asymptotically the top-down (recursive) and bottom-up (iterative) approaches are the same. But:
Top down = split + combine (here we split using mid).
Bottom up = only combine (here we form groups, doubling groupsize after every iteration).
If we say merge sort, by default it is the top-down (recursive) approach, unless bottom-up is explicitly indicated by "straight two-way" or some other wording.
//Lecture 11a
We will explore 5 methods to merge k sorted arrays into one sorted array.
1) Trivial method : concatenate the k sorted arrays and sort, which takes O(n log n). But here we are not taking advantage of the pre-sorted sequences of elements.
2) Successive merging method :
3) Successive minimum method : we look at the current first element of all k sorted arrays and repeatedly find the minimum. To find each element of the final sorted array we need k comparisons, so for n elements we need nk comparisons; the time complexity is Θ(nk). Can we do better ?
We observed that in the successive minimum method, once we have selected the minimum we advance a pointer to the next node in the respective sorted array, and then find the minimum again by traversing the whole k-size array, even though we have already compared the previously seen elements. So is there a method which finds the minimum faster ?
4) Improvement of the 3rd method : we can improve the 3rd method using a heap data structure. First we make a min-heap out of the first elements of the k sorted arrays, which takes O(k) time. Then we repeatedly extract the root, place it into the final sorted array, and insert the next element from the array the root came from into the min-heap, which takes O(lg k) time per element. Repeating for the remaining n − k elements gives a total of O(n log k + k) = Θ(n log k).
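A self-contained C sketch of this heap-based k-way merge (the names kWayMerge, Node and the k ≤ 16 bound are assumptions of this sketch, not from the lectures):

#include <stdio.h>

/* Heap node: current value, which array it came from, position within it */
typedef struct { int val, arr, idx; } Node;

void siftDown(Node h[], int size, int i){
    while(1){
        int s = i, l = 2*i + 1, r = 2*i + 2;
        if(l < size && h[l].val < h[s].val) s = l;
        if(r < size && h[r].val < h[s].val) s = r;
        if(s == i) break;
        Node t = h[i]; h[i] = h[s]; h[s] = t;
        i = s;
    }
}

/* Merge k sorted arrays a[0..k-1] with lengths len[i] into out[] */
void kWayMerge(int *a[], int len[], int k, int out[]){
    Node h[16]; /* sketch assumes k <= 16 */
    int size = 0, o = 0;
    for(int i = 0; i < k; i++) /* first element of every array */
        if(len[i] > 0){ h[size].val = a[i][0]; h[size].arr = i; h[size].idx = 0; size++; }
    for(int i = size/2 - 1; i >= 0; i--) siftDown(h, size, i); /* build min-heap: O(k) */
    while(size > 0){
        Node root = h[0]; /* extract-min, O(log k) per output element */
        out[o++] = root.val;
        if(root.idx + 1 < len[root.arr]){ /* refill from the array the root came from */
            h[0].val = a[root.arr][root.idx + 1];
            h[0].arr = root.arr;
            h[0].idx = root.idx + 1;
        } else h[0] = h[--size]; /* that array is exhausted */
        siftDown(h, size, 0);
    }
}

int main(){
    int a0[] = {1, 4, 7}, a1[] = {2, 5, 8}, a2[] = {3, 6, 9};
    int *a[] = {a0, a1, a2}, len[] = {3, 3, 3}, out[9];
    kWayMerge(a, len, 3, out);
    for(int i = 0; i < 9; i++) printf("%d ", out[i]); /* prints 1..9 */
    return 0;
}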
//Lecture 11b
Properties of sorting :
• Stable sorting : A sorting algorithm is stable if elements with the same key appear in the
output array in the same order as they do in the input array.
Note that if a sequence of different-sized sorted arrays is to be merged using the optimal merge algorithm, then the algorithm always chooses the smallest sequences for merging first.
//Lecture 12a
Finding the maximum is like a knockout tournament. First we set the first element as max, then it competes with the second element and whoever wins (the larger value) becomes the new max; we do this till the last element.
There can be many methods to find the max; one of them is a pairwise knockout tournament, which does n/2 + n/4 + ⋯ + n/n comparisons, so the count is n − 1. What if we want the 1st runner-up, i.e. the second max? We keep track of the second max: whenever we set some new element as max, we hand the old max down to second max, and if while traversing we see a value greater than the second max but less than the max, we set that value as the second max.
#include<stdio.h>
#include<limits.h>
int main(){
    int arr[] = {4, 2, 5, 6, 18, 1, 7};
    int len = sizeof(arr)/sizeof(arr[0]), max1 = arr[0], max2 = INT_MIN;
    for(int i = 1; i < len; i++){
        if(max1 < arr[i]){
            max2 = max1;   //old max becomes second max
            max1 = arr[i];
        }
        else if(max2 < arr[i]) max2 = arr[i]; //between second max and max
    }
    printf("%d %d", max1, max2);
    return 0;
}
//Lecture 12b
Q : Can we have a smarter method ? – Yes, using the pairwise tournament method. We hold a competition between two numbers, and the larger one moves up the tree.
So first we find the maximum using n − 1 comparisons, and then we find the 2nd largest element from among only log n elements, because the maximum meets log n competitors in the whole tournament. We keep track of whom each node defeated using any data structure. So for the second largest element we have a total of n + log n − 2 comparisons.
//Lecture 12d
Method 1 : We first look at the trivial approach. We traverse the whole array and make two comparisons per element, one for max and one for min.
#include<stdio.h>
int main(){
    int arr[] = {1, 2, 3, 6, 4, 8, 0};
    int max, min, len;
    max = min = arr[0];
    len = sizeof(arr)/sizeof(arr[0]);
    for(int i = 1; i < len; i++){
        if(arr[i] > max) max = arr[i];
        else if(arr[i] < min) min = arr[i];
    }
    printf("%d %d", max, min);
    return 0;
}
Method 2 : CLRS
#include<stdio.h>
int main(){
    int arr[] = {1, 0, 3, 2, 9, -1, -2};
    int max, min, len;
    len = sizeof(arr)/sizeof(arr[0]);
    if(len == 1){
        printf("%d %d", arr[0], arr[0]);
        return 0;
    }
    if(arr[0] > arr[1]){
        max = arr[0];
        min = arr[1];
    }
    else{
        max = arr[1];
        min = arr[0];
    }
    for(int i = 2; i < len-1; i += 2){ //process the rest in pairs: 3 comparisons per pair
        if(arr[i] > arr[i+1]){
            if(arr[i] > max) max = arr[i];
            if(arr[i+1] < min) min = arr[i+1];
        }
        else{
            if(arr[i+1] > max) max = arr[i+1];
            if(arr[i] < min) min = arr[i];
        }
    }
    if(len % 2 == 1){ //odd length leaves one unpaired element (missed in the original)
        if(arr[len-1] > max) max = arr[len-1];
        if(arr[len-1] < min) min = arr[len-1];
    }
    printf("%d %d", max, min);
    return 0;
}
//Lecture 13a
Method 3 : There is also a method which gives the same count as the CLRS method, but without simultaneous comparisons: the tournament method. We pair up the elements and play one match per pair, separating all the winners and losers of the first round. Then we find the max among the winners and the min among the losers.
Separating out winners and losers takes n/2 comparisons. Finding the max among the winners takes n/2 − 1 comparisons, and the same for finding the min among the losers. In total, 3n/2 − 2 comparisons. This count holds only if the number of elements is even; if odd, we do the previous procedure with n − 1 elements (which form an even group) and account for the leftover element separately.
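A C sketch of this two-phase tournament (the winners/losers arrays and the even-length input are assumptions of the sketch, matching the even case above):

#include <stdio.h>
int main(){
    int a[] = {1, 0, 3, 2, 9, -1, -2, 7}; //even length assumed in this sketch
    int n = sizeof(a)/sizeof(a[0]);
    int winners[4], losers[4]; //n/2 entries each
    for(int i = 0; i < n; i += 2){ //round 1: n/2 matches
        if(a[i] > a[i+1]){ winners[i/2] = a[i]; losers[i/2] = a[i+1]; }
        else             { winners[i/2] = a[i+1]; losers[i/2] = a[i]; }
    }
    int max = winners[0], min = losers[0];
    for(int i = 1; i < n/2; i++){ //n/2 - 1 comparisons for each of max and min
        if(winners[i] > max) max = winners[i];
        if(losers[i] < min) min = losers[i];
    }
    printf("%d %d", max, min); //prints 9 -2
    return 0;
}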
Conclusion : the number of comparisons to find min-max is 3n/2 − 2 when n is even, and 3n/2 − 3/2 = 3(n − 1)/2 when n is odd.
//Lecture 15a
Suppose Netflix is recommending a list of movies. How do these recommendations work? Step 1 is to find a user similar to me; step 2 is to recommend what I have not watched but that similar user has watched (this is easy to do).
Q : How to find a user similar to me ? – First we find users who have watched the same movies as you; they have rated the movies according to their choice, and you have also rated the movies. Suppose we arrange one such user's list in the sorted order of your ratings: we simply count the number of inversions, and if there are few of them we say we have found a user similar to you. We call this the counting inversions problem.
//Lecture 15b
We say that two indices i < j form an inversion if a[i] > a[j].
0 ≤ total inversions with n elements ≤ n(n − 1)/2.
Q : Suppose T is any array of n elements. Show that if there is an inversion (i, j) in T, then T has at least j − i inversions. – It is given that there is an inversion (i, j); suppose we have 500 at position i and 10 at position j. For any number at a position between i and j, three cases are possible:
Case 1 : the number < 10. Then it definitely forms an inversion with 500 (though not with 10).
Case 2 : 10 < the number < 500. Then it forms an inversion with both 500 and 10.
Case 3 : the number > 500. Then it definitely forms an inversion with 10.
So every one of the j − i − 1 positions in between contributes at least one inversion, and (i, j) itself is one more, giving at least j − i inversions.
//Lecture 15c
Method 1 : Brute force… we visit each element, and for every element we count the inversions it participates in, which takes O(n) time per element. So the total time complexity is O(n²).
Method 2 : Divide and conquer… we will get the following recurrence relation at best:
T(n) = 2T(n/2) + O(1) + n²/4
//Lecture 16a
Can we do better using divide and conquer ? – Yes, using merge sort: in the mergesort procedure we will also return the inversion count.
At this stage the return value of the first sorted half is 2 and the return value of the second sorted half is 1, which means the total number of inversions after this function call must be 2 + 1 + x, where x is the number of extra inversions found when we merge the arrays.
In the merge procedure, first we compare 2 with 3; we know 3 and 2 form an inversion, but 2 also forms inversions with the rest of the elements of the first half. So we say: if some element in the second array is less than the current element in the first array, we count inversions = inversions + the number of elements from that current element (inclusive) to the end of the first half. We continue this process until the whole array is sorted. In the merge procedure we just need to add a few if-else statements to count inversions, so its total time complexity is still O(n). Thus mergesort takes O(n log n), and so does inversion counting.
NewMerge(a, i, mid, j){
    p = i, q = mid+1, k = 0, inversions = 0;
    while(p <= mid and q <= j){
        if(a[p] <= a[q]) b[k] = a[p], p++;
        else b[k] = a[q], q++, inversions = inversions + (mid - p + 1); //a[q] beats all of a[p..mid]
        k++;
    }
    if(p>mid) copy remaining elements of 2nd half
    if(q>j) copy remaining elements of 1st half
    copy b to a...
    return inversions;
}
Now, the mergesort wrapper adds the left and right subtree inversions to the NewMerge inversions:
NewMergeSort(a, i, j){
    if(i==j) return 0;
    mid = (i+j)/2;
    return NewMergeSort(a, i, mid) + NewMergeSort(a, mid+1, j) + NewMerge(a, i, mid, j);
}
Our target is to find a pair having the least distance, given n points with (x, y) coordinates.
The brute force approach would be to take all possible pairs of the n points, find their distances, and then find the min of those distances. So the time taken is O(n²).
A better approach would be to use divide and conquer. First we divide the set of points into two subgroups, then we find the shortest-distance pair within each, and then we combine.
The time complexity will be T(n) = 2T(n/2) + divide cost + combine cost
//Lecture 17b
Here the divide cost is O(1), as we just set a limit for dividing purposes. For the combine operation, we first form a boundary strip from the points near the divide line, find the min distance among those points, and then combine the minima. To find the min distance we first sort the points by their x and y coordinates; assuming we have set our boundary as close as possible to the divide line, we can find the min distance in O(n) time. Therefore,
T(n) = 2T(n/2) + O(n) + O(1) = O(n log n)
//Lecture 17c
2) Exponent of a number :
ity
//Using brute force :
exponent(a, n){
    if(n==1) return a;
    return a*exponent(a, n-1);
}

//Using DAC :
exponent(a, n){
    if(n==0) return 1;
    if(n==1) return a;
    x = exponent(a, n/2);
    if(n is even) return x*x;
    else return x*x*a;
}
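As a quick check of the gain: the brute force version satisfies T(n) = T(n − 1) + O(1) = O(n), while the DAC version satisfies T(n) = T(n/2) + O(1) = O(log n).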
//Lecture 17d
3) Matrix multiplication :
Suppose we use DAC for multiplying matrices, so we divide our problem into smaller sub-matrices. Here matrix C is of order n and we divide the matrices into sub-matrices of order n/2. In the recursive case, we partition in O(n) time, perform eight recursive multiplications of n/2-order matrices, and finish up with the O(n²) work of adding n/2-order matrices. The recurrence relation is
T(n) = 8T(n/2) + O(n²) = Θ(n³)
But Strassen later showed that we need just 7 multiplications, so T(n) = 7T(n/2) + O(n²) = Θ(n^{log 7}) ≈ Θ(n^{2.81}).
//Lecture 18a
The idea is to fix the final position of one element. To do that we select one element (the pivot) randomly or by some strategy, then we partition the numbers into those less than and those greater than the pivot.
quicksort(a, i, j){
    if(i >= j) return; //was i == j: the range can also become empty
    q = partition(a, i, j);
    quicksort(a, i, q-1);
    quicksort(a, q+1, j); //was q-1: the pivot at q is already in place
}
Here the partition can be done by any rule: for example, selecting the first element of the partition range as the pivot, or selecting a random element as the pivot.
partition(a, low, high){
    pivot = a[low], i = low+1, j = high;
    while(i<=j){
        while(i<=j and a[i]<=pivot) i++; //first element from left greater than pivot
        while(i<=j and a[j]>pivot) j--;  //first element from right not greater than pivot
        if(i<j) swap(a[i], a[j]);
    }
    swap(a[low], a[j]); //j<i here, so j points to the last element <= pivot
    return j;
}
Q : Is quicksort stable ? – Consider an array {5, 1, 6, 7, 2(1st), 4, 3, 2(2nd), 7, 8} and let's say the pivot is 6. After the swaps it becomes {5, 1, 6, 2(2nd), 2(1st), 4, 3, 7, 7, 8}; you can see that the relative order of the equal 2's is not preserved, so quicksort is not stable. But it is in-place, as can be seen from the code.
Time complexity :
Best case : if you are very lucky, the partition splits the array evenly: T(n) = 2T(n/2) + Θ(n) + Θ(1) = Θ(n log n).
But this is not the only best case: even when the partition splits the array into n/1000000 and 999999n/1000000 parts, the time complexity is still Θ(n log n). So, in short, we can say the best-case time complexity holds whenever the partition has the following structure,
Worst case : one side of the partition has only one element. T(n) = T(n − 1) + T(1) + Θ(n) = Θ(n²)
Q : Recall the partition subroutine employed by the quicksort algorithm, as explained in lecture. What is the probability that, with a randomly chosen pivot element, the partition subroutine produces a split in which the size of the smaller of the two subarrays is ≥ n/5 ? – This happens exactly when the pivot's rank falls in the middle 3/5 of the positions (between n/5 and 4n/5), so the probability is 3/5 = 60%.
//Lecture 19a
So, the select algorithm is a program which can find the kth smallest element of A.
We can simply sort the array and return the element at the kth index, but can we do better ?
We use the quicksort partition algorithm. We know that partition returns the index of the element which has been placed at its final sorted position. So the idea is:
We are definitely sure that the 5th smallest element is 6 (in the example), so if we want the 3rd smallest, it lies in the left half of the partitioned array. Therefore,
select(A, p, q, k){ //k is treated as an index into the whole array
    if(p==q) return A[p]; //was A[0] in the original, a bug
    m = partition(A, p, q);
    if(m==k) return A[m];
    if(k<m) return select(A, p, m-1, k);
    if(k>m) return select(A, m+1, q, k); //k stays absolute, so no k-m shift is needed
}
Time complexity :
Worst case : T(n) = T(n − 1) + Θ(n) = Θ(n²) with a maximally unequal partition. So this looks even worse than sorting, but there is one algorithm called median of medians (which is extremely clever) that gives an approximate median as the pivot, so you can find the kth smallest in O(n).
//Lecture 19b
Given a sorted array, search for an element:
BinarySearch(A, i, j, key){
    if(i > j) return -1; //empty range, key absent
    if(i == j){
        if(a[i]==key) return i;
        else return -1;
    }
    mid = (i+j)/2;
    if(a[mid]==key) return mid;
    if(a[mid]>key) return BinarySearch(A, i, mid-1, key);
    else return BinarySearch(A, mid+1, j, key);
}
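Each call discards half of the range at constant cost, so the recurrence is T(n) = T(n/2) + Θ(1) = Θ(log n).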
//Lecture 20a
BubbleSort(a){
    for(i = 1 to n-1){
        for(j = 1 to n-1){
            if(a[j] > a[j+1]) swap(a[j], a[j+1]);
        }
    }
}
But sometimes the array becomes sorted before the final pass, and then there is no point in traversing the array again, so we use a Boolean variable. At the start of every pass we set it to false, assuming nothing has been swapped; once we swap, we set it to true. After every pass we check whether any swap occurred: if there was no swapping in the inner loop, we stop.
BubbleSort(a) :
for(i = 1 to n-1){
    swapped = false;
    for(j = 1 to n-1){
        if(a[j] > a[j+1]){
            swap(a[j], a[j+1]);
            swapped = true;
        }
    }
    if(swapped == false) break;
}
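With this flag an already sorted input finishes after a single pass, so the best case improves to Θ(n) while the worst case remains Θ(n²); this version therefore benefits from pre-sorted input.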
InsertionSort(A) :
for(j = 2 to A.length){
    key = A[j];
    //Insert A[j] into the sorted sequence A[1..j-1]
    i = j-1;
    while(i > 0 and A[i] > key){
        A[i+1] = A[i]; //shift larger elements right
        i = i-1;
    }
    A[i+1] = key;
}
If the array is sorted in reverse order then there is the maximum number of inversions: (n−1) + (n−2) + ⋯ + 1 = n(n−1)/2, and the average number of inversions is (0 + n(n−1)/2)/2 = n(n−1)/4.
Thus, running time of insertion sort is 𝜃(𝑛 + 𝑑) where d is the number of inversions. Merge procedure
can be modified to count number of inversions in 𝜃(𝑛𝑙𝑜𝑔𝑛) time.
//Lecture 20d
This is the simplest algorithm; on each iteration it finds the maximum, puts it in the last position of the current array, and decreases the array size by one.
SelectionSort(a) :
length = n;
for(pass = 1; pass <= n-1; pass++){
    max = a[0];
    maxIdx = 0;
    for(j = 1; j < length; j++){
        if(a[j] > max){
            maxIdx = j;
            max = a[j];
        }
    }
    swap(a[maxIdx], a[length-1]);
    length--;
}
Time complexity : Θ(n²) comparisons in every case, though only Θ(n) swaps.
Summary : among the O(n²) sorts, insertion sort is usually better than selection sort, and both are better than bubble sort.
//Lecture 20e
Theorem : any comparison-based sorting algorithm must take at least n log n time in the worst case.
You have noticed that in every algorithm we compare two elements. Any comparison-based algorithm needs to be able to output any one of the n! possible orderings.
You also know that the number of comparisons done to obtain one of the possible orderings is O(depth). You can observe that the above diagram is a nearly complete binary tree with L leaves and depth d, where L = n! and the depth satisfies d ≥ log₂ L (equivalently 2^d ≥ L). d is nothing but the number of comparisons, so the number of comparisons must be Ω(log(n!)) = Ω(n log n). QED.
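The last step uses log(n!) = Ω(n log n), which follows because n! contains at least n/2 factors that are each ≥ n/2, so log(n!) ≥ log((n/2)^{n/2}) = (n/2)·log(n/2).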
//Lecture 20f
Assumptions :
• Maximum value of any key is <= k
1) Counting sort : it takes each value and pushes it into the bucket with that index; then it pops the elements from the buckets left to right and inserts them into a new array.
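A self-contained C sketch of this push/pop description (countingSort and the 0..k key range are assumptions of the sketch; it sorts bare keys, so stability is not visible here):

#include <stdio.h>
#include <string.h>

/* Sort n keys assumed to lie in 0..k */
void countingSort(int a[], int n, int k){
    int count[k + 1]; /* one bucket per possible key */
    memset(count, 0, sizeof(count));
    for(int i = 0; i < n; i++) count[a[i]]++; /* push each value into its bucket */
    int idx = 0;
    for(int v = 0; v <= k; v++) /* pop buckets from left to right */
        while(count[v]-- > 0) a[idx++] = v;
}

int main(){
    int a[] = {3, 1, 4, 1, 5, 9, 2, 6};
    countingSort(a, 8, 9);
    for(int i = 0; i < 8; i++) printf("%d ", a[i]); /* 1 1 2 3 4 5 6 9 */
    return 0;
}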
2) Radix sort : we perform counting sort on the least-significant digit first, then perform counting sort on the next least-significant digit, and so on…
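A C sketch of this LSD radix sort for base-10 integers, using one stable counting pass per digit (radixSort/countingPass are names of this sketch):

#include <stdio.h>
#include <string.h>

/* One stable counting-sort pass on the digit selected by exp (1, 10, 100, ...) */
void countingPass(int a[], int n, int exp){
    int out[n], count[10] = {0};
    for(int i = 0; i < n; i++) count[(a[i]/exp) % 10]++;
    for(int d = 1; d < 10; d++) count[d] += count[d-1]; /* prefix sums give end positions */
    for(int i = n-1; i >= 0; i--) /* backwards scan keeps the pass stable */
        out[--count[(a[i]/exp) % 10]] = a[i];
    memcpy(a, out, n * sizeof(int));
}

void radixSort(int a[], int n, int maxVal){
    for(int exp = 1; maxVal/exp > 0; exp *= 10) /* least-significant digit first */
        countingPass(a, n, exp);
}

int main(){
    int a[] = {170, 45, 75, 90, 802, 24, 2, 66};
    radixSort(a, 8, 802);
    for(int i = 0; i < 8; i++) printf("%d ", a[i]); /* 2 24 45 66 75 90 170 802 */
    return 0;
}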
How long does each iteration take ? – O(n + k), but if we handle one base-10 digit per pass then k = 9, so each pass is O(n) only.
Q : How good is O(nd) ? – O(nd) isn't so great if we are sorting n integers in base 10, each of which is in the range {1, 2, 3, 4, …, n^t}: then the number of digits is d = Θ(t log n), so O(nd) = O(t · n log n).
3. Graphs in algorithms
//Lecture 21a
Notations :
Graph can be represented as G = (V, E) where V = nodes (or vertices) and E = edges.
Undirected-graph connectivity :
An undirected graph is connected if for all pairs of vertices u, v there exists a path from u to v.
An undirected graph is complete, aka fully connected, if for all pairs of vertices u, v there exists an edge from u to v.
A directed graph is strongly connected if there is a path from every vertex to every other vertex.
A directed graph is weakly connected if there is a path from every vertex to every other vertex ignoring
direction of edges.
NOTE : if a connected graph is mentioned without specifying directed or undirected, consider it undirected, because the plain concept of connectedness exists only for undirected graphs; for directed graphs we have the notions of strongly and weakly connected instead.
Storing graphs : Need to store both the sets of nodes V and set of edges E.
Adjacency list :
//Lecture 21b
In the adjacency list you may have observed that the number of external nodes equals 2E in the case of an undirected graph (because each list's length is the degree of that node), and in the case of a directed graph, external nodes = E = the sum of the out-degrees of all vertices.
Algorithm runtime : graph algorithm runtimes depend on |V| and |E|, and the relation between V and E depends on the density of the graph. Sometimes we have a complete graph, which is dense; in that case |E| ≈ |V|². If the graph is sparse, |E| ≈ |V|.
Commonly used search methods : breadth-first search and depth-first search.
The brute force method is to select one node and put it into a bag, then repeatedly extract some node from the bag; if it is unmarked, mark it and put the neighbors of that extracted node into the bag. Repeat this process till all the nodes are marked.
Whatever-first-search(G, s) :
    put s in the bag
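The rest of the loop was a figure in the notes; a sketch consistent with the description above (a stack bag gives DFS, a queue bag gives BFS):

    while the bag is not empty :
        take a node u out of the bag
        if u is unmarked :
            mark u
            for each neighbor v of u : put v in the bag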
The thumb rule is: if there are no new nodes to visit during the traversal, then backtrack.
Suppose we define a start time or discover time (the time at which we visit a vertex for the first time) and a finish time (the time at which we visit a vertex for the last time).
//Lecture 21d
1) Implementing DFS :
DFS_Visit(u) :
visited[u] = true
for each v adjacent to u :
if not visited[v] : DFS_Visit(v)
Above function will mark nodes and visits its neighbors and mark them as well.
DFS(G) :
for each vertex u in vertices
visited[u] = false
for each u in vertices
if(visited[u] == false) DFS_Visit(u)
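A sketch of DFS_Visit augmented with the discover/finish times defined earlier (the global time counter and the u.d / u.f fields follow the notation of the theorem below):

DFS_Visit(u) :
    visited[u] = true
    time = time + 1; u.d = time //discover time
    for each v adjacent to u :
        if not visited[v] : DFS_Visit(v)
    time = time + 1; u.f = time //finish time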
//Lecture 22a
In any depth-first search of a directed or undirected graph G = (V, E), for any two vertices u and v, exactly one of the following three conditions holds :
• The intervals [u.d, u.f] and [v.d, v.f] are entirely disjoint, and neither u nor v is a descendant of the other in the depth-first forest;
• The interval [u.d, u.f] is contained entirely within the interval [v.d, v.f], and u is a descendant of v in a depth-first tree; or
• The interval [v.d, v.f] is contained entirely within the interval [u.d, u.f], and v is a descendant of u in a depth-first tree.
Q : Given the following DFS of a directed graph, reconstruct the DFS tree.
//Lecture 22b
Forward edge : an edge which is not a tree edge and goes from an ancestor to a descendant.
Back edge : not a tree edge; goes from a descendant to an ancestor.
Cross edge : not a tree edge; connects vertices where neither is an ancestor or descendant of the other.
//Lecture 23a
Topological sort :
Q : How to find articulation points through a DFS-based approach ? – we can have several cases:
Case 1 : When can the root of the DFT be an AP ? – only when it has more than one child; if it has one child, then the root's deletion cannot disconnect the tree, so it can't be an AP.
Case 2 : When can a non-root of the DFT be an AP ? – if and only if none of its descendants is directly connected to an ancestor. For example,
But what if 6 has two children and only one of them is connected to 6's ancestor ? In that case 6 is still an AP, yet according to the previous sentence 6 can't be an AP, so we have to modify the above case 2 statement.
A non-root node of the DFT is an AP if and only if there exists at least one subtree of that node such that none of its descendants is directly connected to an ancestor of the node.
Now, talking about the implementation: we just note down the start time of each node in one array, and use another array to maintain the lowest start time reachable from a particular node (this keeps track of back edges).
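A pseudocode sketch of those two arrays in action (the names disc[] = start time and low[] = lowest reachable start time are mine):

AP_DFS(u, parent) :
    time = time + 1; disc[u] = low[u] = time
    children = 0
    for each v adjacent to u :
        if disc[v] == 0 : //tree edge, v not yet visited
            children = children + 1
            AP_DFS(v, u)
            low[u] = min(low[u], low[v])
            if parent != NIL and low[v] >= disc[u] : mark u as AP //case 2
        else if v != parent :
            low[u] = min(low[u], disc[v]) //back edge
    if parent == NIL and children > 1 : mark u as AP //case 1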
//Lecture 24a
BFS(G, v) :
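The body was a figure in the notes; a standard queue-based sketch using the same visited[] convention as DFS:

    visited[u] = false for all u; visited[v] = true
    enqueue v
    while the queue is not empty :
        u = dequeue
        for each w adjacent to u :
            if not visited[w] :
                visited[w] = true
                enqueue w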
//Lecture 24c
2) Application of BFS :
Bipartite graph : a graph is bipartite if and only if there is no odd-length cycle.
Now, we know that in a BFS tree only tree and cross edges exist, so we have two cases:
Case 1 : no cross edge within a level; then the graph is definitely bipartite, because we can always combine alternate levels into one group.
Talking about implementation: we check for cross edges between nodes in the same level; if one exists, the graph is not bipartite.
Finding components : both BFS and DFS can be used to find the number of components of a graph. For DFS we have already seen how; for BFS we maintain an array in which we assign to each vertex a number corresponding to its component, and any vertex still having the number 0 implies a separate, not-yet-visited component.
We simply maintain a global array of vertices which keeps track of visited vertices. So we can use both BFS and DFS for finding a path between any two vertices.
4. Greedy algorithms
//Lecture 26a
Suppose you go to a shop and the shopkeeper asks you for change of 30 rupees. Currently you have one 25 Rs. note and three 10 Rs. notes. You follow the greedy algorithm, which says: select the best possible option available at the moment. Your objective is to give the change using as few notes as possible, so you select the 25 Rs. note, which looks best at the moment; but now you do not have 5 Rs. in change, so this method fails (while 10 + 10 + 10 works).
Here we talk about the single source shortest path problem, in which we have to find the shortest path in terms of weights from a single source to the final destination.
Attempt 1 : find the shortest among all possible paths. This is the brute force approach. But consider one scenario:
Attempt 2 : use BFS. But when we form the tree, BFS does not care about the weights. However, we can always expand a weighted edge into a chain of dummy nodes whose edges have weight 1; therefore we can always transform the graph into another graph. For example,
Final attempt : we go to a particular node from the source, try to find the min distance from the source to that node, and repeat the process till we have traversed every node. This is called Dijkstra's algorithm.
In Dijkstra's algorithm each node has one value which denotes the shortest distance known at that moment in time. Initially they are all ∞ (ideally); after visiting, we update the value.
So we pick the min, go to that node, then relax all its outgoing edges, and we repeat this process until we have visited all nodes. Relax means: offer the (possibly smaller) distance through the selected node to its neighbor.
RELAX(u, v, w) :
    if (d[v] > d[u] + w){
        d[v] = d[u] + w //was d[v] + w in the original, a bug: take the shorter path through u
        parent[v] = u
    }
Dijkstra code :
Dijkstra(G, s):
1 key[v] = ∞ for all v in V
2 key[s] = 0
3 S = ∅
4 initialize priority queue Q to all vertices
5 while Q is not empty
6     u = EXTRACT-MIN(Q)
7     S = S U {u}
8     for each v adjacent to u
9         RELAX(u, v, w(u, v))
Time complexity :
One thing you may have noticed is that we relax each edge exactly once.
The algorithm maintains the min-priority queue Q by calling three priority-queue operations: INSERT (implicit in line 4), EXTRACT-MIN (line 6), and DECREASE-KEY (implicit in RELAX, which is called in line 9). With a binary min-heap each operation costs O(log V), for O((V + E) log V) in total.
In the first iteration Dijkstra will assume that 2 is the min, and that there is no way a shorter path goes through node 4, because a subpath of a shortest path should itself be a shortest path. So it will ignore node 3 by removing it from the queue. So how do we overcome this issue?
Idea : add some weight to every edge to make all weights non-negative and then apply Dijkstra ? – This looks correct, but it is not: adding a constant penalizes paths with more edges by a larger total amount, so the shortest path can change.
You may encounter a cycle in the graph, but note that from the beginning we have assumed the graph is directed; if the graph is undirected, then even one negative edge creates a negative cycle. For example,
We want to build a shortest path algorithm that can at least detect negative weight cycles, because if there is a negative cycle then the computation may go into an infinite loop.
//Lecture 28a
You can apply Dijkstra's algorithm here, but the special thing about this graph is that even if a negative weight edge is present we can still always find the shortest path, because of the absence of cycles.
One thing to note: when we arrive at node "a" we are sure its distance is the shortest possible, so there is no need to reach a from any other vertex, because there is no cycle. Therefore we can say that Dijkstra = pick the min and relax its outgoing edges, while DAG shortest path = pick nodes in topological order and relax their outgoing edges.
There is one more algorithm, Bellman-Ford : relax all edges, in any order, (V − 1) times.
//Lecture 28b
4.1.3) Bellman-Ford algorithm :
If we relax edges in the order in which they appear along the shortest path (with other relaxations intermixed), then we will get the shortest path cost.
This is true because there always exists one shortest-path relaxation sequence. If we insert more relaxations in between this sequence, the final shortest path cost will still be the same. To find the shortest path, that relaxation sequence must at least be covered. This is called the path-relaxation property.
For example,
So, if you relax all the elements of the relaxation path sequence in order, then we are sure all its edges are relaxed and we have the shortest path cost. Therefore we relax all the edges repeatedly: the shortest path can have at most V − 1 edges, so we run the relaxation pass over every edge V − 1 times (here 4 times).
Example,
We can extend the relaxation idea to find a negative cycle in the graph. We first relax all edges V − 1 times; then we know we definitely have the shortest path costs. If we do one more pass it should not matter, as all edges are already relaxed. But if there is a negative cycle, one more pass will produce a different (smaller) cost at some nodes. So after the V − 1 passes we do one extra pass just to check for a negative cycle: if every cost is the same, no negative cycle is present; if some cost differs, a negative cycle definitely exists.
Bellman_ford(G, s):
    d[v] = ∞ for all v in V
    d[s] = 0
    for i = 1 to V - 1
        for each edge (u, v) in E
            RELAX(u, v, w)
    for each edge (u, v) in E //extra checking pass
        if(d[v] > d[u] + w)
            return false //negative cycle detected
    return true
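Time complexity of Bellman-Ford: each pass relaxes all E edges, and there are V − 1 passes plus one checking pass, so the total is O(VE).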
Dijkstra is fast but does not work with negative weight edges.
If we run the relaxation pass k times, we have shortest paths to all vertices whose shortest paths use at most k edges.
//Lecture 28d
Q : Why does Bellman-Ford work ? – Suppose below is a shortest path:
Claim : our algorithm will discover this shortest path within k passes, where k is its number of edges.
In the 1st pass : we are sure that at some point the edge s–v1 is relaxed, and d[v1] never changes afterwards because it is shortest.
In the 2nd pass : we are sure that somewhere in the relaxation sequence the edge v1–v2 is relaxed, and its min cost never changes afterwards because it is made up of two shortest subpaths.
The same holds up to the kth pass : we are sure the edge v(k−1)–vk is definitely relaxed. Thus we have relaxed the whole shortest path by the kth pass.