Lec 01 Why Ds
IITB India
cbna CS213/293 Data Structure and Algorithms 2025 Instructor: Ashutosh Gupta IITB India 1
Next course in programming
In CS101, you learned to walk. In this course, you will learn to dance.
What is data?
Example 1.1
Age of people, height of trees, price of stocks, and number of likes.
Data is big!
We are living in the age of big data!
Exercise 1.1
1. Estimate the number of messages exchanged for status level in WhatsApp.
2. How much text data was used to train ChatGPT?
We need to work on data
Example 1.2
1. Predict the weather
2. Find a webpage
3. Recognize fingerprint
Exercise 1.2
How much time do we need to find an element in an array?
Problems
Definition 1.1
A problem is a pair of an input specification and an output specification.
Example 1.3
The problem of search consists of the following specifications
▶ Input specification: an array S of elements and an element e
▶ Output specification: position of e in S if it exists. If it is not found, return -1.
Exercise 1.3
According to the specification, what should happen if e occurs multiple times in S?
Algorithms
Definition 1.2
An algorithm solves a given problem.
▶ Input ∈ Input specifications
▶ Output ∈ Output specifications
Exercise 1.4
1. What is an algorithm?
2. How is it different from a program?
Commentary: An algorithm is a step-by-step process that processes a small amount of data in each step and eventually computes the output. The formal definition of the
algorithm will be presented to you in CS310. It took the genius of Alan Turing to give the precise definition of an algorithm.
Example: an algorithm for search
Example 1.4
int search ( int * S , int n , int e ) {
// n is the length of the array S
// We are looking for element e in S
for ( int i =0; i < n ; i ++ ) {
if ( S [ i ] == e ) {
return i ;
}
}
return -1; // Not found
}
Exercise 1.5
What is the running time of the above algorithm if e is not in S?
Commentary: Answer: We count memory accesses, arithmetic operations (including comparisons), assignments, and jumps. The loop in the program will iterate n
times. In each iteration, there will be one memory access S[i], three arithmetic operations i<n, S[i] == e, and i++, and two jumps. At the initialization, there is an
assignment i=0. For the loop exit, there will be one more comparison and jump. Time = $n T_{Read} + (3n + 2) T_{Arith} + (2n + 1) T_{jump} + T_{return}$.
Give this program to https://godbolt.org/ and see the assembly. Check if the above analysis is faithful!
Data needs structure
Storing data as a pile of stuff will not work. We need structure.
Example 1.5
Store files in the order of the year. How do we store data at IIT Bombay Hospital?
Structured data helps us solve problems faster
We can exploit the structure to design efficient algorithms to solve our problems.
Example: search on well-structured data
Example 1.6
Let us consider the problem of search consisting of the following specifications
▶ Input specification: a non-decreasing array S and an element e
▶ Output specification: Position of e in S. If not found, return −1.
A better search
Big-O notation
How much resource does an algorithm need?
Commentary: Sometimes there is a trade-off between time and space. For example, the inefficient linear search only needed one extra integer, but the binary search used
three extra integers. The difference between two integers may be a minor issue, but it illustrates the trade-off.
Input size
We define the rough size of the input, usually in terms of important parameters of input.
Example 1.8
In the problem of search, we say that the number of elements in the array is the input size.
Please note that the size of individual elements is not considered. (Why?)
Commentary: Ideally, the number of bits in the binary representation of the input is the size, which is too detailed and cumbersome to handle. In the case of search, we
assume that elements are drawn from the space of size $2^{32}$ and can be represented using 32 bits. Therefore, the type of the element was int.
Best/Average/Worst case
For a given size of inputs, we may further make the following distinction.
1. Best case: Shortest running time for some input.
2. Worst case: Longest running time for some input.
3. Average case: Average running time on all the inputs of the given size.
Exercise 1.7
How can we modify almost any algorithm to have a good best-case running time?
Example: Best/Average/Worst case
Example 1.9
int BinarySearch ( int * S , int n , int e ){
  // S is a sorted array
  int first = 0 , last = n ;
  int mid = ( first + last ) / 2;
  while ( first < last ) {
    if ( S [ mid ] == e ) return mid ;
    if ( S [ mid ] > e ) {
      last = mid ;
    } else {
      first = mid + 1;
    }
    mid = ( first + last ) / 2;
  }
  return -1;
}
In BinarySearch, let $n = 2^k - 1$.
1. Best case: e == S[n/2]
   $T_{Read} + 6T_{Arith} + T_{return}$
2. Worst case: e ∉ S
   We have seen the worst case.
3. The average case is roughly equal to the worst case because most often the loop will iterate k times. (Why?)
Commentary: Analyzing the average case is usually involved. For some important algorithms, we will do a detailed average time analysis.
Asymptotic behavior
For short inputs, an algorithm may use a shortcut for better running time.
To avoid such false comparisons, we look at the behavior of the algorithms in the limit.
Big-O notation: approximate measure
Definition 1.3
Let f and g be functions N → N. We say f(n) ∈ O(g(n)) if there are
c and n0 such that f(n) ≤ c·g(n) for all n ≥ n0.
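Commentary: A small worked instance may help in reading the definition. It assumes the usual reading, f(n) ≤ c·g(n) for all n ≥ n0, and uses a function chosen only for illustration (it is not one of the exercise items). The witnesses c = 4 and n0 = 4 are one choice among many.

```latex
\begin{align*}
f(n) &= 3n + 4, \qquad g(n) = n, \qquad c = 4, \qquad n_0 = 4 \\
3n + 4 &\le 3n + n = 4n = c \cdot g(n) \quad \text{for all } n \ge 4
\end{align*}
```

Hence 3n + 4 ∈ O(n). Any larger c or n0 works as well, which is why the definition asks only for the existence of such constants.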
Exercise 1.8
Which of the following are the true statements?
▶ $5n + 8 \in O(n)$
▶ $5n + 8 \in O(n^2)$
▶ $5n^2 + 8 \in O(n)$
▶ $n^2 + n \in O(n^2)$
▶ $500000000000000000000000n^2 \in O(n^2)$
▶ $50n^2 \log n + 60n^2 \in O(n^2 \log n)$
Example: Big-O of the worst case of BinarySearch
Example 1.10
Exercise 1.9
Prove that if f ∈ O(g) and g ∈ O(h), then f ∈ O(h).
What does Big O say?
Expresses the approximate number of operations executed by the program as a function of input size
Hierarchy of algorithms
▶ O(log n) algorithm is better than O(n)
▶ We say $O(\log n) < O(n) < O(n^2) < O(2^n)$
Complexity of a problem
The complexity of a problem is the complexity of the best-known algorithm for the problem.
Exercise 1.10
What is the complexity of the following problem?
▶ sorting an array: $O(n^2)$ ✗
▶ matrix multiplication: $O(n^3)$ ✗ (Best algorithm is still not known)
Exercise 1.11
What is the best-known complexity for the above problems?
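Θ notation
Definition 1.4
Let f and g be functions N → N. We say f(n) ∈ Θ(g(n)) if there are c1, c2, and n0 such that c1·g(n) ≤ f(n) ≤ c2·g(n) for all n ≥ n0.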
There are more variations of the above definition. Please look at the end.
Exercise 1.12
a. Does the worst-case complexity of BinarySearch belong to Θ(log n)?
b. If yes, give c1 , c2 , and n0 for the application of the above definition on BinarySearch.
Names of complexity classes
▶ Constant: O(1)
▶ Logarithmic: $O(\log n)$
▶ Linear: $O(n)$
▶ Quadratic: $O(n^2)$
▶ Polynomial: $O(n^k)$ for some given k
▶ Exponential: $O(2^n)$
Topic 1.2
Tutorial Problems
Problem: Compute the exact running time of insertion sort.
Exercise 1.13
The following is the code for insertion sort. Compute the exact worst-case running time of the
code in terms of n and the cost of doing various machine operations.
for ( int j = 1; j < n ; j ++ ) {
  int key = A [ j ];
  int i = j -1;
  while ( i >= 0 ) {
    if ( A [ i ] > key ) {
      A [ i +1] = A [ i ];
    } else {
      break ;
    }
    i --;
  }
  A [ i +1] = key ;
}
Problem: additions and multiplication
Exercise 1.14
What is the time complexity of binary addition and multiplication? How much time does it take to
do unary addition?
Problem: hierarchy of complexity
Exercise 1.15
Given $f(n) = a_0 n^0 + \dots + a_d n^d$ and $g(n) = b_0 n^0 + \dots + b_e n^e$ with $d > e$ and $a_d > 0$ (Why?), show
that $f(n) \notin O(g(n))$.
Topic 1.3
Problems
True or False
Exercise 1.16
Mark the following statements True / False. Also provide justification.
1. For any function f : N → N, O(f ) ⊆ Ω(f ).
2. For a fixed array of size $2^k$ for integer k, the binary search always takes the same amount of time in the case of an unsuccessful search.
Order of functions
Exercise 1.17
▶ If $f(n) \le F(n)$ and $G(n) \ge g(n)$ (in order sense) then show that $\frac{f(n)}{G(n)} \le \frac{F(n)}{g(n)}$.
▶ Is f (n) the same order as f (n)|sin(n)|?
Exercise: an important complexity class!
Exercise 1.18
Prove that O(log(n!)) = O(n log n). Hint: Stirling’s approximation
Exercise: egg drop problem**
Exercise 1.19
In the dead of night, a master jewel thief is plotting the heist of a lifetime: stealing the most valuable
Faberge Egg from a towering 100-story museum. Each floor of the building has an identical egg, but the
higher the floor, the more valuable the egg becomes. However, there’s a catch. The thief can steal only one
egg, and she knows that the most valuable egg at the top may not survive a drop from such a great height.
To avoid smashing her prized loot, she must identify the highest floor from which an egg can be dropped
without breaking. Armed with two replica eggs from the museum’s gift shop (perfectly identical but utterly
worthless), the thief devises a plan. These two eggs will be her test subjects, sacrificed in the pursuit of the
perfect drop. But time is of the essence, and the thief cannot afford to be caught by the museum guards.
She needs to figure out the minimum number of test drops required to guarantee finding the highest safe
floor. Once an egg is broken, it’s gone for good: no replacements, no second chances. She cannot use any
other method to determine the sturdiness of the eggs.
a. Give an algorithm for the thief to determine, with the least number of drops in the worst case, the
highest floor from which an egg can be safely dropped without breaking. (Quiz 2024)
b. Give an algorithm for the best average case, assuming that the probability of the highest safe floor is
uniformly distributed.
c. Prove optimality of your algorithm.***
Commentary: https://www.youtube.com/watch?v=NGtt7GJ1uiM
Identities for Big-O (Midsem 2024)
Definition 1.5
Let A and B be subsets of p(N → N). A + B = {f + g |f ∈ A ∧ g ∈ B}.
Exercise 1.20
Prove/Disprove the following:
▶ O(f )g ⊆ O(fg )
▶ O(f ) + O(g ) ⊆ O(f + g )
Exercise 1.21
Can we give examples when the above subset relations are strict?
Topic 1.4
Ω notation
Small-o, ω notation
Exercise 1.22
a. Prove that f ∈ o(g ) implies f ∈ O(g ).
b. Show that f ∈ O(g ) does not imply f ∈ o(g ).
Size of functions
We can define a partial order over functions using the above notations
▶ f (n) ∈ O(g (n)) implies f (n) ≤ g (n)
▶ f (n) ∈ o(g (n)) implies f (n) < g (n)
▶ f (n) ∈ Ω(g (n)) implies f (n) ≥ g (n)
▶ f (n) ∈ ω(g (n)) implies f (n) > g (n)
▶ f (n) ∈ Θ(g (n)) implies f (n) = g (n)
Exercise 1.23
Show that the partial order is well-defined.
End of Lecture 1