CS213/293 Data Structure and Algorithms 2025

Lecture 1: Why should you study data structures?

Instructor: Ashutosh Gupta

IITB India

Compile date: 2025-01-30

cbna CS213/293 Data Structure and Algorithms 2025 Instructor: Ashutosh Gupta IITB India 1
Next course in programming

Are CS101 and SSL not enough to be a programmer?

In CS101, you learned to walk. In this course, you will learn to dance.

What is data?

Things are not data, but information about them is data.

Example 1.1
Age of people, height of trees, price of stocks, and number of likes.

Data is big!
We are living in the age of big data!

*Image is from the Internet.

Exercise 1.1
1. Estimate the number of messages exchanged for status level in Whatsapp.
2. How much text data was used to train ChatGPT?

We need to work on data

We process data to solve our problems.

Example 1.2
1. Predict the weather
2. Find a webpage
3. Recognize fingerprint

Disorganized data will need a lot of time to process.

Exercise 1.2
How much time do we need to find an element in an array?

Problems

Definition 1.1
A problem is a pair of an input specification and an output specification.

Example 1.3
The problem of search consists of the following specifications
▶ Input specification: an array S of elements and an element e
▶ Output specification: position of e in S if it exists. If it is not found, return -1.

Output specifications refer to the variables in the input specifications.

Exercise 1.3
According to the specification, what should happen if e occurs multiple times in S?

Algorithms

Definition 1.2
An algorithm solves a given problem.
▶ Input ∈ Input specifications
▶ Output ∈ Output specifications

Input → Algorithm → Output

Note: There can be many algorithms to solve a problem.

Exercise 1.4
1. What is an algorithm?
2. How is it different from a program?
Commentary: An algorithm is a step-by-step process that processes a small amount of data in each step and eventually computes the output. The formal definition of the algorithm will be presented to you in CS310. It took the genius of Alan Turing to give the precise definition of an algorithm.

Example: an algorithm for search

Example 1.4
int search(int *S, int n, int e) {
    // n is the length of the array S
    // We are looking for element e in S
    for (int i = 0; i < n; i++) {
        if (S[i] == e) {
            return i;
        }
    }
    return -1; // Not found
}
Exercise 1.5
What is the running time of the above algorithm if e is not in S?
Commentary: Answer: We count memory accesses, arithmetic operations (including comparisons), assignments, and jumps. The loop in the program will iterate n times. In each iteration, there will be one memory access S[i], three arithmetic operations i<n, S[i] == e, and i++, and two jumps. At the initialization, there is an assignment i=0. For the loop exit, there will be one more comparison and jump.
Time = $n T_{Read} + (3n + 2) T_{Arith} + (2n + 1) T_{jump} + T_{return}$
Give this program to https://godbolt.org/ and see the assembly. Check if the above analysis is faithful!
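One way to check the count above is to instrument the search with explicit counters. The following is a minimal sketch and not part of the original slides: the struct OpCount, its field names, and the function search_counted are illustrative names, and jumps are left uncounted to keep it short.

```c
#include <assert.h>

/* Operation counters (illustrative names, not from the slides). */
typedef struct {
    long reads;    /* memory accesses S[i] */
    long arith;    /* comparisons and increments */
    long assigns;  /* assignments such as i = 0 */
} OpCount;

/* Linear search from Example 1.4 with counters added.
   Returns the position of e in S, or -1 if e is not found. */
int search_counted(const int *S, int n, int e, OpCount *c) {
    assert(n >= 0);
    c->reads = 0;
    c->arith = 0;
    c->assigns = 0;

    int i = 0;
    c->assigns++;              /* assignment i = 0 */
    while (1) {
        c->arith++;            /* comparison i < n */
        if (i >= n) break;
        c->reads++;            /* memory access S[i] */
        c->arith++;            /* comparison S[i] == e */
        if (S[i] == e) return i;
        i++;
        c->arith++;            /* increment i++ */
    }
    return -1;
}
```

For an array of n = 4 elements that does not contain e, the counters end at reads = 4, arith = 3·4 + 1 = 13, and assigns = 1. Adding the single assignment to the arithmetic count gives the 3n + 2 of the formula above, assuming that is how the slide folds it in.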
Data needs structure
Storing data as a pile of stuff will not work. We need structure.

Example 1.5
Store files in the order of the year. How do we store data at IIT Bombay Hospital?

Structured data helps us solve problems faster

We can exploit the structure to design efficient algorithms to solve our problems.

The goal of this course!

Example: search on well-structured data

Example 1.6
Let us consider the problem of search consisting of the following specifications
▶ Input specification: a non-decreasing array S and an element e
▶ Output specification: Position of e in S. If not found, return −1.

Example: search on well-structured data

Let us see how we can exploit the structured data!

Let us try to search for 68 in the following array.

Index:  0  1  2  3  4  5  6  7  8  9 10
Value: 11 21 35 46 49 60 68 73 81 90 91

▶ Look at the middle point of the array.
▶ Since the value at the middle point is less than 68, we search only in the upper half.
▶ We have halved our search space.
▶ We recursively halve the space.
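The halving steps above can be traced in code. This is a sketch under stated assumptions: the function name binary_search_trace and the probes/num_probes parameters are hypothetical and only record which indices get inspected. The midpoint is written as first + (last - first) / 2, which gives the same index as (first + last) / 2 for non-negative values but avoids overflow of the sum.

```c
#include <assert.h>

/* Binary search on a non-decreasing array S of length n.
   Each inspected index is stored in probes[] (the caller provides room
   for at least 32 entries) and the number of inspected indices in
   *num_probes. Returns the position of e in S, or -1 if not found. */
int binary_search_trace(const int *S, int n, int e,
                        int *probes, int *num_probes) {
    assert(n >= 0);
    int first = 0, last = n;   /* search the half-open range [first, last) */
    *num_probes = 0;
    while (first < last) {
        int mid = first + (last - first) / 2;
        probes[(*num_probes)++] = mid;
        if (S[mid] == e) return mid;
        if (S[mid] > e) {
            last = mid;        /* continue in the lower half */
        } else {
            first = mid + 1;   /* continue in the upper half */
        }
    }
    return -1;
}
```

On the 11-element array above with e = 68, the inspected indices are 5, 8, 7, and 6 (values 60, 81, 73, 68), and the function returns position 6.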

A better search

Example 1.7
int BinarySearch(int *S, int n, int e) {
    // S is a sorted array
    int first = 0, last = n;
    int mid = (first + last) / 2;
    while (first < last) {
        if (S[mid] == e) return mid;
        if (S[mid] > e) {
            last = mid;
        } else {
            first = mid + 1;
        }
        mid = (first + last) / 2;
    }
    return -1;
}

Exercise 1.6
Let $n = 2^{k-1}$. How much time will it take to run the above algorithm if S[0] > e?

Commentary: Answer: There will be k iterations. In each iteration, the function will follow the same path. In each iteration, there will be
▶ a memory access S[mid] (why only one?),
▶ five arithmetic operations first < last, S[mid] == e, S[mid] > e, first+last, and ../2,
▶ one assignment last = mid (Why?),
▶ three jumps because of two ifs and a loop ending.
For the loop exit, there will be one additional comparison and a jump at the loop head. In the initialization section, we have two assignments and two arithmetic operations.
Time = $k T_{Read} + (6k + 5) T_{Arith} + (3k + 1) T_{jump} + T_{return}$
Topic 1.1

Big-O notation

How much resource does an algorithm need?

There can be many algorithms to solve a problem.

Some are good and some are bad.

Good algorithms are efficient in


▶ time and
▶ space.

Our method of measuring time is cumbersome and machine-dependent.

We need approximate counting that is machine-independent.

Commentary: Sometimes there is a trade-off between time and space. For example, the inefficient linear search only needed one extra integer, but the binary search used three extra integers. The difference between two integers may be a minor issue, but it illustrates the trade-off.

Input size

An algorithm may have different running times for different inputs.

How do we think about comparing algorithms?

We define the rough size of the input, usually in terms of important parameters of input.

Example 1.8
In the problem of search, we say that the number of elements in the array is the input size.

Please note that the size of individual elements is not considered. (Why?)

Commentary: Ideally, the number of bits in the binary representation of the input is the size, which is too detailed and cumbersome to handle. In the case of search, we assume that elements are drawn from the space of size $2^{32}$ and can be represented using 32 bits. Therefore, the type of the element was int.

Best/Average/Worst case

For a given size of inputs, we may further make the following distinction.
1. Best case: Shortest running time for some input.
2. Worst case: Longest running time for some input.
3. Average case: Average running time on all the inputs of the given size.

Exercise 1.7
How can we modify almost any algorithm to have a good best-case running time?

Example: Best/Average/Worst case
Example 1.9
int BinarySearch(int *S, int n, int e) {
    // S is a sorted array
    int first = 0, last = n;
    int mid = (first + last) / 2;
    while (first < last) {
        if (S[mid] == e) return mid;
        if (S[mid] > e) {
            last = mid;
        } else {
            first = mid + 1;
        }
        mid = (first + last) / 2;
    }
    return -1;
}

In BinarySearch, let $n = 2^{k-1}$.
1. Best case: e == S[n/2]. Time = $T_{Read} + 6T_{Arith} + T_{return}$
2. Worst case: e ∉ S. We have seen the worst case.
3. The average case is roughly equal to the worst case because most often the loop will iterate k times. (Why?)

Commentary: Analyzing the average case is usually involved. For some important algorithms, we will do a detailed average time analysis.
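The claim about the average case can be explored empirically. The sketch below counts loop iterations of BinarySearch and averages them over one successful search per element of S. Both function names are hypothetical, and "average" here assumes every element of S is equally likely to be the one searched for, which is only one possible reading of the average case.

```c
#include <assert.h>

/* Number of loop iterations BinarySearch (Example 1.9) performs
   on the input (S, n, e). */
int iterations(const int *S, int n, int e) {
    int first = 0, last = n, count = 0;
    int mid = (first + last) / 2;
    while (first < last) {
        count++;
        if (S[mid] == e) return count;
        if (S[mid] > e) {
            last = mid;
        } else {
            first = mid + 1;
        }
        mid = (first + last) / 2;
    }
    return count;
}

/* Average number of iterations over the n successful searches,
   one for each element of S (assumes the elements are distinct). */
double average_successful_iterations(const int *S, int n) {
    assert(n > 0);
    long total = 0;
    for (int i = 0; i < n; i++) {
        total += iterations(S, n, S[i]);
    }
    return (double)total / n;
}
```

For n = 1024 (so k = 11), an unsuccessful search with S[0] > e takes 11 iterations, while the average over the 1024 successful searches comes out at about 9, since about half of the elements are found only on the tenth iteration.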
Asymptotic behavior

For short inputs, an algorithm may use a shortcut for better running time.

To avoid such false comparisons, we look at the behavior of the algorithms in the limit.

Ignore hardware-specific details
▶ Round numbers: 100000000000001 ≈ 100000000000000
▶ Ignore coefficients: $3k\,T_{Arith} \approx k$

Big-O notation: approximate measure
Definition 1.3
Let f and g be functions N → N. We say $f(n) \in O(g(n))$ if there are $c$ and $n_0$ such that

$f(n) \le c\,g(n)$ for all $n \ge n_0$.

▶ In the limit, $c\,g(n)$ will dominate $f(n)$
▶ We say $f(n)$ is $O(g(n))$

[Figure: $f(n)$ and $g(n)$ plotted against $n$, with $n_0$ marked on the horizontal axis.]

Exercise 1.8
Which of the following statements are true?
▶ $5n + 8 \in O(n)$
▶ $5n + 8 \in O(n^2)$
▶ $5n^2 + 8 \in O(n)$
▶ $n^2 + n \in O(n^2)$
▶ $500000000000000000000000n^2 \in O(n^2)$
▶ $50n^2 \log n + 60n^2 \in O(n^2 \log n)$
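Definition 1.3 can be probed numerically by testing a proposed pair of witnesses c and n0 over a finite range. The sketch below is only an aid for building intuition: the type Fn and all function names are hypothetical, and a finite check can refute a proposed pair (c, n0) but can never prove membership in O(g), which requires the inequality for all n ≥ n0.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned long long (*Fn)(unsigned long long);

/* Returns true if f(n) <= c * g(n) for every n in [n0, limit].
   This is a finite spot check, not a proof. */
bool bound_holds(Fn f, Fn g, unsigned long long c,
                 unsigned long long n0, unsigned long long limit) {
    assert(n0 <= limit);
    for (unsigned long long n = n0; n <= limit; n++) {
        if (f(n) > c * g(n)) return false;
    }
    return true;
}

/* Functions taken from Exercise 1.8. */
unsigned long long f_linear(unsigned long long n) { return 5 * n + 8; }
unsigned long long f_quad(unsigned long long n)   { return 5 * n * n + 8; }
unsigned long long g_linear(unsigned long long n) { return n; }
```

For example, bound_holds(f_linear, g_linear, 6, 8, 100000) succeeds because 5n + 8 ≤ 6n exactly when n ≥ 8, while the same call with n0 = 1 fails at n = 1. Trying f_quad against g_linear fails for any fixed c once the range is large enough.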
Example: Big-O of the worst case of BinarySearch

Example 1.10

In BinarySearch, let $n = 2^{k-1}$.

1. Worst case: e ∉ S
$k T_{Read} + (6k + 5) T_{Arith} + (3k + 1) T_{jump} + T_{return} \in O(k)$
Since $k = \log n + 1$, therefore $k \in O(\log n)$.

Therefore, the worst-case running time of BinarySearch is $O(\log n)$. We may also say BinarySearch is $O(\log n)$.
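The step from $k$ to $O(\log n)$ can be spelled out against Definition 1.3. The witnesses $c = 2$ and $n_0 = 2$ below are one possible choice made for illustration, not taken from the slides, and the logarithm is assumed to be base 2.

```latex
\begin{align*}
n = 2^{k-1} &\implies \log_2 n = k - 1 \implies k = \log_2 n + 1 \\
\log_2 n \ge 1 \text{ for } n \ge 2 &\implies k = \log_2 n + 1 \le 2\log_2 n \text{ for all } n \ge 2
\end{align*}
```

So $k \le c \log_2 n$ for all $n \ge n_0$ with $c = 2$ and $n_0 = 2$, which is exactly the condition for $k \in O(\log n)$.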

Exercise 1.9
Prove that if f ∈ O(g) and g ∈ O(h), then f ∈ O(h).

What does Big O say?

Expresses the approximate number of operations executed by the program as a function of input size

Hierarchy of algorithms
▶ O(log n) algorithm is better than O(n)
▶ We say $O(\log n) < O(n) < O(n^2) < O(2^n)$

May hide large constants!!
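A few helper functions make the hierarchy and the warning about constants concrete. This is a small sketch with hypothetical function names. It uses integer arithmetic only, so log2_floor returns the floor of the base-2 logarithm rather than the exact value.

```c
#include <assert.h>
#include <stdint.h>

/* floor(log2(n)) for n >= 1, computed by repeated halving. */
uint64_t log2_floor(uint64_t n) {
    assert(n >= 1);
    uint64_t k = 0;
    while (n > 1) {
        n /= 2;
        k++;
    }
    return k;
}

/* n^2; the caller must keep n small enough to avoid overflow. */
uint64_t quadratic(uint64_t n) {
    return n * n;
}

/* 2^n, valid only for n < 64. */
uint64_t exponential(uint64_t n) {
    assert(n < 64);
    return (uint64_t)1 << n;
}
```

At n = 32 the four growth rates give 5, 32, 1024, and 4294967296 steps. The hidden constant matters, though: an algorithm that costs 1000 log n steps needs 5000 steps at n = 32, more than an algorithm that costs n steps, and only wins for much larger inputs such as n = 1048576, where it needs 20000 steps.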

Complexity of a problem

The complexity of a problem is the complexity of the best-known algorithm for the problem.

Exercise 1.10
What is the complexity of the following problems?
▶ sorting an array: $O(n^2)$ ✗
▶ matrix multiplication: $O(n^3)$ ✗
Best algorithm is still not known.

Exercise 1.11
What is the best-known complexity for the above problems?

Commentary: A discussion on the latest developments in matrix multiplication algorithms. https://en.wikipedia.org/wiki/Computational_complexity_of_matrix_multiplication

Θ-Notation

Definition 1.4 (Tight bound)
Let f and g be functions N → N. We say $f(n) \in \Theta(g(n))$ if there are $c_1$, $c_2$, and $n_0$ such that

$c_1 g(n) \le f(n) \le c_2 g(n)$ for all $n \ge n_0$.

[Figure: $g(n)$, $f(n)$, and $2g(n)$ plotted against $n$, with $n_0$ marked on the horizontal axis.]

There are more variations of the above definition. Please look at the end.
Exercise 1.12
a. Does the worst-case complexity of BinarySearch belong to Θ(log n)?
b. If yes, give $c_1$, $c_2$, and $n_0$ for the application of the above definition on BinarySearch.

Names of complexity classes

▶ Constant: $O(1)$
▶ Logarithmic: $O(\log n)$
▶ Linear: $O(n)$
▶ Quadratic: $O(n^2)$
▶ Polynomial: $O(n^k)$ for some given $k$
▶ Exponential: $O(2^n)$

Topic 1.2

Tutorial Problems

Problem: Compute the exact running time of insertion sort.
Exercise 1.13
The following is the code for insertion sort. Compute the exact worst-case running time of the
code in terms of n and the cost of doing various machine operations.
for (int j = 1; j < n; j++) {
    int key = A[j];
    int i = j - 1;
    while (i >= 0) {
        if (A[i] > key) {
            A[i + 1] = A[i];
        } else {
            break;
        }
        i--;
    }
    A[i + 1] = key;
}
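As a starting point for the exercise, the sort can be wrapped in a function that counts one kind of operation. The sketch below counts only the element shifts A[i+1] = A[i], so it is a hint towards the full operation count rather than the answer. The function name insertion_sort_shifts is hypothetical, and the inner if/break has been folded into the loop condition, which does not change the behavior.

```c
#include <assert.h>

/* Insertion sort from Exercise 1.13, wrapped in a function.
   Sorts A[0..n-1] in place and returns the number of element
   shifts A[i + 1] = A[i] that were performed. */
long insertion_sort_shifts(int *A, int n) {
    assert(n >= 0);
    long shifts = 0;
    for (int j = 1; j < n; j++) {
        int key = A[j];
        int i = j - 1;
        /* Move every element larger than key one slot to the right. */
        while (i >= 0 && A[i] > key) {
            A[i + 1] = A[i];
            shifts++;
            i--;
        }
        A[i + 1] = key;
    }
    return shifts;
}
```

On a strictly decreasing input every key is shifted past all elements before it, so the count is 1 + 2 + ... + (n-1) = n(n-1)/2. For the array {5, 4, 3, 2, 1} this gives 10 shifts, while an already sorted array needs none.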
Problem: additions and multiplication

Exercise 1.14
What is the time complexity of binary addition and multiplication? How much time does it take to
do unary addition?

Problem: hierarchy of complexity

Exercise 1.15
Given $f(n) = a_0 n^0 + \dots + a_d n^d$ and $g(n) = b_0 n^0 + \dots + b_e n^e$ with $d > e$ and $a_d > 0$ (Why?), show that $f(n) \notin O(g(n))$.

Topic 1.3

Problems

True or False

Exercise 1.16
Mark the following statements True / False. Also provide justification.
1. For any function f : N → N, O(f ) ⊆ Ω(f ).
2. For a fixed array of size $2^k$ for integer $k$, the binary search always takes the same amount of time in the case of an unsuccessful search.

Order of functions

Exercise 1.17
▶ If $f(n) \le F(n)$ and $G(n) \ge g(n)$ (in order sense) then show that $\frac{f(n)}{G(n)} \le \frac{F(n)}{g(n)}$.
▶ Is $f(n)$ the same order as $f(n)|\sin(n)|$?

Exercise: an important complexity class!

Exercise 1.18
Prove that O(log(n!)) = O(n log n). Hint: Stirling’s approximation

Exercise: egg drop problem**
Exercise 1.19
In the dead of night, a master jewel thief is plotting the heist of a lifetime: stealing the most valuable Faberge Egg from a towering 100-story museum. Each floor of the building has an identical egg, but the higher the floor, the more valuable the egg becomes. However, there’s a catch. The thief can steal only one egg and she knows that the most valuable egg at the top may not survive a drop from such a great height. To avoid smashing her prized loot, she must identify the highest floor from which an egg can be dropped without breaking. Armed with two replica eggs from the museum’s gift shop (perfectly identical but utterly worthless), the thief devises a plan. These two eggs will be her test subjects, sacrificed in the pursuit of the perfect drop. But time is of the essence, and the thief cannot afford to be caught by the museum guards. She needs to figure out the minimum number of test drops required to guarantee finding the highest safe floor. Once an egg is broken, it’s gone for good: no replacements, no second chances. She cannot use any other method to determine the sturdiness of the eggs.
a. Give an algorithm for the thief to determine, with the least number of drops in the worst case, the
highest floor from which an egg can be safely dropped without breaking. (Quiz 2024)
b. Give an algorithm for the best average case, assuming that the probability of the highest safe floor is
uniformly distributed.
c. Prove optimality of your algorithm.***
Commentary: https://www.youtube.com/watch?v=NGtt7GJ1uiM

Identities for Big-O (Midsem 2024)

Definition 1.5
Let A and B be subsets of p(N → N). A + B = {f + g |f ∈ A ∧ g ∈ B}.

Exercise 1.20
Prove/Disprove the following:
▶ O(f )g ⊆ O(fg )
▶ O(f ) + O(g ) ⊆ O(f + g )

Exercise 1.21
Can we give examples when the above subset relations are strict?

Topic 1.4

Extra slides: More on complexity

Ω notation

Definition 1.6 (Lower bound)


Let f and g be functions N → N. We say f (n) ∈ Ω(g (n)) if there are c and n0 such that

cg (n) ≤ f (n) for all n ≥ n0 .

Small-o,ω notation

Definition 1.7 (Strict Upper bound)


Let f and g be functions N → N. We say f (n) ∈ o(g (n)) if for each c, there is n0 such that

f (n) ≤ cg (n) for all n ≥ n0 .

Definition 1.8 (Strict Lower bound)


Let f and g be functions N → N. We say f (n) ∈ ω(g (n)) if for each c, there is n0 such that

cg (n) ≤ f (n) for all n ≥ n0 .

Exercise 1.22
a. Prove that f ∈ o(g ) implies f ∈ O(g ).
b. Show that f ∈ O(g) does not imply f ∈ o(g).

Size of functions

We can define a partial order over functions using the above notations
▶ f (n) ∈ O(g (n)) implies f (n) ≤ g (n)
▶ f (n) ∈ o(g (n)) implies f (n) < g (n)
▶ f (n) ∈ Ω(g (n)) implies f (n) ≥ g (n)
▶ f (n) ∈ ω(g (n)) implies f (n) > g (n)
▶ f (n) ∈ Θ(g (n)) implies f (n) = g (n)

Exercise 1.23
Show that the partial order is well-defined.

Commentary: Why do we need to prove that the definition is well-defined?

End of Lecture 1
