PDSA Week 1

The document introduces Jupyter notebooks and Google Colab as tools for writing and running code. Jupyter notebooks allow users to write code in cells and run/update individual cells, supporting collaboration and documentation within a project. Google Colab provides a customized Jupyter notebook environment hosted online with preloaded machine learning packages and GPU hardware access.

Introduction to Jupyter notebooks and Google Colab

03 October 2021 00:32

Writing and running code


Manual
Text editor to write code
Run at the command line
Integrated Development Environment (IDE)
Single application to write and run code
On desktop or online (for example, Replit)
Quick update-run cycle
Debugging, testing, …
What more could one want?

Collaboration
Share your code
Collaborative development
Report your results
Documentation
Interleave with the code
Switch between different versions of code
Export and import your project
Preserve your output

Jupyter notebook
A sequence of cells
Like a one dimensional spreadsheet
Cells hold code or text
Markdown notation for formatting
https://www.markdownguide.org/
Edit and re-run individual cells to update environment
Supports different kernels
Julia, Python, R
We will use it only for Python
Widely used to document and disseminate ML projects
Solutions to problems posed on platforms like Kaggle
https://www.kaggle.com
Won ACM Software Systems Award 2017

Google Colab
Google Colaboratory (Colab)
colab.research.google.com
Free to use
Similar to Jupyter notebook, online
Customized Jupyter notebook
All standard packages required for ML are preloaded
scikit-learn, tensorflow
Access to GPU hardware

Python Recap - I
03 October 2021 00:43

Computing gcd
gcd(m, n) - greatest common divisor
Largest k that divides both m and n
gcd(8, 12) = 4
gcd(18, 25) = 1
Also hcf - highest common factor
gcd(m, n) always exists
1 divides both m and n
Computing gcd(m, n)
gcd(m, n) <= min(m, n)
Compute list of common factors from 1 to min(m, n)
Return the last such common factor
Code
def gcd(m, n):
    cf = []  # List of common factors
    for i in range(1,min(m,n)+1):
        if (m%i) == 0 and (n%i) == 0:
            cf.append(i)
    return(cf[-1])

Points to note
Need to initialize cf for cf.append() to work
Variables (names) derive their type from the value they hold
Control flow
Conditionals (if)
Loops (for)
range(i,j) runs from i to j-1
List indices run from 0 to len(l) - 1 and backwards from -1 to -len(l)
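The points about range() and list indices can be checked with a small sketch. The list l and the name x below are illustrative values chosen for this example, not part of the original notes.

```python
l = [10, 20, 30, 40]

print(list(range(1, 5)))     # [1, 2, 3, 4]: range(i,j) stops at j-1
print(l[0], l[len(l) - 1])   # 10 40: first and last, counting forwards
print(l[-1], l[-len(l)])     # 40 10: last and first, counting backwards

x = 5         # x holds an int
x = "five"    # the same name now holds a str
print(type(x).__name__)      # str
```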
Eliminate the list
Since only last element of cf is needed / important
Keep track of most recent common factor (mrcf)
Recall that 1 is always a common factor
No need to initialize mrcf
Code
def gcd(m, n):
    for i in range(1,min(m,n)+1):
        if (m%i) == 0 and (n%i) == 0:
            mrcf = i
    return(mrcf)
Efficiency
Both versions of gcd take time proportional to min(m,n)
Can we do better?

Python Recap - II
03 October 2021 00:59

Checking primality
A prime number n has exactly two factors, 1 and n
Note that 1 is not a prime
Compute the list of factors of n
n is a prime if the list of factors is precisely [1,n]

Code
def factors(n):
    fl = []  # factor list
    for i in range(1,n+1):
        if (n%i) == 0:
            fl.append(i)
    return(fl)

def prime(n):
    return(factors(n) == [1,n])

Counting primes
List all primes up to m

def primesupto(m):
    pl = []  # prime list
    for i in range(1,m+1):
        if prime(i):
            pl.append(i)
    return(pl)

List the first m primes


Multiple simultaneous assignment

def firstprimes(m):
    (count,i,pl) = (0,1,[])
    while (count < m):
        if prime(i):
            (count,pl) = (count+1,pl+[i])
        i = i+1
    return(pl)

for vs while
Is the number of iterations known in advance?
Ensure progress to guarantee termination of while

Computing primes
Directly check if n has a factor between 2 and n-1

def prime(n):
    # Assumes n >= 2: with no factor to find, 0 and 1 would be reported as prime
    result = True
    for i in range(2,n):
        if (n%i) == 0:
            result = False
    return(result)

Terminate check after we find first factor


Breaking out of a loop

def prime(n):
    result = True
    for i in range(2,n):
        if (n%i) == 0:
            result = False
            break  # Abort loop
    return(result)

Alternatively, use while

def prime(n):
    (result,i) = (True,2)
    while (result and (i < n)):
        if (n%i) == 0:
            result = False
        i = i+1
    return(result)

Speeding things up slightly


Factors occur in pairs
Sufficient to check factors up to sqrt(n)
If n is prime, scan 2,…,sqrt(n) instead of 2,…,n-1

import math
def prime(n):
    (result,i) = (True,2)
    # Use <= so that sqrt(n) itself is tested; otherwise 4, 9, 25, ... pass as prime
    while (result and (i <= math.sqrt(n))):
        if (n%i) == 0:
            result = False
        i = i+1
    return(result)
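As a variation on the square-root idea, the bound can be computed with integers only. The sketch below assumes Python 3.8 or later, where math.isqrt() is available, and adds a guard for n < 2 that the notes' versions do not have. It is one possible formulation, not the version from the lecture.

```python
import math

def prime(n):
    """Trial division up to the integer square root of n."""
    if n < 2:              # 0 and 1 are not prime
        return False
    for i in range(2, math.isqrt(n) + 1):
        if n % i == 0:     # found a factor, so n is composite
            return False
    return True

print([k for k in range(1, 30) if prime(k)])
# [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

Using an integer bound avoids any floating-point rounding in math.sqrt() for very large n.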

Properties of primes
There are infinitely many primes
How are they distributed?
Twin primes: p, p + 2
Apart from (3, 5), every twin prime pair has the form 6k - 1, 6k + 1
Odd in general
Twin prime conjecture
Are there infinitely many twin primes?
Compute the differences between primes
Use a dictionary
Key - difference
Value - frequency
Start checking from 3, since 2 is the smallest prime

def primediffs(n):
    lastprime = 2
    pd = {}  # Dictionary for prime differences
    for i in range(3,n+1):
        if prime(i):
            d = i - lastprime
            lastprime = i
            if d in pd.keys():
                pd[d] = pd[d] + 1
            else:
                pd[d] = 1
    return(pd)
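A self-contained run of the same idea is sketched below. It uses dict.get() with a default of 0 in place of the explicit membership test, which is a stylistic substitution rather than the notes' code, and it carries its own small prime() helper so that it runs on its own.

```python
def prime(n):
    # Trial division; adequate for small n
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def primediffs(n):
    lastprime = 2
    pd = {}  # key: difference, value: frequency
    for i in range(3, n + 1):
        if prime(i):
            d = i - lastprime
            lastprime = i
            pd[d] = pd.get(d, 0) + 1   # a missing key counts as 0
    return pd

print(primediffs(30))   # {1: 1, 2: 4, 4: 3, 6: 1}
```

Up to 30 the primes are 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, so the gap 2 (twin primes) occurs four times.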

Python Recap - III
03 October 2021 01:27

Computing gcd
Can we do better?
So far, the approach to gcd has been brute force: try every candidate factor in turn. This is called the naive approach.

Suppose d divides m and n


m = ad, n = bd
m - n = (a - b)d
d also divides m - n
Recursively defined function
Base case: n divides m, answer is n
Otherwise, reduce gcd(m,n) to gcd(n,m-n)

def gcd(m,n):
    (a,b) = (max(m,n),min(m,n))
    if a%b == 0:
        return(b)
    else:
        return(gcd(b,a-b))

Unfortunately, this takes time proportional to max(m,n)


Still better algorithm?

Euclid's algorithm
Suppose n does not divide m
Then m = qn + r
Suppose d divides both m and n
Then m = ad, n = bd
m = qn + r => ad = q(bd) + r
r = m - qn = (a - qb)d, so r must be of the form cd
Euclid's algorithm
If n divides m, gcd(m,n) = n
Otherwise, compute gcd(n,m mod n)

def gcd(m,n):
    (a,b) = (max(m,n),min(m,n))
    if a%b == 0:
        return(b)
    else:
        return(gcd(b,a%b))

gcd(n, m%n) instead of gcd(n, m-n)


Can show that this takes time proportional to number of digits in max(m,n)
One of the first non-trivial algorithms
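To make the difference between the two reductions concrete, the sketch below counts how many reductions each one performs. The function names and the iterative (loop) form are choices made for this illustration; the notes define both versions recursively.

```python
def gcd_subtract_steps(m, n):
    """gcd by repeated subtraction; returns (gcd, number of reductions)."""
    (a, b) = (max(m, n), min(m, n))
    steps = 0
    while a % b != 0:
        # Reduce gcd(a, b) to gcd(b, a-b), keeping the larger value first
        (a, b) = (max(b, a - b), min(b, a - b))
        steps += 1
    return (b, steps)

def gcd_euclid_steps(m, n):
    """gcd by remainders (Euclid); returns (gcd, number of reductions)."""
    (a, b) = (max(m, n), min(m, n))
    steps = 0
    while a % b != 0:
        # Reduce gcd(a, b) to gcd(b, a mod b)
        (a, b) = (b, a % b)
        steps += 1
    return (b, steps)

print(gcd_subtract_steps(1001, 2))   # (1, 500)
print(gcd_euclid_steps(1001, 2))     # (1, 1)
```

For gcd(1001, 2) the subtraction version peels off 2 at a time, while the remainder version gets there in a single step.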

Exception handling
03 October 2021 10:32

When things go wrong


Our code could generate many types of errors
y = x/z, but z has value 0
y = int(s), but string s does not represent a valid integer
y = 5*x, but x does not have a value
y = l[i], but i is not a valid index for list l
Try to read from a file, but the file does not exist
Try to write to a file, but the disk is full

Recovering gracefully
Try to anticipate errors
Provide a contingency plan
Exception handling

Types of errors
Python flags the type of each error
Most common error is a syntax error
SyntaxError: invalid syntax
Not much you can do!
We are interested in errors when the code is running
Name used before value is defined
NameError: name 'x' is not defined
Division by zero in arithmetic expression
ZeroDivisionError: division by zero
Invalid list index
IndexError: list assignment index out of range
KeyError for dictionary keys
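The run-time error types listed above can be observed directly. In this sketch, error_name() is a hypothetical helper written for the illustration, and the names l, d and undefined_name are made up. Note that int(s) on a bad string raises ValueError, a type the list above does not name.

```python
def error_name(action):
    """Run action() and return the name of the exception it raises, if any."""
    try:
        action()
    except Exception as e:
        return type(e).__name__
    return None

l = [1, 2, 3]
d = {"a": 1}

print(error_name(lambda: 1 / 0))            # ZeroDivisionError
print(error_name(lambda: int("abc")))       # ValueError
print(error_name(lambda: undefined_name))   # NameError
print(error_name(lambda: l[5]))             # IndexError
print(error_name(lambda: d["b"]))           # KeyError
print(error_name(lambda: l[0]))             # None (no error)
```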

Terminology
Raise an exception
Run time error => signal error type, with diagnostic information
NameError: name 'x' is not defined
Handle an exception
Anticipate and take corrective action based on error type
Unhandled exception aborts execution

Handling exceptions
try:
    … # Code where error may occur
except IndexError:
    … # Handle IndexError
except (NameError,KeyError):
    … # Handle multiple exception types
except:
    … # Handle all other exceptions
else:
    … # Execute if try runs without errors
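A concrete instance of this skeleton is sketched below. The function safe_ratio() and its arguments are invented for the example; it only illustrates how the except and else branches are selected.

```python
def safe_ratio(s, values, i):
    """Return int(s) / values[i], or None if something goes wrong."""
    try:
        result = int(s) / values[i]
    except IndexError:
        print("index", i, "is out of range")
        return None
    except (ValueError, ZeroDivisionError) as e:
        print("bad input:", e)
        return None
    else:
        # Runs only if the try block raised no exception
        print("no errors")
        return result

print(safe_ratio("10", [5, 0], 0))    # no errors, then 2.0
print(safe_ratio("10", [5, 0], 1))    # ZeroDivisionError branch, then None
print(safe_ratio("ten", [5, 0], 0))   # ValueError branch, then None
print(safe_ratio("10", [5, 0], 7))    # IndexError branch, then None
```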

Using exceptions "positively"


Collect scores in dictionary
scores = {"Shefali":[3,22],
          "Harmanpreet":[200,3]}
Update the dictionary
Batter b already exists, append to list
scores[b].append(s)
New batter, create a fresh entry
scores[b] = [s]

Traditional approach
if b in scores.keys():
    scores[b].append(s)
else:
    scores[b] = [s]

Using exceptions
try:
    scores[b].append(s)
except KeyError:
    scores[b] = [s]
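Wrapped in a function, the exception-based update can be run end to end. The function name add_score() and the extra batter and scores below are made up for this sketch.

```python
scores = {"Shefali": [3, 22], "Harmanpreet": [200, 3]}

def add_score(scores, b, s):
    """Record score s for batter b, creating the entry if needed."""
    try:
        scores[b].append(s)     # works if b is already a key
    except KeyError:
        scores[b] = [s]         # new batter: create a fresh entry

add_score(scores, "Shefali", 45)   # existing batter
add_score(scores, "Smriti", 70)    # new batter (hypothetical data)
print(scores)
```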

Flow of control
An exception raised anywhere is passed back up the chain of function calls. For example, suppose f(x,y) calls g(x) internally, and g(x) calls h(x) internally. If h() raises an IndexError and does not handle it, the error is passed back to g(); if g() does not handle it either, it is passed on to f(). A try/except block at any of these levels can handle the exception. If no level handles it, execution aborts.
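The propagation described here can be traced with three small functions. The bodies and parameters of f, g and h below are invented for the illustration; only the calling chain f, g, h follows the text.

```python
def h(l, i):
    return l[i]            # may raise IndexError

def g(l, i):
    return h(l, i) + 1     # no handler here: the error passes through

def f(l, i):
    try:
        return g(l, i)
    except IndexError:
        # Raised in h(), propagated through g(), handled here
        return None

print(f([1, 2, 3], 1))    # 3
print(f([1, 2, 3], 10))   # None
```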

Classes and Objects
03 October 2021 12:15

Abstract datatype
Stores some information
Designated functions to manipulate the information
For instance, stack: last-in, first-out, push(), pop()

Separate the (private) implementation from the (public) specification

Class
Template for a data type
How data is stored
How public functions manipulate data

Object
Concrete instance of template

Example: 2D points
A point has coordinates (x, y)
__init__() initializes internal values x, y
First parameter is always self
Here, by default a point is at (0, 0)
Translation: shift a point by (delta x, delta y)
(x, y) => (x + deltax, y + deltay)
Distance from the origin
d = sqrt(x^2 + y^2)

class Point:
    def __init__(self,a=0,b=0):
        self.x = a
        self.y = b
    def translate(self,deltax,deltay):
        self.x += deltax
        self.y += deltay
    def odistance(self):
        import math
        d = math.sqrt(self.x*self.x + self.y*self.y)
        return(d)
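A short usage sketch of this class follows. The class is repeated so that the example runs on its own, and the coordinates are arbitrary values chosen to give exact distances.

```python
import math

class Point:
    def __init__(self, a=0, b=0):
        self.x = a
        self.y = b
    def translate(self, deltax, deltay):
        self.x += deltax
        self.y += deltay
    def odistance(self):
        return math.sqrt(self.x * self.x + self.y * self.y)

p = Point(3, 4)
print(p.odistance())    # 5.0
p.translate(2, 8)       # p is now at (5, 12)
print(p.odistance())    # 13.0
q = Point()             # no arguments: the origin (0, 0)
print(q.odistance())    # 0.0
```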

Example: Polar coordinates


(r, theta) instead of (x, y)
r = sqrt(x^2 + y^2)
theta = tan^-1(y/x)
Distance from origin is just r
Translation
Convert (r, theta) to (x, y)
x = r cos(theta), y = r sin(theta)
Recompute r, theta from (x + deltax, y + deltay)
Interface has not changed

User need not be aware whether representation is (x, y) or (r, theta)

import math
class Point:
    def __init__(self,a=0,b=0):
        self.r = math.sqrt(a*a + b*b)
        # atan2 picks the correct quadrant and also handles a == 0;
        # atan(b/a) alone cannot tell (a,b) apart from (-a,-b)
        self.theta = math.atan2(b,a)
    def odistance(self):
        return(self.r)
    def translate(self,deltax,deltay):
        x = self.r*math.cos(self.theta)
        y = self.r*math.sin(self.theta)
        x += deltax
        y += deltay
        self.r = math.sqrt(x*x + y*y)
        self.theta = math.atan2(y,x)

Special functions
__init__() - constructor
__str__() - convert object to string
str(o) == o.__str__()
Implicitly invoked by print()
__add__()
Implicitly invoked by +
__mul__() invoked by *
__lt__() invoked by <
__ge__() invoked by >=

Timing our code
03 October 2021 12:59

How long does our code take to execute?


Varies from one language to another
Python has a library time with various useful functions
perf_counter() is a performance counter
Absolute value of perf_counter() is not meaningful
Compare two consecutive readings to get an interval
Default unit is seconds

import time
start = time.perf_counter()

# Execute some code

end = time.perf_counter()
elapsed = end - start

A timer object
Create a timer class
Two internal values
_start_time
_elapsed_time
start starts the timer
stop records the elapsed time
More sophisticated version in the actual code
As a rule of thumb, Python executes about 10^7 basic operations per second; C++ is faster, at about 10^8 operations per second.
import time
class Timer:
    def __init__(self):
        self._start_time = 0
        self._elapsed_time = 0
    def start(self):
        self._start_time = time.perf_counter()
    def stop(self):
        self._elapsed_time = time.perf_counter() - self._start_time
    def elapsed(self):
        return(self._elapsed_time)
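One possible refinement is sketched below: the timer refuses to be started twice or stopped when it is not running. The TimerError class and these checks are assumptions made for this example; they are not taken from the course's actual code.

```python
import time

class TimerError(Exception):
    """Hypothetical exception type for misuse of the timer."""

class Timer:
    def __init__(self):
        self._start_time = None
        self._elapsed_time = None
    def start(self):
        if self._start_time is not None:
            raise TimerError("Timer is already running")
        self._start_time = time.perf_counter()
    def stop(self):
        if self._start_time is None:
            raise TimerError("Timer is not running")
        self._elapsed_time = time.perf_counter() - self._start_time
        self._start_time = None
    def elapsed(self):
        return self._elapsed_time

t = Timer()
t.start()
total = 0
for i in range(10**6):     # the code being timed
    total += i
t.stop()
print(total, t.elapsed())  # elapsed time is in seconds and varies by machine
```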

Why efficiency matters
03 October 2021 13:07

A real world problem


Every SIM card needs to be linked to an Aadhar card
Validate the Aadhar details for each SIM card

Simple nested loop


for each SIM card S:
    for each Aadhar number A:
        check if Aadhar details of S match A

How long will the validation process take with nested loop?
M SIM cards, N Aadhar cards
Nested loops iterate M*N times
What are M and N?
Almost everyone in India has an Aadhar card: N > 10^9
Number of SIM cards registered is similar: M > 10^9
Assume M = N = 10^9
Nested loops execute 10^18 times
We calculated previously that Python can perform 10^7 operations in a second
This takes at least 10^11 seconds
10^11 / 60 = 1.6667E9 minutes
1.6667E9 / 60 = 2.7778E7 hours
2.7778E7 / 24 = 1.1574E6 days
1.1574E6 / 365 = 3,170.9589 years!
How can we fix this?
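The estimate above can be reproduced in a few lines. The variable names are chosen for this sketch, and the figure of 10^7 operations per second is the rule of thumb quoted in these notes.

```python
M = N = 10**9
ops_per_second = 10**7           # rule-of-thumb figure from the notes

seconds = (M * N) / ops_per_second
years = seconds / 60 / 60 / 24 / 365

print(seconds)         # 1e+11
print(round(years))    # 3171
```

The small difference from the 3,170.9589 shown above comes from rounding the intermediate values there.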
Guess my birthday
You propose a date
I answer, Yes, Earlier, Later
Suppose my birthday is 12 April
A possible sequence of questions
September 12? Earlier
February 23? Later
July 2? Earlier

What is the best strategy?
Interval of possibilities
Query midpoint - halves the interval
June 30? Earlier
March 31? Later
May 15? Earlier
April 22? Earlier
April 11? Later
April 16? Earlier
April 13? Earlier
April 12? Yes
Instead of up to 365 questions, one for each day of the year, 8 questions suffice here, because each question halves the interval of possible dates.
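The halving strategy can be simulated by numbering the days of a non-leap year from 1 to 365. The function guess() below is a sketch written for this illustration; because it picks exact integer midpoints, its sequence of questions need not match the dates listed above.

```python
def guess(birthday, lo=1, hi=365):
    """Find birthday (a day number in lo..hi) by querying midpoints.
    Returns the number of questions asked."""
    questions = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        questions += 1
        if mid == birthday:       # answer: Yes
            return questions
        elif birthday < mid:      # answer: Earlier
            hi = mid - 1
        else:                     # answer: Later
            lo = mid + 1
    return None                   # birthday was outside lo..hi

print(guess(102))                             # 12 April is day 102
print(max(guess(d) for d in range(1, 366)))   # worst case over all days
```

No birthday needs more than 9 questions, since 2^9 = 512 exceeds 365.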

Back to Aadhar and SIM cards

Assume Aadhar details are sorted by Aadhar number
Use the halving strategy to check SIM card
Halving 10 times reduces the interval by a factor of 1000, because 2^10 = 1,024
After 10 queries, interval shrinks to 10^6
After 20 queries, interval shrinks to 10^3
After 30 queries, interval shrinks to 1
Total operations = 10^9 * 30 = 3E10
Time = 3E10 / 10^7 = 3,000 seconds = 50 minutes
From about 3,200 years to 50 minutes!
Of course, to achieve this, we have to first sort the Aadhar cards
Arranging the data results in a much more efficient solution
Both algorithms and data structures matter

for each SIM card S:
    probe sorted Aadhar list to check Aadhar details of S
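The probe step can be sketched with Python's standard bisect module, which performs binary search on a sorted list. The function valid(), the SIM labels and the numbers below are made-up toy data for this illustration.

```python
from bisect import bisect_left

def valid(aadhar_sorted, a):
    """Binary search: is number a present in the sorted list?"""
    pos = bisect_left(aadhar_sorted, a)
    return pos < len(aadhar_sorted) and aadhar_sorted[pos] == a

# Made-up toy data; sorting is done once, before any probing
aadhar = sorted([4021, 1177, 9350, 2864, 7003])
sims = {"SIM-A": 2864, "SIM-B": 5555}

for (s, a) in sims.items():
    print(s, valid(aadhar, a))   # SIM-A True, then SIM-B False
```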

Programming Assignments
06 October 2021 22:26

PPA 1
Twin primes are pairs of prime numbers that differ by 2. For example, (3, 5), (5, 7), and (11, 13) are twin primes.
Write a function Twin_Primes(n, m), where n and m are positive integers and n < m, that returns all unique twin primes between n and m (both inclusive). The function returns a list of tuples, and each tuple (a,b) represents one unique twin prime pair where n <= a < b <= m.

Code:
def is_prime(x):
    if x == 1:
        return False
    elif x == 2:
        return True
    else:
        prime = True
        for i in range(2,x):
            if x%i == 0:
                prime = False
                break
        return prime

def Twin_Primes(n,m):
    tp = []
    for i in range(n,m+1):
        if is_prime(i) and is_prime(i+2) and i+2 <= m:
            tp.append((i,i+2))
    return tp

n=int(input())
m=int(input())
print(sorted(Twin_Primes(n, m)))

PPA 2

Code:
class Triangle:
    def __init__(self,a,b,c):
        self.a = a
        self.b = b
        self.c = c

    def is_valid(self):
        if self.a + self.b > self.c and self.a + self.c > self.b and self.b + self.c > self.a:
            return "Valid"
        else:
            return "Invalid"

    def Side_Classification(self):
        if self.is_valid() == "Invalid":
            return "Invalid"
        else:
            if self.a == self.b == self.c:
                return "Equilateral"
            elif self.a == self.b or self.b == self.c or self.c == self.a:
                return "Isosceles"
            else:
                return "Scalene"

    def Angle_Classification(self):
        if self.is_valid() == "Invalid":
            return "Invalid"
        else:
            sides = sorted([self.a,self.b,self.c])
            if sides[0]**2 + sides[1]**2 > sides[2]**2:
                return "Acute"
            elif sides[0]**2 + sides[1]**2 == sides[2]**2:
                return "Right"
            else:
                return "Obtuse"

    def Area(self):
        if self.is_valid() == "Invalid":
            return "Invalid"
        else:
            # Heron's formula; use the object's own sides, not global a, b, c
            s = (self.a+self.b+self.c)/2
            return (s*(s-self.a)*(s-self.b)*(s-self.c))**0.5

a=int(input())
b=int(input())
c=int(input())
T=Triangle(a,b,c)
print(T.is_valid())
print(T.Side_Classification())
print(T.Angle_Classification())
print(T.Area())

GrPA 1

Code:
def find_Min_Difference(L,P):
    l = sorted(L)
    m = 9999999999999
    for i in range(len(l)-P+1):
        c = abs(l[i]-l[i+P-1])
        if c < m:
            m=c
    return m
L=eval(input().strip())
P=int(input())
print(find_Min_Difference(L,P))

Solution:
def find_Min_Difference(L,P):
    L.sort()
    N=P
    M = len(L)
    min_diff = max(L) - min(L)
    for i in range(M-N+1):
        if L[i+N-1] - L[i] < min_diff:
            min_diff = L[i+N-1] - L[i]
    return min_diff
L=eval(input().strip())
P=int(input())
print(find_Min_Difference(L,P))

GrPA 2

Code:
def is_prime(n):
    if n == 1:
        return False
    elif n == 2:
        return True
    else:
        for i in range(2,n):
            if n%i == 0:
                return False
        return True
def primes(n):
    l = []
    for i in range(2,n+1):
        if is_prime(i):
            l.append(i)
    return l
def Goldbach(n):
    l = primes(n)
    result = []
    for i in l:
        for j in l:
            if i + j == n and (i,j) not in result and (j,i) not in result:
                result.append((i,j))
    return result
n=int(input())
print(sorted(Goldbach(n)))

Solution:
def prime(n):
    if n < 2:
        return False
    for i in range(2,n//2+1):
        if n%i==0:
            return False
    return True

def Goldbach(n):
    Res=[]
    for i in range((n//2)+1):
        if prime(i)==True:
            if prime(n-i)==True:
                Res.append((i,n-i))
    return(Res)
n=int(input())
print(sorted(Goldbach(n)))

GrPA 3

Code:
def odd_one(L):
    d = {int:0,str:0,bool:0,float:0}
    for i in L:
        d[type(i)] += 1
    for i in d:
        if d[i] == 1:
            return str(i)[8:-2]
print(odd_one(eval(input().strip())))

Solution:
def odd_one(L):
    P = {}
    for elem in L:
        if type(elem) not in P:
            P[type(elem)] = 0
        P[type(elem)] += 1
    for key, value in P.items():
        if value == 1:
            return key.__name__
print(odd_one(eval(input().strip())))

Assignments
06 October 2021 22:33

Practice Assignment
1. If n is a positive integer, which of the following statements is correct about the function check?

a. Function check returns True if the number of digits in n is odd.

b. Function check returns True if all digits of n are even.

c. Function check returns True if all digits of n are odd.

d. Function check returns True if the number of digits in n is even.

2. What will be the value of i after execution of the given code-snippet?

a. 2

b. 5

c. 4

d. Code has error

3. Match the following?

a. A-(i) B-(ii) C-(iii) D-(iv)

b. A-(ii) B-(i) C-(iv) D-(iii)

c. A-(iii) B-(i) C-(ii) D-(iv)

d. A-(ii) B-(iii) C-(iv) D-(i)


4. What is the output of given code?

a.

b.

c.

d.

Accepted Answers:

5. Which of the following options will validate whether n is a perfect square or not? Where n is a
positive integer. [MSQ]
a.

b.

c.

d.

Accepted Answers:

Graded Assignment
1. S is a non-empty string of English letters without any space. What fun(S) will return after
execution of the above code?

a. Total number of letters in the string S .

b. Total number of distinct letters in the string S .

c. Total number of letters that are repeated in the string S more than one time.

d. Difference of total letters in the string S and distinct letters in the string S .

2. Which of the following is/are valid reason(s) for NameError exception? [MSQ]
a. Variable is not defined.

b. Calling a function before declaration.


c. Variable name spelt incorrectly.

d. Variables are defined globally in the program .

3. What is f(60) - f(59) , given the definition of f above?

Accepted Answers:

(Type: Numeric) 3
4. What will be the output of the above code-snippet?

a. Syntax error

b. 2 1

c. 0 3 1

d. None of these

5. What will be the output of the above code-snippet?

a. Good morning

b. Hello, Good morning

c. Hello Good morning

d. Good

6. What will be the output of the above code-snippet?

a.

b.

c.

d.

Accepted Answers:

7. Given above is a function that checks whether a list satisfies some property. There is an error in
this function. Select the list(s) L = [n1, n2, n3] , where n1 , n2 and n3 are all integers, for which
special3Bad(L) produces a ZeroDivisionError exception. [MSQ]

a. L = [4, 2, 8]

b. L = [4, 2, 4]

c. L = [8, 4, 16]

d. L = [48, 6, 36]

e. L = [44, 6, 36]

8. Given above is a function to check whether a list is a palindrome. There is an error in this function.
Select the list(s) L = [n1, n2,..., n2, n1] , for which isSymmetricBad(L) produces an IndexError
exception. [MSQ]

a. L = [1, 2, 3, 4, 3, 2, 1]

b. L = [2, 2, 2, 2, 2, 2]

c. L = [1, 1, 1, 1, 1, 1, 1]

d. L = [8]

e. L = [2, 4, 6]

9. How many times gcd() function will be called?


Note: Ignore the first call given in the code.

Accepted Answers:

(Type: Numeric) 3
10. Which of the following option(s) is/are correct about the given code? [MSQ] count , name and
course are object variables.

a. name and course are class variable and count is an object variable.

b. name and course are object variables, and count is a class variable.

c. count , name and course are class variables.

d. count represents the number of objects created for class Enrollment
