Advancedpythontips Updated
Rahul Agarwal
This book is for sale at http://leanpub.com/advancedpythontips
This is a Leanpub book. Leanpub empowers authors and publishers with the Lean Publishing
process. Lean Publishing is the act of publishing an in-progress ebook using lightweight tools and
many iterations to get reader feedback, pivot until you have the right book and build traction once
you do.
Preface
Creating a Class
But, still, some problems remain
Some More Info:
Conclusion
However, learning to write a language and writing a language in an optimized way are two
different things.
Every Language has some ingredients which make it unique.
Yet, a programmer new to any language will always do some forced overfitting. A Java
programmer new to Python, for example, might write this code to add the numbers in a list:
    x = [1, 2, 3, 4, 5]
    sum_x = 0
    for i in range(len(x)):
        sum_x += x[i]
A Python programmer, on the other hand, would simply write:
    sum_x = sum(x)
In this book, I will explain some simple constructs provided by Python, some essential tips, and some
use cases I come across regularly in my Data Science work. Most of the book is practical in nature,
and you will find it brimming with examples.
This book is about efficient and readable code.
If you like this book, I would appreciate it if you could buy* the paid version here.
*htps://somehyperlink.com
Chapter 1: Minimize for loop usage in Python
There are many ways to write a for loop in Python. A beginner may get confused about which one to use.
Let me explain this with a simple example.
Suppose you want to take the sum of squares of the numbers in a list.
This is a real problem we face in machine learning whenever we want to calculate the distance
between two points in n dimensions.
You can do this using loops easily.
In fact, I will show you three ways to do the same task which I have seen people use and let
you choose for yourself which you find the best.
    x = [1, 3, 5, 7, 9]
    sum_squared = 0

    for i in range(len(x)):
        sum_squared += x[i]**2
Whenever I see the above code in a Python codebase, I understand that the person has come from
a C or Java background.
A slightly more pythonic way of doing the same thing is:
    x = [1, 3, 5, 7, 9]
    sum_squared = 0

    for y in x:
        sum_squared += y**2
Better.
I didn’t index the list. And my code is more readable.
But still, the pythonic way to do it is in one line.
    x = [1, 3, 5, 7, 9]
    sum_squared = sum([y**2 for y in x])
This approach is called List Comprehension, and this may very well be one of the reasons
that I love Python.
You can also use if in a list comprehension.
Let’s say we wanted a list of squared numbers for even numbers only.
    x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
    even_squared = [y**2 for y in x if y % 2 == 0]
    --------------------------------------------
    [4, 16, 36, 64]
if-else?
What if we wanted to have the number squared for even and cubed for odd?
    x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
    squared_cubed = [y**2 if y % 2 == 0 else y**3 for y in x]
    --------------------------------------------
    [1, 4, 27, 16, 125, 36, 343, 64, 729]
Great!!!
So basically follow specific guidelines: Whenever you feel like writing a for statement, you should
ask yourself the following questions,
What is enumerate?
Sometimes we need both the index in an array as well as the value in an array.
In such cases, I prefer to use enumerate rather than indexing the list.
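A minimal sketch of the idea, using made-up list values: enumerate yields (index, value) pairs, so the loop never has to index the list itself.

```python
x = [10, 20, 30]  # illustrative values, not taken from the text

# Collect (index, value) pairs instead of writing range(len(x)) and x[i]
pairs = []
for i, value in enumerate(x):
    pairs.append((i, value))
    print(i, value)
```

The same pairs can be produced inside a comprehension, for example [(i, v) for i, v in enumerate(x)].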
    x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
    {k: k**2 for k in x}
    ---------------------------------------------------------
    {1: 1, 2: 4, 3: 9, 4: 16, 5: 25, 6: 36, 7: 49, 8: 64, 9: 81}
    x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
    # filter on the loop variable k (x is the whole list, so x % 2 would raise a TypeError)
    {k: k**2 for k in x if k % 2 == 0}
    ---------------------------------------------------------
    {2: 4, 4: 16, 6: 36, 8: 64}
What if we want squared value for even key and cubed number for the odd key?
    x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
    {k: k**2 if k % 2 == 0 else k**3 for k in x}
    ---------------------------------------------------------
    {1: 1, 2: 4, 3: 27, 4: 16, 5: 125, 6: 36, 7: 343, 8: 64, 9: 729}
Conclusion
To conclude, I will say that while it might seem easy to transfer the knowledge you acquired from
other languages to Python, you won’t be able to appreciate the beauty of Python if you keep doing
that. Python is much more powerful when we use its ways and decidedly much more fun.
So, use list comprehensions and dict comprehensions when you need a for loop. Use enumerate
if you need the array index.
Your code will be much more readable and maintainable in the long run.
Chapter 2: Python defaultdict and Counter
Let’s say I need to count the number of word occurrences in a piece of text. Maybe for a book
like Hamlet. How could I do that?
Python always provides us with multiple ways to do the same thing, but only one that I find
elegant.
This is a naive Python implementation using the dict object.
    text = ("I need to count the number of word occurrences in a piece of text. How could "
            "I do that? Python provides us with multiple ways to do the same thing. But only one "
            "way I find beautiful.")

    word_count_dict = {}
    for w in text.split(" "):
        if w in word_count_dict:
            word_count_dict[w] += 1
        else:
            word_count_dict[w] = 1
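The same count can be written more compactly with collections.Counter from the standard library. This is a minimal sketch that reuses the sample text and the variable name word_count_dict from the loop above; Counter accepts any iterable and tallies its items.

```python
from collections import Counter

text = ("I need to count the number of word occurrences in a piece of text. How could "
        "I do that? Python provides us with multiple ways to do the same thing. But only one "
        "way I find beautiful.")

# Counter builds the word -> count mapping in a single call
word_count_dict = Counter(text.split(" "))
print(word_count_dict["I"])
```

Because a Counter is a dict subclass, the lookups used with the plain dict version keep working unchanged.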
If we use Counter, we can also get the most common words using a simple function.
    word_count_dict.most_common(3)
    ---------------------------------------------------------------
    [('I', 3), ('to', 2), ('the', 2)]
    from collections import Counter

    # Count Characters
    Counter('abccccccddddd')
    ---------------------------------------------------------------
    Counter({'a': 1, 'b': 1, 'c': 6, 'd': 5})

    # Count List elements
    Counter([1, 2, 3, 4, 5, 1, 2])
    ---------------------------------------------------------------
    Counter({1: 2, 2: 2, 3: 1, 4: 1, 5: 1})
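    from collections import defaultdict

    # s is assumed here to be a list of (key, value) pairs consistent with the output below
    s = [('color', 'blue'), ('color', 'orange'), ('color', 'yellow'),
         ('fruit', 'banana'), ('fruit', 'orange')]

    d = defaultdict(set)

    for k, v in s:
        d[k].add(v)

    print(d)
    ---------------------------------------------------------------
    defaultdict(<class 'set'>, {'color': {'yellow', 'blue', 'orange'}, 'fruit': {'banana', 'orange'}})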
Conclusion
To conclude, I will say that there is always a beautiful way to do anything in Python. Search
for it before you write code. Going to StackOverflow is okay. I go there a lot of times when I get
stuck. Always Remember:
A simple example:
Let us say we have to create a function that adds two numbers. We can do this easily in Python.
    def adder(x, y):
        return x + y
To add three numbers, we would have to change the function:
    def adder(x, y, z):
        return x + y + z
What if we want the same function to add an unknown number of variables? Please note that we
can use *args or *argv or *anyOtherName to do this. It is the * that matters.
    def adder(*args):
        result = 0
        for arg in args:
            result += arg
        return result
What *args does is that it takes all your passed arguments and provides a variable length argument
list to the function which you can use as you want.
Now you can use the same function as follows:
    adder(1, 2)
    adder(1, 2, 3)
    adder(1, 2, 5, 7, 8, 9, 100)
and so on.
Now, have you ever wondered how the print function in Python can take so many arguments?
*args
In simple terms, you can use **kwargs to pass an arbitrary number of keyword arguments to your
function and access them as a dictionary.
A simple example:
Let’s say you want to create a print function that can take a name and age as input and print that.
    def myprint(name, age):
        print(f'{name} is {age} years old')
Simple. Let us now say you want the same function to take two names and two ages.
    def myprint(name1, age1, name2, age2):
        print(f'{name1} is {age1} years old')
        print(f'{name2} is {age2} years old')
You guessed right. My next question is: what if I don’t know how many arguments I am going
to need?
Can I use *args? Not really, since the pairing of name and age is essential. We don’t want to write “28 is
Michael years old”.
This is where **kwargs comes into the picture.
    def myprint(**kwargs):
        for k, v in kwargs.items():
            print(f'{k} is {v} years old')

    myprint(Sansa=20, Tyrion=40, Arya=17)
    Output:
    -----------------------------------
    Sansa is 20 years old
    Tyrion is 40 years old
    Arya is 17 years old
In simple terms: Decorators are functions that wrap another function thus modifying its
behavior.
A simple example:
Let us say we want to add custom functionality to some of our functions. The functionality is that
whenever the function gets called the “function name begins” is printed and whenever the function
ends the “function name ends” and time taken by the function is printed.
Let us assume our function is:
    def somefunc(a, b):
        output = a + b
        return output
We can add some print lines to all our functions to achieve this.
    import time

    def somefunc(a, b):
        print("somefunc begins")
        start_time = time.time()
        output = a + b
        print("somefunc ends in ", time.time() - start_time, "secs")
        return output

    out = somefunc(4, 5)

    OUTPUT:
    -------------------------------------------
    somefunc begins
    somefunc ends in 9.5367431640625e-07 secs
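Editing every function by hand like this does not scale, which is where a decorator comes in. Below is a minimal sketch of a timer decorator, assuming a wrapper with the same two-argument signature as somefunc. The names timer, wrapper and multiply are illustrative, and the message format is chosen for this example.

```python
import time
from functools import wraps

def timer(func):
    # wraps copies func's name and docstring onto the wrapper
    @wraps(func)
    def wrapper(a, b):
        print(f"{func.__name__!r} begins")
        start_time = time.time()
        result = func(a, b)
        print(f"{func.__name__!r} ends in {time.time() - start_time} secs")
        return result
    return wrapper

# Hypothetical function used only to exercise the decorator
@timer
def multiply(a, b):
    return a * b

product = multiply(4, 5)
```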
This is how we can define any decorator. functools helps us create decorators using wraps. In essence,
we do something before any function is called and do something after a function is called in the above
decorator.
We can now use this timer decorator to decorate our function somefunc
    @timer
    def somefunc(a, b):
        output = a + b
        return output

    a = somefunc(4, 5)

    Output
    ---------------------------------------------
    'somefunc' begins
    'somefunc' ends in 2.86102294921875e-06 secs
Now we can add @timer above each function for which we want to have the time printed.
And we are done.
Really?
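Not quite: a wrapper with a fixed two-argument signature only fits functions that take exactly two arguments. One common remedy, sketched here under the assumption that the decorator is still called timer, is to let the wrapper forward *args and **kwargs untouched. The function scaled_sum is hypothetical and exists only to show the call.

```python
import time
from functools import wraps

def timer(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        # Accept whatever the caller passes and hand it on unchanged
        print(f"{func.__name__!r} begins")
        start_time = time.time()
        result = func(*args, **kwargs)
        print(f"{func.__name__!r} ends in {time.time() - start_time} secs")
        return result
    return wrapper

@timer
def scaled_sum(*numbers, scale=1):
    return sum(numbers) * scale

total = scaled_sum(1, 2, 3, scale=2)
```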
Now our function can take any number of arguments, and our decorator will still work.
In my view, decorators could be pretty helpful. I provided only one use case of decorators, but there
are several ways one can use them.
You can use a decorator to debug code by checking which arguments go in a function. Or a decorator
could be used to count the number of times a particular function has been called. This could help
with counting recursive calls.
Conclusion
In this chapter, I talked about some of the constructs you can find in python source code and how
you can understand them.
It is not necessary that you end up using them in your code now. But I guess understanding how these
things work helps mitigate some of the confusion and panic one faces whenever these constructs
come up.
Chapter 4: Use Iterators, Generators, and Generator Expressions
Python in many ways has made our life easier when it comes to programming.
With its many libraries and functionalities, sometimes we forget to focus on some of the useful
things it offers.
One such functionality is generators and generator expressions. I put off learning about them
for a long time, but they are useful.
Have you ever encountered yield in Python code and not known what it meant? Or wondered what an
iterator or a generator is and why we use it? Or have you used ImageDataGenerator while
working with Keras and not understood what is going on at the backend? Then this chapter is for
you.
Let us say that we need to run a for loop over 10 Million Prime numbers.
I am using prime numbers in this case for understanding but it could be extended to a case where
we have to process a lot of images or files in a database or big data.
How would you proceed with such a problem?
Simple. We can create a list and keep all the prime numbers there.
Really? Think of the memory such a list would occupy.
It would be great if we had something that could just keep the last prime number we have checked
and return just the next prime number.
That is where iterators could help us.
    def check_prime(number):
        for divisor in range(2, int(number ** 0.5) + 1):
            if number % divisor == 0:
                return False
        return True

    class Primes:
        def __init__(self, max):
            # the maximum number of primes we want generated
            self.max = max
            # start with this number to check if it is a prime.
            self.number = 1
            # No of primes generated yet. We want to StopIteration when it reaches max
            self.primes_generated = 0

        def __iter__(self):
            return self

        def __next__(self):
            self.number += 1
            if self.primes_generated >= self.max:
                raise StopIteration
            elif check_prime(self.number):
                self.primes_generated += 1
                return self.number
            else:
                return self.__next__()
    prime_generator = Primes(10000000)

    for x in prime_generator:
        # Process Here
        pass
Here I have defined an iterator. This is how most of the functions like xrange or ImageGenerator
work.
Every iterator needs to have:
Every iterator takes the above form and we can tweak the functions to our liking in this boilerplate
code to do what we want to do.
See that we don’t keep all the prime numbers in memory just the state of the iterator like
Put simply, generators give us an easy way to write iterators using the yield statement.
    def Primes(max):
        number = 1
        generated = 0
        while generated < max:
            number += 1
            if check_prime(number):
                generated += 1
                yield number

    prime_generator = Primes(10)
    for x in prime_generator:
        # Process Here
        pass
Though not necessarily better than the previous solution, we can also use a generator expression for
the same task, although we might lose some functionality. Generator expressions work exactly like list
comprehensions, but they don’t keep the whole list in memory.
Functionality loss: We can generate primes till 10M. But we can’t generate 10M primes. One can
only do so much with generator expressions.
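A small sketch of that limitation, reusing check_prime from earlier in the chapter. The bound of 100 is an illustrative stand-in for 10M so the example runs instantly. Note that the expression can cap the values it examines, but not the number of primes it produces.

```python
def check_prime(number):
    for divisor in range(2, int(number ** 0.5) + 1):
        if number % divisor == 0:
            return False
    return True

limit = 100  # illustrative bound; a bound of 10M works the same way, only slower

# Parentheses instead of square brackets: values are produced lazily, one at a time
primes = (n for n in range(2, limit) if check_prime(n))

first_five = [next(primes) for _ in range(5)]
print(first_five)
```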
But generator expressions let us do some pretty cool things.
Let us say we wanted to have all Pythagorean Triplets lower than 1000.
How can we get it?
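One possible sketch wraps a generator expression with three nested loops in a function. The name triplet and its single parameter n are assumptions made for this example, and the loop order a > b > c is just one reasonable choice.

```python
def triplet(n):
    # Lazily yields (a, b, c) with n > a > b > c >= 1 and a^2 == b^2 + c^2
    return ((a, b, c)
            for a in range(1, n)
            for b in range(1, a)
            for c in range(1, b)
            if a * a == b * b + c * c)

triplet_generator = triplet(1000)
first = next(triplet_generator)
print(first)
```

Nothing is computed until a value is requested, so asking for the first few triplets is cheap even though the full search space is large.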
    triplet_generator = triplet(1000)

    for x in triplet_generator:
        print(x)

    ------------------------------------------------------------
    (5, 4, 3)
    (10, 8, 6)
    (13, 12, 5)
    (15, 12, 9)
    .....
Conclusion
We must always try to reduce the memory footprint in Python. Iterators and generators provide
us with a way to do that with Lazy evaluation.
How do we choose which one to use? What we can do with generator expressions we could have
done with generators or iterators too.
There is no correct answer here. Whenever I face such a dilemma, I always think in terms of
functionality vs readability. Generally,
Functionality wise: Iterators>Generators>Generator Expressions.
Readability wise: Iterators<Generators<Generator Expressions.
It is not necessary that you end up using them in your code now. But I guess understanding how these
things work helps mitigate some of the confusion and panic one faces whenever these constructs
come up.
Chapter 5: How and Why to use f strings in Python3?
Python provides us with many styles of coding. And with time, Python has regularly come up with
new coding standards and tools that adhere even more to the coding standards in the Zen of Python.
And so this chapter is about f strings, which were introduced in Python 3.6.
    name = 'Andy'
    age = 20
    print(?)
    ----------------------------------------------------------------
    Output: I am Andy. I am 20 years old
a) Concatenation: The first option is to join the pieces with +, converting non-strings with str().
    name = 'Andy'
    age = 20
    print("I am " + name + ". I am " + str(age) + " years old")
    ----------------------------------------------------------------
    I am Andy. I am 20 years old
b) % Format: The second option is to use % formatting. But it also has its problems. For one, it is not
readable. You would need to look at the first %s and try to find the corresponding variable in the list
at the end. And imagine if you have a long list of variables that you may want to print.
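For reference, a small sketch of % formatting with the name and age variables used in this chapter. Each %s is matched by position to the tuple at the end, which is exactly the lookup the paragraph above complains about.

```python
name = 'Andy'
age = 20

# Placeholders are filled left to right from the tuple after the % operator
message = "I am %s. I am %s years old" % (name, age)
print(message)
```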
c) str.format(): Next comes the way that has been used in most Python 3 code and has become
the standard way of formatting strings in Python: str.format().
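A minimal sketch of the positional form, again assuming the name and age variables from this chapter:

```python
name = 'Andy'
age = 20

# Each {} is replaced, in order, by the arguments passed to format()
message = "I am {}. I am {} years old".format(name, age)
print(message)
```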
Here we use {} to denote the placeholder for each object in the list. It still has the same readability
problem, but we can also pass named values to str.format:
    data = {'name': 'Andy', 'age': 20}
    print("I am {name}. I am {age} years old".format(**data))
Since Python 3.6, we have a new formatting option, which makes it even more trivial. We could
simply use:
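A minimal sketch with the same name and age variables (requires Python 3.6 or later):

```python
name = 'Andy'
age = 20

# The f prefix lets {name} and {age} be evaluated in place
message = f"I am {name}. I am {age} years old"
print(message)
```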
We just prefix the string with f and use {} to include our variable names, and we get the
required result.
An added functionality that f string provides is that we can put expressions in the {} brackets. For
Example:
    num1 = 4
    num2 = 5
    print(f"The sum of {num1} and {num2} is {num1+num2}.")
    ---------------------------------------------------------------
    The sum of 4 and 5 is 9.
This is quite useful as you can use any sort of expression inside these brackets. The expression can
contain dictionaries or functions. A simple example:
    def totalFruits(apples, oranges):
        return apples + oranges

    data = {'name': 'Andy', 'age': 20}

    apples = 20
    oranges = 30

    print(f"{data['name']} has {totalFruits(apples,oranges)} fruits")
    ----------------------------------------------------------------
    Andy has 50 fruits
    num1 = 4
    num2 = 5
    print(f'''The sum of
    {num1} and
    {num2} is
    {num1+num2}.''')

    ---------------------------------------------------------------
    The sum of
    4 and
    5 is
    9.
An everyday use case while formatting strings is to format floats. You can do that with an f string as
follows:
    numFloat = 10.23456678
    print(f'Printing Float with 2 decimals: {numFloat:.2f}')

    -----------------------------------------------------------------
    Printing Float with 2 decimals: 10.23
Conclusion
Until recently, I had been using Python 2 for all my work, and so was not able to check out this new
feature.
But now, as I am shifting to Python 3, f strings have become my go-to syntax for formatting strings. They are
easy to write and read, and they can incorporate arbitrary expressions as well. In a way, this
new feature adheres to at least 3 PEP* concepts —
Beautiful is better than ugly, Simple is better than complex and Readability counts.
*https://www.python.org/dev/peps/pep-0020/
Chapter 6: Object Oriented Programming
Object-Oriented Programming or OOP can be a tough concept to understand for beginners. And
that’s mainly because it is not really explained in the right way in a lot of places. Normally a lot of
books start by explaining OOP by talking about the three big terms — Encapsulation, Inheritance
and Polymorphism. By the time the book gets around to explaining these topics, anyone who is just starting
would already feel lost.
So, I thought of making the concept a little easier for fellow programmers, Data Scientists and
Pythonistas. The way I intend to do this is by removing all the jargon and going through some examples.
I will start by explaining classes and objects. Then I will explain why classes are important in
various situations and how they solve some fundamental problems. In this way, the reader will
also be able to understand the three big terms by the end of the chapter.
This chapter is about explaining OOP the layman’s way.
    a = 2
    b = "Hello!"
We are creating an object a of class int holding the value 2 and an object b of class str holding
the value “Hello!”. In a way, these two particular classes are provided to us by default when we use
numbers or strings.
Apart from these, a lot of us end up working with classes and objects without even realizing it. For
example, you are actually using a class when you use any scikit-learn model.
    clf = RandomForestClassifier()
    clf.fit(X, y)
Here your classifier clf is an object and fit is a method defined in the class RandomForestClassifier
This property of classes is called encapsulation. From Wikipedia* — encapsulation refers to the
bundling of data with the methods that operate on that data, or the restricting of direct access to
some of an object’s components.
So here the str class bundles the data(“Hello!”) with all the methods that would operate on our data.
I would explain the second part of that statement by the end of the chapter. In the same way, the
*https://en.wikipedia.org/wiki/Encapsulation_(computer_programming)
Apart from this, Class usage can also help us to make the code much more modular and easy to
maintain. So say we were to create a library like Scikit-Learn. We need to create many models, and
each model will have a fit and predict method. If we don’t use classes, we will end up with a lot
of functions for each of our different models like:
    RFCFit
    RFCPredict
    SVCFit
    SVCPredict
    LRFit
    LRPredict

    and so on.
This sort of a code structure is just a nightmare to work with, and hence Scikit-Learn defines each
of the models as a class having the fit and predict methods.
Creating a Class
So, now we understand why to use classes and how they are so important, how do we really go
about using them? So, creating a class is pretty simple. Below is a boilerplate code for any class you
will end up writing:
    class myClass:
        def __init__(self, a, b):
            self.a = a
            self.b = b

        def somefunc(self, arg1, arg2):
            # SOME CODE HERE
            pass
We see a lot of new keywords here. The main ones are class, __init__ and self. So what are these?
Again, they are most easily explained with an example.
Suppose you are working at a bank that has many accounts. We can create a class named Account
that would be used to work with any account. For example, below I create an elementary toy class
Account which stores data for a user — namely account_name and balance. It also provides us with
two methods to deposit/withdraw money to/from the bank account. Do read through it. It follows
the same structure as the code above.
    class Account:
        def __init__(self, account_name, balance=0):
            self.account_name = account_name
            self.balance = balance

        def deposit(self, amount):
            self.balance += amount

        def withdraw(self, amount):
            if amount <= self.balance:
                self.balance -= amount
            else:
                print("Cannot Withdraw amounts as no funds!!!")
We can create an account with a name Rahul and having an amount of 100 using:
    myAccount = Account("Rahul", 100)
But how are the attributes balance and account_name already set to 100 and “Rahul” respectively?
We never called the __init__ method, so why did the object get these attributes? The answer here
is that __init__ is a magic method (there are a lot of other magic methods, which I will expand
on in my next chapter on Magic Methods) that gets run whenever we create the object. So when
we create myAccount, it automatically also runs the function __init__.
So now we understand __init__, let us try to deposit some money into our account. We can do this
by:
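A minimal, self-contained sketch of that call. The Account class is repeated in compact form (without withdraw) only so the snippet runs on its own.

```python
class Account:
    def __init__(self, account_name, balance=0):
        self.account_name = account_name
        self.balance = balance

    def deposit(self, amount):
        self.balance += amount

myAccount = Account("Rahul", 100)

# Only the amount is passed explicitly; Python supplies self
myAccount.deposit(100)
print(myAccount.balance)
```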
And our balance rose to 200. But did you notice that our function deposit needed two arguments,
namely self and amount, yet we only provided one, and still it works?
So, what is this self? The way I like to explain self is by calling the same function in a slightly
different way. Below, I call the same function deposit belonging to the class Account and provide it
with the myAccount object and the amount. And now the function takes two arguments, as it should.
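A sketch of that explicit form, again with a compact copy of Account so it runs on its own. The starting balance of 200 is an assumption standing in for the state after the earlier deposit.

```python
class Account:
    def __init__(self, account_name, balance=0):
        self.account_name = account_name
        self.balance = balance

    def deposit(self, amount):
        self.balance += amount

myAccount = Account("Rahul", 200)  # assumed balance after the earlier deposit

# Calling through the class: the object is passed explicitly as self
Account.deposit(myAccount, 100)
print(myAccount.balance)
```

Both spellings run the same function; myAccount.deposit(100) is simply the shorter way of writing Account.deposit(myAccount, 100).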
And our myAccount balance increases by 100 as expected. So it is the same function we have
called. Now, that could only happen if self and myAccount are exactly the same object. When
I call myAccount.deposit(100) Python provides the same object myAccount to the function call
as the argument self. And that is why self.balance in the function definition really refers to
myAccount.balance.
    class iPhone:
        def __init__(self, memory, user_id):
            self.memory = memory
            self.mobile_id = user_id

        def call(self, contactNum):
            # Some Implementation Here
            pass
Now, Apple plans to launch iPhone1 and this iPhone Model introduces a new functionality — The
ability to take a pic. One way to do this is to copy-paste the above code and create a new class
iPhone1 like:
    class iPhone1:
        def __init__(self, memory, user_id):
            self.memory = memory
            self.mobile_id = user_id
            self.pics = []

        def call(self, contactNum):
            # Some Implementation Here
            pass

        def click_pic(self):
            # Some Implementation Here
            pic_taken = ...
            self.pics.append(pic_taken)
But as you can see that is a lot of unnecessary duplication of code and Python has a solution for
removing that code duplication. One good way to write our iPhone1 class is:
    class iPhone1(iPhone):
        def __init__(self, memory, user_id):
            super().__init__(memory, user_id)
            self.pics = []

        def click_pic(self):
            # Some Implementation Here
            pic_taken = ...
            self.pics.append(pic_taken)
And that is the concept of inheritance. As per Wikipedia*: Inheritance is the mechanism of basing
an object or class upon another object or class retaining similar implementation. Simply put, iPhone1
has access to all the variables and methods defined in class iPhone now.
In this case, we don’t have to do any code duplication as we have inherited(taken) all the methods
from our parent class iPhone. Thus we don’t have to define the call function again. Also, thanks to
super, we don’t have to set mobile_id and memory ourselves in the __init__ function.
But what is this super().__init__(memory,user_id)?
In real life, your __init__ functions won’t be these nice two-line functions. You would need to
define a lot of variables/attributes in your class and copying pasting them for the child class (here
iPhone1) becomes cumbersome. Thus there exists super(). Here super().__init__() actually calls
the __init__ method of the parent iPhone Class here. So here when the __init__ function of class
iPhone1 runs it automatically sets the memory and user_id of the class using the __init__ function
of the parent class.
Where do we see this in ML/DS/DL? Below is how we create a PyTorch† model. This model
inherits everything from the nn.Module class and calls the __init__ function of that class using the
super call.
*https://en.wikipedia.org/wiki/Inheritance_(object-oriented_programming)
†https://mlwhiz.com/blog/2020/09/09/pytorch_guide/
    import torch.nn as nn  # provides nn.Module and nn.Linear

    class myNeuralNet(nn.Module):

        def __init__(self):
            super().__init__()
            # Define all your Layers Here
            self.lin1 = nn.Linear(784, 30)
            self.lin2 = nn.Linear(30, 10)

        def forward(self, x):
            # Connect the layer Outputs here to define the forward pass
            x = self.lin1(x)
            x = self.lin2(x)
            return x
But what is Polymorphism? We are getting better at understanding how classes work so I guess I
would try to explain Polymorphism now. Look at the below class.
    import math

    class Shape:
        def __init__(self, name):
            self.name = name

        def area(self):
            pass

        def getName(self):
            return self.name

    class Rectangle(Shape):
        def __init__(self, name, length, breadth):
            super().__init__(name)
            self.length = length
            self.breadth = breadth

        def area(self):
            return self.length * self.breadth

    class Square(Rectangle):
        def __init__(self, name, side):
            super().__init__(name, side, side)

    class Circle(Shape):
        # The listing is cut off here; this body is an assumed completion
        def __init__(self, name, radius):
            super().__init__(name)
            self.radius = radius

        def area(self):
            return math.pi * self.radius**2
Here we have our base class Shape and the other derived classes — Rectangle and Circle. Also, see
how we use multiple levels of inheritance in the Square class which is derived from Rectangle which
in turn is derived from Shape. Each of these classes has a function called area which is defined as
per the shape. So the concept that a function with the same name can do multiple things is made
possible through Polymorphism in Python. In fact, that is the literal meaning of Polymorphism:
“Something that takes many forms”. So here our function area takes multiple forms.
Another way that Polymorphism works with Python is with the isinstance function. So using the
above class, if we do:
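A compact sketch of such a check. The classes are trimmed copies of the ones above so the snippet runs on its own, and the object name mySquare with side 10 is chosen for illustration.

```python
class Shape:
    def __init__(self, name):
        self.name = name

class Rectangle(Shape):
    def __init__(self, name, length, breadth):
        super().__init__(name)
        self.length = length
        self.breadth = breadth

    def area(self):
        return self.length * self.breadth

class Square(Rectangle):
    def __init__(self, name, side):
        super().__init__(name, side, side)

mySquare = Square('mySquare', 10)

# isinstance is True for the object's own class and for every ancestor class
checks = [isinstance(mySquare, cls) for cls in (Square, Rectangle, Shape)]
print(checks)
print(mySquare.area())  # area is inherited from Rectangle
```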
Thus, the instance type of the object mySquare is Square, Rectangle and Shape. And hence the
object is polymorphic. This has a lot of good properties. For example, we can create a function
that works with a Shape object, and it will work with any of the derived classes (Square,
Circle, Rectangle etc.) by making use of Polymorphism.
But you would still be able to change the variable value using(Though not recommended),
You would also be able to use the method _privatefunc using myphone._privatefunc(). If you want
to avoid that you can use double underscores in front of the variable name. For example, below the
call to print(myphone.__memory) throws an error. Also, you are not able to change the internal data
of an object by using myphone.__memory = 1.
But, as you see you can access and modify these self.__memory values in your class definition in
the function setMemory for instance.
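A small sketch of the double-underscore behaviour, which Python implements through name mangling. The class name Phone and the getter getMemory are hypothetical; setMemory and __memory follow the names used in the text.

```python
class Phone:
    def __init__(self, memory):
        # Stored internally under the mangled name _Phone__memory
        self.__memory = memory

    def setMemory(self, memory):
        self.__memory = memory  # inside the class the short name works

    def getMemory(self):
        return self.__memory

myphone = Phone(64)

try:
    print(myphone.__memory)  # no mangling outside the class, so lookup fails
    read_failed = False
except AttributeError:
    read_failed = True

myphone.__memory = 1  # creates a new, unrelated attribute on the instance
after_outside_write = myphone.getMemory()

myphone.setMemory(128)
after_setter = myphone.getMemory()
```

Note that the outside assignment does not raise an error; it just leaves the internal value untouched.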
Conclusion
I hope this has been useful for you to understand classes. So, to summarize, in this chapter, we
learned about OOP and creating classes along with the various fundamentals of OOP:
To end this chapter, I would be giving an exercise for you to implement as I think it might clear
some concepts for you. Create a class that lets you manage 3d objects(sphere and cube) with
volumes and surface areas. The basic boilerplate code is given below:
    import math

    class Shape3d:
        def __init__(self, name):
            self.name = name

        def surfaceArea(self):
            pass

        def volume(self):
            pass

        def getName(self):
            return self.name

    class Cuboid():
        pass

    class Cube():
        pass

    class Sphere():
        pass
    import math

    class Shape3d:
        def __init__(self, name):
            self.name = name

        def surfaceArea(self):
            pass

        def volume(self):
            pass

        def getName(self):
            return self.name

    class Cuboid(Shape3d):
        def __init__(self, name, length, breadth, height):
            super().__init__(name)
            self.length = length
            self.breadth = breadth
            self.height = height

        def surfaceArea(self):
            return 2 * (self.length * self.breadth + self.breadth * self.height
                        + self.length * self.height)

        def volume(self):
            return self.length * self.breadth * self.height

    class Cube(Cuboid):
        def __init__(self, name, side):
            super().__init__(name, side, side, side)

    class Sphere(Shape3d):
        def __init__(self, name, radius):
            super().__init__(name)
            self.radius = radius

        def surfaceArea(self):
            # math.pi, since only the math module itself is imported
            return 4 * math.pi * self.radius**2

        def volume(self):
            return 4 / 3 * math.pi * self.radius**3
Chapter 7: Dunder Methods
In my last chapter, I talked about Object-Oriented Programming (OOP). And I specifically talked
about a single magic method, __init__, which is also called a constructor method in OOP
terminology.
The magic part of __init__ is that it gets called automatically whenever an object is created. But it
is by no means the only one. Python provides us with many other magic methods that we end
up using without even knowing about them. Ever used len(), print() or the [] operator on a list?
You have been using Dunder methods.
In this chapter, I will talk about five of the most used magic functions or “Dunder” methods.
In my last chapter, I talked about how everything in Python, from int and str to float, is an object.
We also learned how we could call methods on an object like this:
    Fname = "Rahul"

    # Now, we can use various methods defined in the string class using the below syntax

    Fname.lower()
But, as we know, we can also use the + operator to concatenate multiple strings.
    Lname = "Agarwal"
    print(Fname + Lname)
    ------------------------------------------------------------------
    RahulAgarwal
So, why does the addition operator work? I mean, how does the string object know what to do
when it encounters the plus sign? How do you write that in the str class? And the same + operation
happens differently in the case of integer objects. Thus the + operator behaves differently for
strings and integers (fancy people call this operator overloading).
So, can we add any two objects? Let's try to add two objects from our elementary Account class.
It fails as expected, since the + operator is not supported for objects of type Account. But we
can add support for + to our Account class using the magic method __add__:
    class Account:
        def __init__(self, account_name, balance=0):
            self.account_name = account_name
            self.balance = balance

        def __add__(self, acc):
            if isinstance(acc, Account):
                return self.balance + acc.balance
            raise Exception(f"{acc} is not of class Account")
Here we add a magic method __add__ to our class, which takes two arguments: self and acc. We
first check whether acc is an instance of Account and, if it is, we return the sum of the two
balances. If we add anything other than an Account object to an account, we are shown a
descriptive error. Let us try it:
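As a quick check of the behaviour described above, the following sketch repeats the Account class and adds two accounts. The account names and balances are made up for illustration.

```python
class Account:
    def __init__(self, account_name, balance=0):
        self.account_name = account_name
        self.balance = balance

    def __add__(self, acc):
        # Only another Account can be added to an Account
        if isinstance(acc, Account):
            return self.balance + acc.balance
        raise Exception(f"{acc} is not of class Account")


acc1 = Account("savings", 100)    # hypothetical example data
acc2 = Account("checking", 250)

total = acc1 + acc2               # Python calls acc1.__add__(acc2)
print(total)                      # 350

try:
    acc1 + 5                      # an int is not an Account
except Exception as err:
    error_message = str(err)
    print(error_message)          # 5 is not of class Account
```

Note that the sum is a plain number here, not a new Account; that is simply what this __add__ chooses to return.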
So, we can add any two objects. In fact, we also have different magic methods for a variety of other
operators.
As a running example, I will try to explain all these concepts by creating a class called Complex
to handle complex numbers. Don't worry, Complex is just the class's name, and I will try
to keep it as simple as possible.
So below, I create a simple method __add__ that adds two Complex numbers, or a Complex number
and an int/float. It uses the isinstance function to check whether the object being added is an int,
a float or a Complex, and then does the required addition based on that type. Do read the
comments.
    import math

    class Complex:
        def __init__(self, re=0, im=0):
            self.re = re
            self.im = im

        def __add__(self, other):
            # If an int or float is added, return a Complex number where the
            # float/int is added to the real part
            if isinstance(other, int) or isinstance(other, float):
                return Complex(self.re + other, self.im)
            # If a Complex number is added, return a new Complex number with the
            # real and imaginary parts added separately
            elif isinstance(other, Complex):
                return Complex(self.re + other.re, self.im + other.im)
            else:
                raise TypeError
    a = Complex(3, 4)
    b = Complex(4, 5)
    print(a + b)
You should now be able to understand the code below, which allows us to add, subtract, multiply and
divide complex numbers with each other as well as with scalars (float, int, etc.). See how these
methods, in turn, return a Complex number. The code also provides the functionality to compare two
complex numbers using __eq__, __lt__ and __gt__. Note that these comparisons use the magnitude
returned by value(), so two different complex numbers with the same magnitude compare as equal.
You really don't need to understand all of the complex number maths, but I have tried to use most
of these magic methods in this particular class. Maybe just read through the __add__ and __eq__ ones.
    import math

    class Complex:
        def __init__(self, re=0, im=0):
            self.re = re
            self.im = im

        def __add__(self, other):
            if isinstance(other, int) or isinstance(other, float):
                return Complex(self.re + other, self.im)
            elif isinstance(other, Complex):
                return Complex(self.re + other.re, self.im + other.im)
            else:
                raise TypeError

        def __sub__(self, other):
            if isinstance(other, int) or isinstance(other, float):
                return Complex(self.re - other, self.im)
            elif isinstance(other, Complex):
                return Complex(self.re - other.re, self.im - other.im)
            else:
                raise TypeError

        def __mul__(self, other):
            if isinstance(other, int) or isinstance(other, float):
                return Complex(self.re * other, self.im * other)
            elif isinstance(other, Complex):
                # (a+bi)*(c+di) = ac + adi + bic - bd
                return Complex(self.re * other.re - self.im * other.im,
                               self.re * other.im + self.im * other.re)
            else:
                raise TypeError

        def __truediv__(self, other):
            if isinstance(other, int) or isinstance(other, float):
                return Complex(self.re / other, self.im / other)
            elif isinstance(other, Complex):
                x = other.re
                y = other.im
                u = self.re
                v = self.im
                repart = 1 / (x**2 + y**2) * (u*x + v*y)
                impart = 1 / (x**2 + y**2) * (v*x - u*y)
                return Complex(repart, impart)
            else:
                raise TypeError

        def value(self):
            return math.sqrt(self.re**2 + self.im**2)

        def __eq__(self, other):
            if isinstance(other, int) or isinstance(other, float):
                return self.value() == other
            elif isinstance(other, Complex):
                return self.value() == other.value()
            else:
                raise TypeError

        def __lt__(self, other):
            if isinstance(other, int) or isinstance(other, float):
                return self.value() < other
            elif isinstance(other, Complex):
                return self.value() < other.value()
            else:
                raise TypeError

        def __gt__(self, other):
            if isinstance(other, int) or isinstance(other, float):
                return self.value() > other
            elif isinstance(other, Complex):
                return self.value() > other.value()
            else:
                raise TypeError
    class Complex:
        def __init__(self, re=0, im=0):
            self.re = re
            self.im = im

        .....
        .....

        def __str__(self):
            if self.im >= 0:
                return f"{self.re}+{self.im}i"
            else:
                return f"{self.re}{self.im}i"
So now, our object gets printed in a better way. But still, if we simply evaluate the object in a
notebook cell without calling print, the __str__ method is not called.
This is because we are not printing in that case, and thus the __str__ method doesn't get
called. Instead, another magic method called __repr__ gets called. We can just add this to our
class to get the same result as a print (we could also have implemented it differently). It's a dunder
method inside a dunder method. Pretty nice!!!
    def __repr__(self):
        return self.__str__()
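To make the difference between the two methods concrete, here is a minimal sketch with a trimmed-down copy of the Complex class that keeps only __init__, __str__ and __repr__. The sample values are made up.

```python
class Complex:
    def __init__(self, re=0, im=0):
        self.re = re
        self.im = im

    def __str__(self):
        if self.im >= 0:
            return f"{self.re}+{self.im}i"
        return f"{self.re}{self.im}i"

    def __repr__(self):
        # Reuse the string form for the notebook/REPL representation
        return self.__str__()


a = Complex(3, 4)
b = Complex(1, -2)

print(a)         # print() uses __str__         -> 3+4i
print(repr(b))   # repr() uses __repr__         -> 1-2i
print([a, b])    # containers use __repr__ too  -> [3+4i, 1-2i]
```

Without __repr__, the last two lines would fall back to the default form that shows the class name and a memory address.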
    class Complex:
        def __init__(self, re=0, im=0):
            self.re = re
            self.im = im

        ......
        ......

        def __len__(self):
            # This function return type needs to be an int
            return int(math.sqrt(self.re**2 + self.im**2))
This brings us to another set of dunder methods, called assignment dunder methods, which include
__iadd__, __isub__, __imul__, __itruediv__ and many others.
So if we just add the following method __iadd__ to our class, we would be able to do
assignment-based additions (+=) too.
    class Complex:
        .....
        def __iadd__(self, other):
            if isinstance(other, int) or isinstance(other, float):
                return Complex(self.re + other, self.im)
            elif isinstance(other, Complex):
                return Complex(self.re + other.re, self.im + other.im)
            else:
                raise TypeError
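A short usage sketch for +=. One point worth knowing: when a class has no __iadd__, Python falls back to __add__ for +=, so a separate __iadd__ mostly matters when you want different behaviour, such as updating the object in place. The variant below mutates self and returns it; that is an assumption made for this illustration rather than the only valid design.

```python
class Complex:
    def __init__(self, re=0, im=0):
        self.re = re
        self.im = im

    def __iadd__(self, other):
        # In-place variant (an assumption for this sketch):
        # update self and return the same object
        if isinstance(other, (int, float)):
            self.re += other
        elif isinstance(other, Complex):
            self.re += other.re
            self.im += other.im
        else:
            raise TypeError
        return self


a = Complex(3, 4)
alias = a             # a second name for the same object
a += Complex(1, 1)    # calls a.__iadd__(Complex(1, 1))
a += 2                # calls a.__iadd__(2)

print(a.re, a.im)     # 6 5
print(alias is a)     # True, because __iadd__ returned self
```

With a version that returns a new Complex instead, a would be rebound to a fresh object and alias would keep pointing at the old one.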
    class TransactionBook:
        def __init__(self, user_id, shares=None):
            self.user_id = user_id
            # Avoid a mutable default argument: create a fresh list per instance
            self.shares = [] if shares is None else shares

        def add_trade(self, name, quantity, buySell):
            self.shares.append([name, quantity, buySell])

        def __getitem__(self, i):
            return self.shares[i]
Do you notice the __getitem__ here? This allows us to use indexing with the [] operator on objects
of this particular class, just as we would on a list.
We can get the first trade done by the user or the second one based on the index we use. This is
just a simple example, but you can set your object up to get a lot more information when you use
indexing.
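A minimal sketch of indexing in action. The user id and trades are made up, and the class is repeated here so the snippet runs on its own.

```python
class TransactionBook:
    def __init__(self, user_id, shares=None):
        self.user_id = user_id
        self.shares = [] if shares is None else shares

    def add_trade(self, name, quantity, buySell):
        self.shares.append([name, quantity, buySell])

    def __getitem__(self, i):
        # Delegate indexing (and slicing) to the underlying list
        return self.shares[i]


book = TransactionBook("user_1")     # hypothetical example data
book.add_trade("AAPL", 10, "buy")
book.add_trade("MSFT", 5, "sell")

first_trade = book[0]     # calls book.__getitem__(0)
last_trade = book[-1]     # negative indexes work because the list handles them

print(first_trade)        # ['AAPL', 10, 'buy']
print(last_trade)         # ['MSFT', 5, 'sell']
```

Because __getitem__ is defined, a for loop over book also works: Python falls back to calling it with 0, 1, 2, ... until an IndexError is raised.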
Conclusion
Python is a magical language, and there are many constructs in Python that even advanced users
may not know about. Dunder methods might very well be one of them. I hope this chapter gives
you a good glimpse of the various dunder methods that Python offers and helps you understand how
to implement them yourself. If you want to know about even more dunder methods, do take a look at
this blog* from Rafe Kettler.
*https://rszalski.github.io/magicmethods/#comparisons
Chapter 8: Regexes Everywhere!!!
One of the main tasks while working with text data is to create a lot of text-based features. One
might want to find certain patterns in the text, such as emails or phone numbers in a large
text.
While achieving such functionality may sound fairly trivial, it is much simpler if we use the
power of Python's regex module.
For example, let's say you are tasked with finding the number of punctuation marks in a particular
piece of text. Using text from Dickens here. How do you normally go about it?
A simple enough way is to do something like:
    target = [';', '.', ',', '–']

    string = (
        "It was the best of times, it was the worst of times, it was the age of wisdom, "
        "it was the age of foolishness, it was the epoch of belief, it was the epoch of "
        "incredulity, it was the season of Light, it was the season of Darkness, it was the "
        "spring of hope, it was the winter of despair, we had everything before us, we had "
        "nothing before us, we were all going direct to Heaven, we were all going direct the "
        "other way – in short, the period was so far like the present period, that some of its "
        "noisiest authorities insisted on its being received, for good or for evil, in the "
        "superlative degree of comparison only."
    )

    num_puncts = 0
    for punct in target:
        if punct in string:
            num_puncts += string.count(punct)

    print(num_puncts)

    19
And that would be all fine if we didn't have the re module at our disposal. With re, it takes just
a couple of lines of code:
    import re
    pattern = r"[;.,–]"
    print(len(re.findall(pattern, string)))

    19
This chapter is about some of the most commonly used regex patterns and some regex
functions I end up using regularly. If you would like to see a more interactive version of
this chapter, you can check it out on my blog*.
What is regex?
In simple terms, a regular expression (regex) is used to find patterns in a given string.
The pattern we want to find could be anything.
We can create patterns that resemble an email or a mobile number. We can create patterns that find
words that start with a and end with z in a string.
In the above example:
    import re

    pattern = r'[;.,–]'
    print(len(re.findall(pattern, string)))

The pattern we wanted to find was [;.,–]. This pattern captures any of the 4 characters we
wanted to capture. I find regex101† a great tool for testing patterns. This is how the pattern looks
when applied to the target string.
*https://mlwhiz.com/blog/2019/09/01/regex/
†https://regex101.com/
As we can see, we are able to find all the occurrences of ;.,– in the target string as required.
I use the above tool whenever I need to test a regex. It is much faster than running a Python
program again and again, and much easier to debug.
So now we know that we can find patterns in a target string, but how do we really create these
patterns?
Creating Patterns
The first thing we need to learn while using regex is how to create patterns.
I will go through some of the most commonly used patterns one by one.
As you would think, the simplest pattern is a simple string.
    pattern = r'times'
    string = "It was the best of times, it was the worst of times."
    print(len(re.findall(pattern, string)))
But that is not very useful. To help with creating complex patterns regex provides us with special
characters/operators.
1. the [] operator
This is the one we used in our first example. We want to find one instance of any character
within these square brackets.
[abc]- will find all occurrences of a or b or c.
    pattern = r'[a-zA-Z]'
    string = "It was the best of times, it was the worst of times."
    print(len(re.findall(pattern, string)))
There are other functionalities in regex apart from .findall but we will get to them a little bit later.
The dot character is used to get a single instance of any character. What if we want to find more?
The plus character, +, is used to signify 1 or more instances of the preceding character.
The star character, *, is used to signify 0 or more instances of the preceding character.
For example, if we want to find all substrings that start with d and end with e, we can have zero
or more characters between d and e. We can use: d\w*e
If we want to find out all substrings that start with d and end with e with at least one character
between d and e, we can use: d\w+e
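The difference between the two quantifiers is easiest to see side by side. The sample text below is made up for this sketch.

```python
import re

text = "de dye done"   # made-up sample text

# * allows zero characters between d and e, so "de" itself matches
zero_or_more = re.findall(r'd\w*e', text)

# + requires at least one character between d and e, so "de" is skipped
one_or_more = re.findall(r'd\w+e', text)

print(zero_or_more)   # ['de', 'dye', 'done']
print(one_or_more)    # ['dye', 'done']
```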
6. Word Boundary
This is an important concept. Did you notice how I always matched substrings and never whole words
in the above examples?
So, what if we want to find all words that start with d? Can we use d\w* as the pattern? Not
quite, because it also matches a d that sits in the middle or at the end of a word. Instead, we use
a word boundary, \b, which marks the start of the word for us: \bd\w*
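To see why the boundary matters, the sketch below compares d\w* with \bd\w* on a made-up sentence.

```python
import re

text = "a bold dragon landed"   # made-up sample text

# Without a boundary, any d starts a match, even inside a word
substrings = re.findall(r'd\w*', text)

# \b only lets the match begin where a word begins
words = re.findall(r'\bd\w*', text)

print(substrings)   # ['d', 'dragon', 'ded']
print(words)        # ['dragon']
```

The raw string prefix (r'...') matters here: in a normal Python string, '\b' would be interpreted as a backspace character rather than a regex word boundary.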
Regex Functions
Till now we have only used the findall function from the re package, but it also supports a lot more
functions. Let us look into the functions one by one.
1. findall
We already have used findall. It is one of the regex functions I end up using most often. Let us
understand it a little more formally.
Input: Pattern and test string
Output: List of strings.
    #USAGE:

    pattern = r'[iI]t'
    string = "It was the best of times, it was the worst of times."

    matches = re.findall(pattern, string)

    for match in matches:
        print(match)

    It
    it
2. Search
    #USAGE:

    pattern = r'[iI]t'
    string = "It was the best of times, it was the worst of times."

    location = re.search(pattern, string)
    print(location)

    print(location.group())

    It
3. Substitute
This is another great functionality. When you work with NLP you sometimes need to substitute
integers with X’s. Or you might need to redact some document. Just the basic find and replace in
any of the text editors.
Input: search pattern, replacement pattern, and the target string
Output: Substituted string
    string = "It was the best of times, it was the worst of times."
    string = re.sub(r'times', r'life', string)
    print(string)
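The "substitute integers with X's" use case mentioned above can be sketched in the same way. The sample text is made up, and replacing every digit individually is just one possible masking scheme.

```python
import re

text = "Call 555-0142 before 17:30"   # made-up sample text

# \d matches a single digit; each one is replaced by the letter X
masked = re.sub(r'\d', 'X', text)

print(masked)   # Call XXX-XXXX before XX:XX
```

Using r'\d+' instead would collapse each run of digits into a single X, which hides the length of the number as well.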
1. PAN Numbers
In India, we have PAN numbers for tax identification, much like SSNs in the US. The basic
validation criterion for a PAN is that all its letters must be in uppercase and its characters must
be in the following order:
    <char><char><char><char><char><digit><digit><digit><digit><char>
    # Note: the digit range must use a plain hyphen, [0-9]
    match = re.search(r'[A-Z]{5}[0-9]{4}[A-Z]', 'ABcDE1234L')
    if match:
        print(True)
    else:
        print(False)

    False
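One thing to keep in mind is that re.search accepts a match anywhere in the string, so a longer string that merely contains a PAN-shaped chunk would also pass. If the whole string has to be a PAN, re.fullmatch is one option. The helper name is_valid_pan and the sample values below are hypothetical.

```python
import re

PAN_PATTERN = re.compile(r'[A-Z]{5}[0-9]{4}[A-Z]')

def is_valid_pan(candidate):
    """Return True only if the entire string has the PAN layout."""
    return PAN_PATTERN.fullmatch(candidate) is not None

print(is_valid_pan('ABCDE1234L'))       # True
print(is_valid_pan('ABcDE1234L'))       # False: lowercase letter
print(is_valid_pan('xxABCDE1234Lxx'))   # False: extra characters around it
```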
Sometimes we have a large text document and need to find instances of telephone numbers, email
IDs or domain names in it.
For example, suppose you have this text:
And you need to find all the primary domains in this text: askoxford.com; bnsf.com;
hydrogencarsnow.com; mrvc.indianrail.gov.in; web.archive.org
    match = re.findall(r'http(s:|:)\/\/(www.|ww2.|)([0-9a-z.A-Z-]*\.\w{2,3})', string)
    for elem in match:
        print(elem)
| is the or operator here, and findall returns tuples that contain the parts of the pattern captured inside ().
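To illustrate how those tuples look, here is a small sketch that applies a pattern of this shape to a made-up text. The URLs are placeholders and not the text from the example above.

```python
import re

# Made-up sample text with placeholder URLs
text = (
    "See https://www.example.com/page and http://ww2.sample.org/index.html, "
    "or the mirror at https://docs.example.net/start."
)

pattern = r'http(s:|:)\/\/(www.|ww2.|)([0-9a-z.A-Z-]*\.\w{2,3})'

# With several groups in the pattern, findall returns one tuple per match
matches = re.findall(pattern, text)
for elem in matches:
    print(elem)

# The domain is the third captured group of each tuple
domains = [elem[2] for elem in matches]
print(domains)   # ['example.com', 'sample.org', 'docs.example.net']
```

The second group is an empty string when the URL has no www. or ww2. prefix, which is what the trailing | inside (www.|ww2.|) allows.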
    match = re.findall(r'([\w0-9-._]+@[\w0-9-.]+[\w0-9]{2,3})', string)
These are advanced examples but if you try to understand these examples for yourself you should
be fine with the info provided.
Conclusion
While it might look a little daunting at first, regex provides a great degree of flexibility when it
comes to data manipulation, creating features and finding patterns.
I use it quite regularly when I work with text data and it can also be included while working on data
validation tasks.
I am also a fan of the regex101 tool and use it frequently to check my regexes. I wonder if I would
be using regexes as much if not for this awesome tool.
Chapter 9: Type Annotations with
Python
Till now, we have been using Python without specifying the types of variables. For example, we would
set a variable as a=10 and not int a = 10. In more technical jargon, Python is a
dynamically typed language, meaning that the variable type is determined at runtime based on the
value it is assigned. So, if a variable is assigned a value of 10, Python will interpret it as an integer,
while if a variable is assigned a value of 10.5, Python will consider it a float, and so on.
For most purposes, it works great and allows for great flexibility and simplicity in writing Python
programs. But, it can make it challenging to understand the behavior of a program, especially when
dealing with complex or large codebases.
To address this issue, Python introduced type hints in version 3.5 (PEP 484), which allow developers
to specify the types of function parameters and return values in the source code. Annotations for
variables and class attributes followed in version 3.6 (PEP 526).
In this chapter, we will take a closer look at type annotations in Python and explain how they are
used to improve the readability and reliability of your Python programs.
    x: int = 5
In this example, the x variable is annotated with the int type, which indicates that it should contain
an integer value. The type annotation comes after the variable name and before the assignment
operator (=), and it does not affect the value assigned to the variable. Please note that the variable
type is still determined at runtime based on the value assigned to it, but the type annotation
provides additional information about the expected type of the variable.
Similarly, to annotate the types of a function, you can write:
Here, we specify the types of parameters (x and y) and the type of return value of the add function.
The return type is specified using an arrow symbol.
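A function matching this description could look like the sketch below. The int types are an assumption made for illustration.

```python
def add(x: int, y: int) -> int:
    # Parameter types follow the colon; the return type follows the arrow
    return x + y


result = add(2, 3)
print(result)                # 5

# The annotations are stored on the function object and can be inspected
print(add.__annotations__)   # {'x': <class 'int'>, 'y': <class 'int'>, 'return': <class 'int'>}
```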
There are several other ways to use type annotations in Python:
1. Class attributes
You can specify the type of a class attribute using a type annotation. For example:
    class Person:
        name: str
        age: int
In this example, the name attribute is a string, and the age attribute is an integer.
1. Improved readability
Type annotations can improve the readability of your code by providing explicit information about
the expected type of a variable or function. This can make it easier for other developers (or even
yourself) to understand the behavior of a program, especially when dealing with complex or large
codebases. For instance, consider the following code without type annotations:
In this example, the add function takes two arguments, x and y, and returns their sum. However,
without type annotations, it is not clear what type of values the x and y arguments should have, or
what type of value the add function will return. This can lead to confusion or errors, especially if
the function is called with arguments of a different type than the ones it expects.
To address this issue, we can add type annotations to the add function as follows:
In this example, the add function is annotated with the types of its parameters (x and y) and its return
value (int). This makes it clear that the add function expects two integer arguments and returns an
integer value.
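The two versions discussed above can be sketched side by side. The name add_untyped is hypothetical and only used here to keep both versions in one snippet.

```python
def add_untyped(x, y):
    # Nothing tells the reader what x and y are supposed to be
    return x + y


def add(x: int, y: int) -> int:
    # The signature now documents the expected types
    return x + y


print(add(1, 2))        # 3

# Annotations are not enforced when the program runs,
# so a call with strings still executes:
print(add("a", "b"))    # ab
```

The second call is exactly the kind of mismatch a static type checker such as mypy is meant to report before the program runs.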
You can then fix these type errors and re-run mypy until all of the type errors have been resolved.
Using static type checking with mypy can help you catch type errors early in the development process
and ensure that your code is correct and stable.
In this example, the add function expects two arguments (x and y) that can be either integers or floats
and returns a value that can be either an integer or a float. The Union type allows you to specify
multiple possible types, which can be helpful in situations where a value can be one of several
different types.
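A minimal sketch of such a function, assuming the signature described above:

```python
from typing import Union


def add(x: Union[int, float], y: Union[int, float]) -> Union[int, float]:
    # Each argument may be an int or a float, and so may the result
    return x + y


print(add(1, 2))      # 3
print(add(1.5, 2))    # 3.5
```

On Python 3.10 and later, the same annotation can also be written as int | float.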
In this example, the greet function expects a string argument (name) and an optional string argument
(greeting). The greeting argument is annotated with Optional[str] and has a default value
of 'Hello,', so it does not need to be provided when the function is called. Strictly speaking,
Optional[str] means that the value may be either a str or None; it is the default value that allows
the argument to be omitted. The Optional type is useful for an argument or return value that may
legitimately be None.
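A sketch of a greet function along these lines. The function body, including how None is handled, is an assumption made for illustration.

```python
from typing import Optional


def greet(name: str, greeting: Optional[str] = 'Hello,') -> str:
    # Optional[str] allows a str or None; None means "no greeting" in this sketch
    if greeting is None:
        return name
    return f"{greeting} {name}"


print(greet("Rahul"))               # Hello, Rahul
print(greet("Rahul", "Welcome,"))   # Welcome, Rahul
print(greet("Rahul", None))         # Rahul
```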
In this example, the process function expects an argument (x) that can be any type. The Any type
is often used when a value can be any type, and it is not necessary or desirable to specify the exact
type.
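A sketch of a process function annotated with Any. What the function does with its argument is an assumption made for illustration.

```python
from typing import Any


def process(x: Any) -> str:
    # Any tells a type checker to accept every type for x
    return f"{type(x).__name__}: {x!r}"


print(process(10))        # int: 10
print(process("hi"))      # str: 'hi'
print(process([1, 2]))    # list: [1, 2]
```

Since Any switches off checking for that value, it is usually worth reserving it for cases where a more precise type really is not available.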
Summary
• Type annotations in Python are optional and do not affect the runtime behavior of a program.
• They provide information about the type of a variable or function for static analysis tools and,
through libraries such as typeguard, can also be checked at runtime.
• They can improve the readability and reliability of Python programs by providing explicit
information about the expected type of a variable or function, enabling static analysis tools to
detect potential errors, and enabling type specialization.
• Type-checking tools can be used to enforce the type annotations in your code: mypy validates
them statically, before the program runs, while typeguard validates them at runtime.
Further Reading
• The Python documentation* provides detailed information about type annotations in Python,
including the syntax, semantics, and use cases.
• The mypy documentation† explains how to use mypy to enable type checking and type
specialization in Python.
• The typeguard documentation‡ explains how to use typeguard to validate the types of your
variables and functions at runtime.
• The PEP 484§ describes the design and motivation of type annotations in Python.
That concludes this chapter on Python’s type annotations. I hope you have learned something new
and valuable about type annotations in Python.
*https://docs.python.org/3/library/typing.html
†http://mypy-lang.org/index.html
‡https://typeguard.readthedocs.io/en/latest/
§https://peps.python.org/pep-0484/