Advancedpythontips Updated

The document discusses advanced Python concepts across multiple chapters, including minimizing for loops, defaultdict, args and kwargs, iterators and generators, f-strings, object oriented programming, regexes, and type annotations. It provides examples and explanations of these concepts.


Advanced Python Tips

A Simple Book on Advanced Python concepts

Rahul Agarwal
This book is for sale at http://leanpub.com/advancedpythontips

This version was published on 2022-12-27

This is a Leanpub book. Leanpub empowers authors and publishers with the Lean Publishing
process. Lean Publishing is the act of publishing an in-progress ebook using lightweight tools and
many iterations to get reader feedback, pivot until you have the right book and build traction once
you do.

© 2022 Rahul Agarwal


Contents

About This Book

About the Author

Preface

Chapter 1: Minimize for loop usage in Python
  Try Using Dictionary Comprehension
  Conclusion

Chapter 2: Python defaultdict and Counter
  So, why ever use defaultdict ?
  Conclusion

Chapter 3: *args, **kwargs, decorators for Data Scientists
  What are *args?
  What are **kwargs?
  What are Decorators?
  Connecting all the pieces
  Conclusion

Chapter 4: Use Iterators, Generators, and Generator Expressions
  The Problem Statement:
  The Iterator Solution
  The Generator Solution
  Generator Expression Solution
  Conclusion

Chapter 5: How and Why to use f strings in Python3?
  3 Common Ways of Printing:
  The Fourth Way with f
  Conclusion

Chapter 6: Object Oriented Programming
  What are Objects and Classes?
  But Why Classes?
  Creating a Class
  But, still, some problems remain
  Some More Info:
  Conclusion

Chapter 7: Dunder Methods
  1. Operator Magic methods: __add__, __sub__, __mul__, __truediv__, __lt__, __gt__
  2. But why does the complex number print as this random string? __str__ and __repr__
  3. I am getting you. So, Is that how the len method works too? __len__
  4. But what about the Assignment operations?
  5. Can your class object support indexing?
  Conclusion

Chapter 8: Regexes Everywhere!!!
  What is regex?
  Creating Patterns
  Regex Functions
  Some Case Studies:
  Conclusion

Chapter 9: Type Annotations with Python
  How do I use them?
  But why even use Type Annotations?
  Advanced techniques for using type annotations in Python
  Summary
  Further Reading
About This Book
This is the ‘Forever Edition’ on LeanPub*. That means the book is published as it’s written,
and you may see periodic updates. If you purchased this book, thank you. Know that writing a book
like this takes hundreds of hours of effort so it is always nice when someone recognizes your effort.
Please treat your copy of the book as your own personal copy; it isn’t to be uploaded anywhere, and
you aren’t meant to give copies to other people. I’ve made sure to provide a DRM-free file (excepting
any DRM added by a bookseller other than LeanPub) so that you can use your copy any way that’s
convenient for you. I appreciate your respecting my rights and not making unauthorized copies of
this work.
If you got this book for free from someplace, know that you are making it difficult for me to write
books. When I can’t make even a small amount of money from my books, I’m encouraged to stop
writing them. If you find this book useful, I would greatly appreciate you purchasing a copy from
LeanPub.com or another bookseller. When you do, you’ll be letting me know that books like this are
useful to you, and that you want people like me to continue creating them. -MLWhiz
*http://leanpub.com/
About the Author
I am Rahul Agarwal (MLWhiz), a Data Scientist Consultant and Machine Learning Engineer based
in London. Previously, I have worked at startups like Fractal and MyCityWay and conglomerates like
Citi and Walmart. I started my blog mlwhiz.com to deepen my own understanding
of new things while helping others learn about them. I also write for publications on Medium like
Towards Data Science and HackerNoon.
Personally, I am tool agnostic. I like learning new tools and constantly work to add new skills
as I face new problems that cannot be solved with my current set of techniques. But the
tools that get most of my work done currently are Python and Spark. I also really like working with
data-intensive problems and am constantly in search of new ideas to work on.
Preface
Learning a language is easy. Whenever I start with a new language, I focus on a few things in
the below order, and it is a breeze to get started with writing code in any language.

• Operators and Data Types: +,-,int,float,str


• Conditional statements: if,else,case,switch
• Loops: For, while
• Data structures: List, Array, Dict, Hashmaps
• Define Function

However, learning to write a language and writing a language in an optimized way are two
different things.
Every Language has some ingredients which make it unique.
Yet, a programmer new to any language will always do some forced overfitting. A Java
programmer who is new to Python, for example, might write this code to add the numbers in a list.

    x=[1,2,3,4,5]
    sum_x = 0
    for i in range(len(x)):
        sum_x+=x[i]

While a Python programmer will naturally do this:

    sum_x = sum(x)

In this book, I will explain some simple constructs provided by Python, some essential tips, and some
use cases I come across regularly in my Data Science work. Most of the book is of a practical nature
and you will find it brimming with examples.
This book is about efficient and readable code.
If you like this book, I would appreciate it if you could buy* the paid version here.
*htps://somehyperlink.com
Chapter 1: Minimize for loop usage in Python
There are many ways to write a for loop in Python, and a beginner may get confused about which to use.
Let me explain this with a simple example.
Suppose you want to take the sum of squares of the values in a list.
This is a real problem we face in machine learning whenever we want to calculate the distance
between two points in n dimensions.
You can do this easily using loops.
In fact, I will show you three ways to do the same task that I have seen people use and let
you choose for yourself which you find the best.

    x = [1,3,5,7,9]
    sum_squared = 0

    for i in range(len(x)):
        sum_squared+=x[i]**2

Whenever I see the above code in a Python codebase, I understand that the person has come from
a C or Java background.
A slightly more pythonic way of doing the same thing is:

    x = [1,3,5,7,9]
    sum_squared = 0

    for y in x:
        sum_squared+=y**2

Better.
I didn’t index the list. And my code is more readable.
But still, the pythonic way to do it is in one line.

    x = [1,3,5,7,9]
    sum_squared = sum([y**2 for y in x])

This approach is called List Comprehension, and this may very well be one of the reasons
that I love Python.
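
As a minimal sketch of the distance use case mentioned at the start of the chapter, the same one-line pattern can compute the Euclidean distance between two points. The function name euclidean_distance and the sample points are assumptions made for illustration; they are not from the original text.

```python
import math

def euclidean_distance(p, q):
    """Distance between two points given as equal-length sequences."""
    # zip pairs up the coordinates, so no index is needed.
    # A generator expression feeds sum() without building a temporary list.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

print(euclidean_distance([0, 0], [3, 4]))  # 5.0
```

Passing a generator expression to sum works the same way as passing a list comprehension; it just skips the intermediate list.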
You can also use if in a list comprehension.
Let’s say we wanted a list of squared numbers for even numbers only.

    x = [1,2,3,4,5,6,7,8,9]
    even_squared = [y**2 for y in x if y%2==0]
    --------------------------------------------
    [4, 16, 36, 64]

if-else?
What if we wanted to have the number squared for even and cubed for odd?

    x = [1,2,3,4,5,6,7,8,9]
    squared_cubed = [y**2 if y%2==0 else y**3 for y in x]
    --------------------------------------------
    [1, 4, 27, 16, 125, 36, 343, 64, 729]

Great!!!

So follow a few simple guidelines. Whenever you feel like writing a for statement, ask
yourself the following questions:

• Can it be done without a for loop? That is the most Pythonic option.
• Can it be done using a list comprehension? If yes, use it.
• Can I do it without indexing arrays? If not, think about using enumerate.

What is enumerate?
Sometimes we need both the index in an array as well as the value in an array.
In such cases, I prefer to use enumerate rather than indexing the list.

    L = ['blue', 'yellow', 'orange']
    for i, val in enumerate(L):
        print("index is %d and value is %s" % (i, val))
    ---------------------------------------------------------------
    index is 0 and value is blue
    index is 1 and value is yellow
    index is 2 and value is orange

The rule is:

Never index a list, if you can do without it.
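
Two small variations on this rule may be useful. The sketch below shows the start argument of enumerate and the built-in zip for walking two lists in parallel. The prices list is a made-up example, not something from the original text.

```python
colors = ['blue', 'yellow', 'orange']
prices = [10, 20, 30]  # hypothetical values, for illustration only

# enumerate accepts a start argument, which is handy for 1-based numbering.
numbered = [f"{i}. {color}" for i, color in enumerate(colors, start=1)]
print(numbered)  # ['1. blue', '2. yellow', '3. orange']

# zip walks two lists in parallel without any indexing.
pairs = [(color, price) for color, price in zip(colors, prices)]
print(pairs)  # [('blue', 10), ('yellow', 20), ('orange', 30)]
```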

Try Using Dictionary Comprehension


Also try using dictionary comprehension, which has been available since Python 2.7 and 3.0. The syntax
is pretty similar to list comprehension.
Let me explain using an example. I want to get a dictionary with (key: squared value) for every value
in x.

    x = [1,2,3,4,5,6,7,8,9]
    {k:k**2 for k in x}
    ---------------------------------------------------------
    {1: 1, 2: 4, 3: 9, 4: 16, 5: 25, 6: 36, 7: 49, 8: 64, 9: 81}

What if I want a dict only for even values?

    x = [1,2,3,4,5,6,7,8,9]
    {k:k**2 for k in x if k%2==0}
    ---------------------------------------------------------
    {2: 4, 4: 16, 6: 36, 8: 64}

What if we want squared value for even key and cubed number for the odd key?

    x = [1,2,3,4,5,6,7,8,9]
    {k:k**2 if k%2==0 else k**3 for k in x}
    ---------------------------------------------------------
    {1: 1, 2: 4, 3: 27, 4: 16, 5: 125, 6: 36, 7: 343, 8: 64, 9: 729}
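
Dictionary comprehensions also work on pairs of values, not just on a single list. The sketch below builds a dict from two lists, inverts it, and filters it. The names and scores are hypothetical data chosen only for this illustration.

```python
names = ['a', 'b', 'c']   # hypothetical keys
scores = [90, 75, 82]     # hypothetical values

# Build a dict from two parallel lists.
score_by_name = {name: score for name, score in zip(names, scores)}

# Swap keys and values (safe here because the scores are unique).
name_by_score = {score: name for name, score in score_by_name.items()}

# Keep only the entries that satisfy a condition.
passed = {name: score for name, score in score_by_name.items() if score >= 80}

print(score_by_name)  # {'a': 90, 'b': 75, 'c': 82}
print(name_by_score)  # {90: 'a', 75: 'b', 82: 'c'}
print(passed)         # {'a': 90, 'c': 82}
```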

Conclusion
To conclude, I will say that while it might seem easy to transfer the knowledge you acquired from
other languages to Python, you won’t be able to appreciate the beauty of Python if you keep doing
that. Python is much more powerful when we use its ways and decidedly much more fun.
So, use list comprehensions and dict comprehensions when you need a for loop. Use enumerate
if you need the array index.

Avoid for loops like the plague

Your code will be much more readable and maintainable in the long run.
Chapter 2: Python defaultdict and Counter

Let’s say I need to count the number of word occurrences in a piece of text. Maybe for a book
like Hamlet. How could I do that?
Python always provides us with multiple ways to do the same thing. But only one way that I find
elegant.
This is a naive Python implementation using the dict object.

    text = ("I need to count the number of word occurrences in a piece of text. How could "
            "I do that? Python provides us with multiple ways to do the same thing. But only one "
            "way I find beautiful.")

    word_count_dict = {}
    for w in text.split(" "):
        if w in word_count_dict:
            word_count_dict[w]+=1
        else:
            word_count_dict[w]=1

We could use defaultdict to reduce the number of lines in the code.

    from collections import defaultdict

    word_count_dict = defaultdict(int)
    for w in text.split(" "):
        word_count_dict[w]+=1

We could also have used Counter to do this.

    from collections import Counter

    word_count_dict = Counter()
    for w in text.split(" "):
        word_count_dict[w]+=1

If we use Counter, we can also get the most common words using a simple function.

    word_count_dict.most_common(3)
    ---------------------------------------------------------------
    [('I', 3), ('to', 2), ('the', 2)]
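
Since Counter accepts any iterable, the explicit loop is optional. The sketch below counts words in one call and, as an optional extra step, normalizes case and punctuation first. The sample sentence is a short made-up string, and the normalization is an assumption about what a reader might want, not part of the original example.

```python
from collections import Counter
import string

text = "I do that. I do this. Do I?"  # short hypothetical sample text

# Counter accepts any iterable, so no explicit loop is required.
raw_counts = Counter(text.split())

# Optional normalization: lowercase and strip surrounding punctuation,
# so that "Do", "do" and "I?" are counted together with "do" and "i".
normalized = Counter(w.strip(string.punctuation).lower() for w in text.split())

print(raw_counts.most_common(2))   # [('I', 2), ('do', 2)]
print(normalized.most_common(2))   # [('i', 3), ('do', 3)]
```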

Other use cases of Counter:



    # Count Characters
    Counter('abccccccddddd')
    ---------------------------------------------------------------
    Counter({'a': 1, 'b': 1, 'c': 6, 'd': 5})

    # Count List elements
    Counter([1,2,3,4,5,1,2])
    ---------------------------------------------------------------
    Counter({1: 2, 2: 2, 3: 1, 4: 1, 5: 1})

So, why ever use defaultdict ?


Notice that in Counter, the value is always an integer.
What if we wanted to parse through a list of tuples and create a dictionary that maps each key to a list
of values?
The main functionality provided by a defaultdict is that when a key is not found, it creates a default
value for it by calling the factory we passed in: int gives 0, list gives an empty list, and set gives an empty set.

    s = [('color', 'blue'), ('color', 'orange'), ('color', 'yellow'),
         ('fruit', 'banana'), ('fruit', 'orange'), ('fruit', 'banana')]

    d = defaultdict(list)

    for k, v in s:
        d[k].append(v)

    print(d)
    ---------------------------------------------------------------
    defaultdict(<class 'list'>, {'color': ['blue', 'orange', 'yellow'], 'fruit': ['banana', 'orange', 'banana']})

banana appears twice under fruit. If we only want unique values, we could use a set instead:

    d = defaultdict(set)

    for k, v in s:
        d[k].add(v)

    print(d)
    ---------------------------------------------------------------
    defaultdict(<class 'set'>, {'color': {'yellow', 'blue', 'orange'}, 'fruit': {'banana', 'orange'}})
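
To make the default-value behavior concrete, here is a minimal sketch of what happens when a missing key is touched, together with the plain-dict equivalent using setdefault. The key names are invented for the example. One caveat worth knowing: merely reading a missing key from a defaultdict inserts it.

```python
from collections import defaultdict

counts = defaultdict(int)
counts['missing'] += 1   # int() supplies 0, then 1 is added
print(counts['missing'])  # 1

# The plain-dict equivalent of defaultdict(list) uses setdefault.
plain = {}
plain.setdefault('color', []).append('blue')
print(plain)  # {'color': ['blue']}

# Caveat: just reading a missing key creates it.
probe = defaultdict(list)
_ = probe['never_set']
print(dict(probe))  # {'never_set': []}

# Use "in" (or .get) to check for a key without creating it.
print('other' in probe)  # False
print(len(probe))        # 1
```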

Conclusion
To conclude, I will say that there is always a beautiful way to do anything in Python. Search
for it before you write code. Going to StackOverflow is okay. I go there a lot of times when I get
stuck. Always Remember:

Creating a function for what already is provided is not pythonic.


Chapter 3: *args, **kwargs, decorators for Data Scientists
Python has a lot of constructs that are reasonably easy to learn and use in our code. Then there are
some constructs which always confuse us when we encounter them in our code.
Then there are some that even seasoned programmers struggle to understand. *args, **kwargs and
decorators are some constructs that fall into this category.
I guess a lot of my data science friends have faced them too.
Most of the seaborn functions use *args and **kwargs in some way or other.

Or what about decorators?


Think of every warning you have seen saying that some function will be deprecated in the next version. The sklearn
package uses decorators for that. You can see @deprecated in the source code. That is a decorator
function.

What are *args?


In simple terms, you can use *args to give an arbitrary number of inputs to your function.

A simple example:
Let us say we have to create a function that adds two numbers. We can do this easily in Python.

    def adder(x,y):
        return x+y

What if we want to create a function to add three variables?

    def adder(x,y,z):
        return x+y+z

What if we want the same function to add an unknown number of variables? Please note that we
can use *args or *argv or *anyOtherName to do this. It is the * that matters.

    def adder(*args):
        result = 0
        for arg in args:
            result+=arg
        return result

What *args does is that it takes all your passed arguments and provides a variable length argument
list to the function which you can use as you want.
Now you can use the same function as follows:

    adder(1,2)
    adder(1,2,3)
    adder(1,2,5,7,8,9,100)

and so on.
Now, have you ever wondered how the print function in Python can take so many arguments?
*args
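
The star also works in the other direction: at the call site it unpacks a list or tuple into separate positional arguments. A minimal sketch follows; the numbers list is a made-up example and adder is redefined here (using sum) only to keep the snippet self-contained.

```python
def adder(*args):
    # Inside the function, args is an ordinary tuple.
    return sum(args)

numbers = [1, 2, 5, 7]

# The * at the call site spreads the list into separate arguments,
# so this is the same as adder(1, 2, 5, 7).
print(adder(*numbers))      # 15

# print itself accepts any number of positional arguments.
print(*numbers, sep=", ")   # 1, 2, 5, 7
```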

What are **kwargs?

In simple terms, you can use **kwargs to give an arbitrary number of keyword arguments to your
function and access them using a dictionary.

A simple example:
Let’s say you want to create a print function that can take a name and age as input and print that.

    def myprint(name,age):
        print(f'{name} is {age} years old')

Simple. Let us now say you want the same function to take two names and two ages.

    def myprint(name1,age1,name2,age2):
        print(f'{name1} is {age1} years old')
        print(f'{name2} is {age2} years old')

You guessed right, my next question is: What if I don’t know how many arguments I am going
to need?
Can I use *args? Guess not, since the pairing of name and age is essential. We don’t want to write “28 is
Michael years old”.
This is where **kwargs comes into the picture.

    def myprint(**kwargs):
        for k,v in kwargs.items():
            print(f'{k} is {v} years old')

You can call this function using:

    myprint(Sansa=20,Tyrion=40,Arya=17)

    Output:
    -----------------------------------
    Sansa is 20 years old
    Tyrion is 40 years old
    Arya is 17 years old

Remember, we never defined Sansa or Arya or Tyrion as our function's arguments.


That is a pretty powerful concept. And many programmers utilize this pretty cleverly when they
write wrapper libraries.
For example, the seaborn.scatterplot function wraps the plt.scatter function from Matplotlib.
Essentially, using *args and **kwargs we can provide all the arguments that plt.scatter can take
to seaborn.scatterplot as well.
This can save a lot of coding effort and also makes the code future proof. If at any time in the future
plt.scatter starts accepting any new arguments, the seaborn.scatterplot function will still work.
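
The forwarding idea can be sketched without any plotting library. In the snippet below, draw_points stands in for a low-level function and styled_points for a wrapper around it. Both names and their parameters are hypothetical and chosen only for illustration; this is not how seaborn is actually implemented.

```python
def draw_points(x, y, color="black", size=1):
    """Hypothetical low-level function standing in for a plotting call."""
    return f"{len(x)} points, color={color}, size={size}"

def styled_points(x, y, **kwargs):
    """Hypothetical wrapper: sets its own default and forwards the rest."""
    # Only fill in a default; an explicit color from the caller still wins.
    kwargs.setdefault("color", "blue")
    # ** at the call site unpacks the dict back into keyword arguments.
    return draw_points(x, y, **kwargs)

print(styled_points([1, 2], [3, 4]))
# 2 points, color=blue, size=1
print(styled_points([1, 2], [3, 4], size=5, color="red"))
# 2 points, color=red, size=5
```

If draw_points later gained a new keyword argument, styled_points would accept it without any change, which is the future-proofing the text describes.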

What are Decorators?

In simple terms: Decorators are functions that wrap another function thus modifying its
behavior.

A simple example:
Let us say we want to add custom functionality to some of our functions. The functionality is that
whenever the function gets called the “function name begins” is printed and whenever the function
ends the “function name ends” and time taken by the function is printed.
Let us assume our function is:

    def somefunc(a,b):
        output = a+b
        return output

We can add some print lines to all our functions to achieve this.

    import time
    def somefunc(a,b):
        print("somefunc begins")
        start_time = time.time()
        output = a+b
        print("somefunc ends in ",time.time()-start_time, "secs")
        return output

    out = somefunc(4,5)

    OUTPUT:
    -------------------------------------------
    somefunc begins
    somefunc ends in 9.5367431640625e-07 secs

But, Can we do better?


This is where decorators excel. We can use decorators to wrap any function.

    import time
    from functools import wraps

    def timer(func):
        @wraps(func)
        def wrapper(a,b):
            print(f"{func.__name__!r} begins")
            start_time = time.time()
            # keep the result so the decorated function still returns it
            result = func(a,b)
            print(f"{func.__name__!r} ends in {time.time()-start_time} secs")
            return result
        return wrapper

This is how we can define any decorator. wraps from functools copies the wrapped function's name and
docstring onto the wrapper, so the decorated function still identifies itself by its original name. In essence,
we do something before the function is called and something after it is called in the above
decorator.
We can now use this timer decorator to decorate our function somefunc

    @timer
    def somefunc(a,b):
        output = a+b
        return output

Now calling this function, we get:



    a = somefunc(4,5)

    Output
    ---------------------------------------------
    'somefunc' begins
    'somefunc' ends in 2.86102294921875e-06 secs

Now we can add @timer above each function for which we want to have the time printed.
And we are done.
Really?

Connecting all the pieces

What if our function takes three arguments? Or many arguments?


This is where whatever we have learned till now connects: we use *args and **kwargs.
We change our decorator function as:

    import time
    from functools import wraps

    def timer(func):
        @wraps(func)
        def wrapper(*args,**kwargs):
            print(f"{func.__name__!r} begins")
            start_time = time.time()
            # keep the result so the decorated function still returns it
            result = func(*args,**kwargs)
            print(f"{func.__name__!r} ends in {time.time()-start_time} secs")
            return result
        return wrapper

Now our function can take any number of arguments, and our decorator will still work.

Isn’t Python Beautiful?

In my view, decorators could be pretty helpful. I provided only one use case of decorators, but there
are several ways one can use them.
You can use a decorator to debug code by checking which arguments go in a function. Or a decorator
could be used to count the number of times a particular function has been called. This could help
with counting recursive calls.
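
As a sketch of the call-counting idea, the decorator below stores a counter as an attribute on the wrapper. The names count_calls and fib are assumptions made for this illustration, and storing state on the function object is just one of several possible designs.

```python
from functools import wraps

def count_calls(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        wrapper.calls += 1            # state lives on the wrapper itself
        return func(*args, **kwargs)
    wrapper.calls = 0
    return wrapper

@count_calls
def fib(n):
    # The recursive calls go through the decorated name, so they are counted too.
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(10))        # 55
print(fib.calls)      # 177
print(fib.__name__)   # fib (preserved by wraps)
```

The large call count for such a small input is a quick way to see how wasteful naive recursion is.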

Conclusion
In this chapter, I talked about some of the constructs you can find in python source code and how
you can understand them.
It is not necessary that you end up using them in your code now. But I guess understanding how these
things work helps mitigate some of the confusion and panic one faces whenever these constructs
come up.
Chapter 4: Use Iterators, Generators, and Generator Expressions
Python in many ways has made our life easier when it comes to programming.
With its many libraries and functionalities, sometimes we forget to focus on some of the useful
things it offers.
Among these are generators and generator expressions. I put off learning about them
for a long time, but they are useful.
Have you ever encountered yield in Python code and not known what it meant? Or wondered what an
iterator or a generator is and why we use one? Or have you used ImageDataGenerator while
working with Keras without understanding what is going on behind the scenes? Then this chapter is for
you.

The Problem Statement:

Let us say that we need to run a for loop over 10 million prime numbers.
I am using prime numbers here because they are easy to understand, but the same idea extends to cases where
we have to process a lot of images or files in a database or big data.
How would you proceed with such a problem?
Simple. We can create a list and keep all the prime numbers there.
Really? Think of the memory such a list would occupy.
It would be great if we had something that could just keep the last prime number we have checked
and return just the next prime number.
That is where iterators could help us.

The Iterator Solution


We create a class named Primes and use it to generate primes.

    def check_prime(number):
        for divisor in range(2, int(number ** 0.5) + 1):
            if number % divisor == 0:
                return False
        return True

    class Primes:
        def __init__(self, max):
            # the maximum number of primes we want generated
            self.max = max
            # start with this number to check if it is a prime.
            self.number = 1
            # No of primes generated yet. We want to StopIteration when it reaches max
            self.primes_generated = 0
        def __iter__(self):
            return self
        def __next__(self):
            self.number += 1
            if self.primes_generated >= self.max:
                raise StopIteration
            elif check_prime(self.number):
                self.primes_generated+=1
                return self.number
            else:
                return self.__next__()

We can then use this as:

    prime_generator = Primes(10000000)

    for x in prime_generator:
        pass  # Process Here

Here I have defined an iterator. Many familiar tools hand out values lazily in the same way, for example the
iterator behind range (xrange in Python 2) or Keras's ImageDataGenerator.
Every iterator needs to have:

1. an __iter__ method that returns self,
2. a __next__ method that returns the next value, and
3. a StopIteration exception, raised from __next__, that signals the end of the iteration.

Every iterator takes the above form and we can tweak the functions to our liking in this boilerplate
code to do what we want to do.
Notice that we don’t keep all the prime numbers in memory, just the state of the iterator:

• the last number we checked and
• how many primes we have returned already.
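
To see how these pieces fit together, here is a rough sketch of what a for loop does with an iterator behind the scenes. Countdown is a hypothetical class written only for this illustration; it follows the same protocol as the Primes class above.

```python
class Countdown:
    """Hypothetical iterator that counts down from start to 1."""
    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value

# Roughly what "for value in Countdown(3)" does for us:
iterator = iter(Countdown(3))      # calls __iter__
collected = []
while True:
    try:
        value = next(iterator)     # calls __next__
    except StopIteration:          # the loop ends quietly on this signal
        break
    collected.append(value)

print(collected)            # [3, 2, 1]
print(list(Countdown(3)))   # [3, 2, 1]
```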

But it seems a little too much code. Can we do better?

The Generator Solution

Put simply, generators give us an easy way to write iterators using the yield statement.

    def Primes(max):
        number = 1
        generated = 0
        while generated < max:
            number += 1
            if check_prime(number):
                generated+=1
                yield number

We can use the function as:

    prime_generator = Primes(10)
    for x in prime_generator:
        pass  # Process Here

It is so much simpler to read. But what is yield?

We can think of yield as a return statement, in that it hands a value back to the caller.
But when a yield happens, the state of the function is also saved in memory. So at every iteration
of the for loop, function variables like number, generated and max are kept, and execution resumes right after the yield.
So what is happening is that the above function is taking care of all the boilerplate code for us by
using the yield statement.
Much more Pythonic.
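
The pause-and-resume behavior is easiest to see by stepping through a generator by hand with next. The generator count_up_to below is a hypothetical example written for this sketch, not part of the prime-number code.

```python
def count_up_to(limit):
    """Hypothetical generator used only to show pause and resume."""
    n = 1
    while n <= limit:
        yield n      # execution pauses here; n is remembered
        n += 1       # and resumes here on the next call to next()

gen = count_up_to(2)      # nothing runs yet, we only get a generator object
first = next(gen)         # runs until the first yield
second = next(gen)        # resumes after the yield, loops, yields again

try:
    next(gen)             # the loop ends, so the generator is exhausted
    finished = False
except StopIteration:
    finished = True

print(first, second, finished)  # 1 2 True
```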

Generator Expression Solution

While not necessarily better than the previous solution, we can also use a generator expression for
the same task, although we might lose some functionality. Generator expressions look just like list
comprehensions written with parentheses, but they don’t keep the whole list in memory.

    primes = (i for i in range(2,10000000) if check_prime(i))

    for x in primes:
        pass  # do something

Functionality loss: We can generate primes till 10M, but we can’t directly ask for the first 10M primes. One can
only do so much with generator expressions.
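
One possible way around this limitation, offered as a sketch rather than as the book's method, is to combine an unbounded generator expression with itertools.count and cut it off with itertools.islice. The check_prime helper is repeated so that the snippet runs on its own.

```python
from itertools import count, islice

def check_prime(number):
    for divisor in range(2, int(number ** 0.5) + 1):
        if number % divisor == 0:
            return False
    return True

# count(2) yields 2, 3, 4, ... forever, so this expression has no upper bound.
all_primes = (i for i in count(2) if check_prime(i))

# islice stops after the requested number of items, here the first 10 primes.
first_ten = list(islice(all_primes, 10))
print(first_ten)  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```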
But generator expressions let us do some pretty cool things.
Let us say we wanted to have all Pythagorean triplets lower than 1000.
How can we get them?
Using a generator, now that we know how to use them.

    def triplet(n): # Find all the Pythagorean triplets between 1 and n
        for a in range(n):
            for b in range(a):
                for c in range(b):
                    if a*a == b*b + c*c:
                        yield(a, b, c)

We can use this as:

    triplet_generator = triplet(1000)

    for x in triplet_generator:
        print(x)

    ------------------------------------------------------------
    (5, 4, 3)
    (10, 8, 6)
    (13, 12, 5)
    (15, 12, 9)
    .....

Or, we could also have used a generator expression here:

    triplet_generator = ((a,b,c) for a in range(1000) for b in range(a) for c in range(b)
                         if a*a == b*b + c*c)

    for x in triplet_generator:
        print(x)

    ------------------------------------------------------------
    (5, 4, 3)
    (10, 8, 6)
    (13, 12, 5)
    (15, 12, 9)
    .....

Isn’t Python Beautiful?



Conclusion
We must always try to reduce the memory footprint in Python. Iterators and generators provide
us with a way to do that with Lazy evaluation.
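
A rough way to see the memory difference is to compare the size of a list with the size of the equivalent generator expression. This is only a sketch: sys.getsizeof reports the size of the container object itself, not of the integers it refers to, and the exact numbers vary between Python versions, so none are shown here.

```python
import sys

squares_list = [n * n for n in range(100_000)]   # all values stored at once
squares_gen = (n * n for n in range(100_000))    # values produced on demand

# The list grows with the number of items; the generator stays tiny.
print(sys.getsizeof(squares_list))
print(sys.getsizeof(squares_gen))

# Both give the same values when consumed.
same_total = sum(squares_gen) == sum(squares_list)
print(same_total)  # True
```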
How do we choose which one to use? What we can do with generator expressions we could have
done with generators or iterators too.
There is no correct answer here. Whenever I face such a dilemma, I always think in terms of
functionality vs readability. Generally,
Functionality wise: Iterators>Generators>Generator Expressions.
Readability wise: Iterators<Generators<Generator Expressions.
It is not necessary that you end up using them in your code now. But I guess understanding how these
things work helps mitigate some of the confusion and panic one faces whenever these constructs
come up.
Chapter 5: How and Why to use f strings in Python3?
Python provides us with many styles of coding. And with time, Python has regularly come up with
new coding standards and tools that adhere even more closely to the principles in the Zen of Python.

Beautiful is better than ugly.

And so this chapter is about using f-strings, which were introduced in Python 3.6.

3 Common Ways of Printing:


Let me explain this with a simple example. Suppose you have some variables, and you want to print
them within a statement.

    name = 'Andy'
    age = 20
    print(?)
    ----------------------------------------------------------------
    Output: I am Andy. I am 20 years old

You can do this in various ways:


a) Concatenate: A very naive way to do this is to simply use + for concatenation within the print
function. But that is clumsy. We need to convert our numeric variables to strings and take
care of the spaces while concatenating. And the code’s readability suffers a little when we
use it.

    name = 'Andy'
    age = 20
    print("I am " + name + ". I am " + str(age) + " years old")
    ----------------------------------------------------------------
    I am Andy. I am 20 years old

b) % Format: The second option is to use % formatting. But it also has its problems. For one, it is not
very readable: you have to look at each %s and find the corresponding variable in the tuple
at the end. Now imagine doing that with a long list of variables that you want to print.

    print("I am %s. I am %s years old" % (name, age))

c) str.format(): Next comes the approach used in most Python 3 code, which became
the standard way of formatting output in Python: str.format().

    print("I am {}. I am {} years old".format(name, age))

Here we use {} as a placeholder for each argument passed to format(). It still has the same
readability problem, but str.format also lets us use named placeholders:

    print("I am {name}. I am {age} years old".format(name = name, age = age))

If this seems a little too repetitive, we can use dictionaries too:



    data = {'name':'Andy','age':20}
    print("I am {name}. I am {age} years old".format(**data))

The Fourth Way with f

Since Python 3.6, we have a new formatting option, which makes it even more trivial. We could
simply use:

    print(f"I am {name}. I am {age} years old")

We just put an f in front of the string and use {} to include our variable names, and we get the
required result.
An added functionality that f strings provide is that we can put expressions inside the {} brackets. For
example:

    num1 = 4
    num2 = 5
    print(f"The sum of {num1} and {num2} is {num1+num2}.")
    ---------------------------------------------------------------
    The sum of 4 and 5 is 9.

This is quite useful as you can use any sort of expression inside these brackets. The expression can
contain dictionaries or functions. A simple example:

    def totalFruits(apples,oranges):
        return apples+oranges

    data = {'name':'Andy','age':20}

    apples = 20
    oranges = 30

    print(f"{data['name']} has {totalFruits(apples,oranges)} fruits")
    ----------------------------------------------------------------
    Andy has 50 fruits

Also, you can use ''' (triple quotes) to write multiline f strings.

    num1 = 4
    num2 = 5
    print(f'''The sum of
    {num1} and
    {num2} is
    {num1+num2}.''')

    ---------------------------------------------------------------
    The sum of
    4 and
    5 is
    9.

An everyday use case while formatting strings is formatting floats. You can do that with an f string as
follows:

    numFloat = 10.23456678
    print(f'Printing Float with 2 decimals: {numFloat:.2f}')

    -----------------------------------------------------------------
    Printing Float with 2 decimals: 10.23
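The part after the colon is a format specification, and the same mechanism covers more than decimals. A short sketch of a few other common specifiers follows; the variable names and values are made up for illustration.

```python
price = 1234567.891
ratio = 0.4567
name = "Andy"

print(f"{price:,.2f}")   # 1,234,567.89  (thousands separator, 2 decimals)
print(f"{ratio:.1%}")    # 45.7%         (percentage with 1 decimal)
print(f"[{name:>10}]")   # [      Andy]  (right-aligned in 10 characters)
print(f"[{name:<10}]")   # [Andy      ]  (left-aligned in 10 characters)
print(f"{42:05d}")       # 00042         (zero-padded integer)
```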

Conclusion
Until recently, I had been using Python 2 for all my work, and so was not able to check out this new
feature.
But now, as I am shifting to Python 3, f strings have become my go-to syntax for formatting strings. They are
easy to write and read, and they can incorporate arbitrary expressions as well. In a way, this
new feature adheres to at least three principles from the Zen of Python (PEP 20)*:

Beautiful is better than ugly, Simple is better than complex and Readability counts.

*https://www.python.org/dev/peps/pep-0020/
Chapter 6: Object Oriented Programming
Object-Oriented Programming, or OOP, can be a tough concept for beginners to understand. And
that’s mainly because it is not really explained in the right way in a lot of places. Normally, a lot of
books start explaining OOP by talking about the three big terms: Encapsulation, Inheritance
and Polymorphism. But by the time the book gets around to explaining these topics, anyone who is
just starting out would already feel lost.
So, I thought of making the concept a little easier for fellow programmers, data scientists and
Pythonistas. The way I intend to do this is by removing all the jargon and going through some examples.
I will start by explaining classes and objects. Then I will explain why classes are important in
various situations and how they solve some fundamental problems. In this way, the reader will
also be able to understand the three big terms by the end of the chapter.
This chapter is about explaining OOP the layman’s way.

What are Objects and Classes?


Put simply, everything in Python is an object and classes are a blueprint of objects. So when we
write:

    a = 2
    b = "Hello!"

We are creating an object a of class int holding the value 2, and an object b of class str holding
the value “Hello!”. In a way, these two particular classes are provided to us by default when we use
numbers or strings.
Apart from these, a lot of us end up working with classes and objects without even realizing it. For
example, you are actually using a class when you use any scikit-learn model.

    from sklearn.ensemble import RandomForestClassifier

    clf = RandomForestClassifier()
    clf.fit(X, y)  # X, y: your training features and labels

Here your classifier clf is an object, and fit is a method defined in the class RandomForestClassifier.

But Why Classes?


So, we use them a lot when we are working with Python. But why, really? What is it with classes?
Couldn’t I do the same with functions?
Yes, you can. But classes give you a lot of power compared to functions. For
example, the str class has a lot of functions defined for the object, which we can access just by
pressing Tab. One could also write all of these as standalone functions, but then they would not be
available just by pressing the Tab key.

This property of classes is called encapsulation. From Wikipedia*: encapsulation refers to the
bundling of data with the methods that operate on that data, or the restricting of direct access to
some of an object’s components.
So here the str class bundles the data (“Hello!”) with all the methods that operate on our data.
I will explain the second part of that statement by the end of the chapter. In the same way, the
RandomForestClassifier class bundles all the classifier methods (fit, predict, etc.).

*https://en.wikipedia.org/wiki/Encapsulation_(computer_programming)

Apart from this, Class usage can also help us to make the code much more modular and easy to
maintain. So say we were to create a library like Scikit-Learn. We need to create many models, and
each model will have a fit and predict method. If we don’t use classes, we will end up with a lot
of functions for each of our different models like:

    RFCFit
    RFCPredict
    SVCFit
    SVCPredict
    LRFit
    LRPredict

    and so on.

This sort of code structure is a nightmare to work with, and hence Scikit-Learn defines each
of the models as a class having the fit and predict methods.

Creating a Class
So, now that we understand why to use classes and why they are so important, how do we really go
about using them? Creating a class is pretty simple. Below is boilerplate code for any class you
will end up writing:

    class myClass:
        def __init__(self, a, b):
            self.a = a
            self.b = b

        def somefunc(self, arg1, arg2):
            # SOME CODE HERE
            pass

We see a lot of new terms here. The main ones are class, __init__ and self. So what are these?
Again, this is easily explained with an example.
Suppose you are working at a bank that has many accounts. We can create a class named Account
that would be used to work with any account. For example, below I create an elementary toy class
Account which stores data for a user — namely account_name and balance. It also provides us with
two methods to deposit/withdraw money to/from the bank account. Do read through it. It follows
the same structure as the code above.

    class Account:
        def __init__(self, account_name, balance=0):
            self.account_name = account_name
            self.balance = balance

        def deposit(self, amount):
            self.balance += amount

        def withdraw(self,amount):
            if amount <= self.balance:
                self.balance -= amount
            else:
                print("Cannot Withdraw amounts as no funds!!!")

We can create an account with a name Rahul and having an amount of 100 using:

    myAccount = Account("Rahul",100)

We can access the data for this account using:
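A minimal sketch of that attribute access, with the Account class trimmed to its constructor so the snippet runs on its own:

```python
class Account:
    def __init__(self, account_name, balance=0):
        self.account_name = account_name
        self.balance = balance


myAccount = Account("Rahul", 100)

# Attributes set in __init__ are read with dot notation.
print(myAccount.account_name)  # Rahul
print(myAccount.balance)       # 100
```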

But how are the attributes balance and account_name already set to 100 and “Rahul” respectively?
We never called the __init__ method, so why did the object get these attributes? The answer here
is that __init__ is a magic method (there are a lot of other magic methods, which I expand
on in the next chapter on magic methods) that runs whenever we create the object. So when
we create myAccount, it automatically runs the function __init__ as well.
So now we understand __init__, let us try to deposit some money into our account. We can do this
by:
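As a sketch, again with a trimmed copy of the Account class so the snippet is self-contained:

```python
class Account:
    def __init__(self, account_name, balance=0):
        self.account_name = account_name
        self.balance = balance

    def deposit(self, amount):
        self.balance += amount


myAccount = Account("Rahul", 100)

# We pass only the amount; Python supplies the object itself as self.
myAccount.deposit(100)
print(myAccount.balance)  # 200
```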

And our balance rose to 200. But did you notice that our function deposit needs two arguments,
namely self and amount, yet we only provided one, and it still works?
So, what is this self? The way I like to explain self is by calling the same function in a slightly
different way. Below, I call the same function deposit belonging to the class Account and provide it
with the myAccount object and the amount. And now the function takes two arguments, as it should.
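A sketch of that explicit call, assuming the same trimmed Account class and an account that has already received one deposit:

```python
class Account:
    def __init__(self, account_name, balance=0):
        self.account_name = account_name
        self.balance = balance

    def deposit(self, amount):
        self.balance += amount


myAccount = Account("Rahul", 100)
myAccount.deposit(100)            # the usual call; balance is now 200

# The same function called through the class: the object is passed explicitly as self.
Account.deposit(myAccount, 100)
print(myAccount.balance)          # 300
```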

And our myAccount balance increases by 100 as expected. So it is the same function we have
called. Now, that could only happen if self and myAccount are exactly the same object. When
I call myAccount.deposit(100) Python provides the same object myAccount to the function call
as the argument self. And that is why self.balance in the function definition really refers to
myAccount.balance.

But, still, some problems remain


We know how to create classes, but there is another important problem that I haven’t touched
upon yet.
So, suppose you are working with Apple’s iPhone division, and you have to create a different class
for each iPhone model. For this simple example, let us say that our iPhone’s first version currently
does only one thing, making a call, and has some memory. We can write the class as:

    class iPhone:
        def __init__(self, memory, user_id):
            self.memory = memory
            self.mobile_id = user_id

        def call(self, contactNum):
            # Some Implementation Here
            pass

Now, Apple plans to launch iPhone1 and this iPhone Model introduces a new functionality — The
ability to take a pic. One way to do this is to copy-paste the above code and create a new class
iPhone1 like:

    class iPhone1:
        def __init__(self, memory, user_id):
            self.memory = memory
            self.mobile_id = user_id
            self.pics = []

        def call(self, contactNum):
            # Some Implementation Here
            pass

        def click_pic(self):
            # Some Implementation Here
            pic_taken = ...
            self.pics.append(pic_taken)

But as you can see that is a lot of unnecessary duplication of code and Python has a solution for
removing that code duplication. One good way to write our iPhone1 class is:

    class iPhone1(iPhone):
        def __init__(self, memory, user_id):
            super().__init__(memory, user_id)
            self.pics = []

        def click_pic(self):
            # Some Implementation Here
            pic_taken = ...
            self.pics.append(pic_taken)

And that is the concept of inheritance. As per Wikipedia*: Inheritance is the mechanism of basing
an object or class upon another object or class retaining similar implementation. Simply put, iPhone1
now has access to all the variables and methods defined in class iPhone.
In this case, we don’t have to duplicate any code, as we have inherited (taken) all the methods
from our parent class iPhone. Thus we don’t have to define the call function again. Also, we don’t
have to set mobile_id and memory ourselves in the __init__ function, because the call to super() does that for us.
But what is this super().__init__(memory,user_id)?
In real life, your __init__ functions won’t be these nice two-line functions. You would need to
define a lot of variables/attributes in your class, and copy-pasting them into the child class (here
iPhone1) becomes cumbersome. Thus there exists super(). Here super().__init__() calls
the __init__ method of the parent iPhone class. So when the __init__ function of class
iPhone1 runs, it sets the memory and mobile_id attributes of the object using the __init__ function
of the parent class.
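To see the inheritance at work, here is a runnable sketch of the two classes. The method bodies (returning a string from call, storing a text label as the "picture") and the sample values are placeholder assumptions, since the original leaves the implementations open.

```python
class iPhone:
    def __init__(self, memory, user_id):
        self.memory = memory
        self.mobile_id = user_id

    def call(self, contactNum):
        # Placeholder implementation: just report the number being dialled.
        return f"Calling {contactNum}"


class iPhone1(iPhone):
    def __init__(self, memory, user_id):
        # Let the parent class set memory and mobile_id.
        super().__init__(memory, user_id)
        self.pics = []

    def click_pic(self):
        # Placeholder implementation: store a label instead of a real image.
        pic_taken = f"pic_{len(self.pics) + 1}"
        self.pics.append(pic_taken)


myphone = iPhone1(memory=64, user_id="user_001")
print(myphone.memory, myphone.mobile_id)  # 64 user_001 (set by iPhone.__init__)
print(myphone.call("555-0100"))           # Calling 555-0100 (inherited method)
myphone.click_pic()
print(myphone.pics)                       # ['pic_1']
```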
Where do we see this in ML/DS/DL? Below is how we create a PyTorch† model. This model
inherits everything from the nn.Module class and calls the __init__ function of that class using the
super call.

*https://en.wikipedia.org/wiki/Inheritance_(object-oriented_programming)
†https://mlwhiz.com/blog/2020/09/09/pytorch_guide/

    import torch.nn as nn

    class myNeuralNet(nn.Module):

        def __init__(self):
            super().__init__()
            # Define all your Layers Here
            self.lin1 = nn.Linear(784, 30)
            self.lin2 = nn.Linear(30, 10)

        def forward(self, x):
            # Connect the layer Outputs here to define the forward pass
            x = self.lin1(x)
            x = self.lin2(x)
            return x

But what is Polymorphism? We are getting better at understanding how classes work so I guess I
would try to explain Polymorphism now. Look at the below class.

    import math

    class Shape:
        def __init__(self, name):
            self.name = name

        def area(self):
            pass

        def getName(self):
            return self.name

    class Rectangle(Shape):
        def __init__(self, name, length, breadth):
            super().__init__(name)
            self.length = length
            self.breadth = breadth

        def area(self):
            return self.length*self.breadth

    class Square(Rectangle):
        def __init__(self, name, side):
            super().__init__(name,side,side)

    class Circle(Shape):
        def __init__(self, name, radius):
            super().__init__(name)
            self.radius = radius

        def area(self):
            # pi lives in the math module
            return math.pi*self.radius**2

Here we have our base class Shape and the other derived classes — Rectangle and Circle. Also, see
how we use multiple levels of inheritance in the Square class which is derived from Rectangle which
in turn is derived from Shape. Each of these classes has a function called area which is defined as
per the shape. So the concept that a function with the same name can do multiple things is made
possible through Polymorphism in Python. In fact, that is the literal meaning of Polymorphism:
“Something that takes many forms”. So here our function area takes multiple forms.
Another way that Polymorphism shows up in Python is with the isinstance function. So using the
above classes, if we do:
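A sketch of such a check, with the classes cut down to what the example needs; the object name mySquare follows the text, while the constructor arguments are made up:

```python
class Shape:
    def __init__(self, name):
        self.name = name


class Rectangle(Shape):
    def __init__(self, name, length, breadth):
        super().__init__(name)
        self.length = length
        self.breadth = breadth


class Square(Rectangle):
    def __init__(self, name, side):
        super().__init__(name, side, side)


mySquare = Square("my square", 4)

# isinstance is True for the object's own class and for every ancestor class.
print(isinstance(mySquare, Square))     # True
print(isinstance(mySquare, Rectangle))  # True
print(isinstance(mySquare, Shape))      # True
```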

Thus, the object mySquare is an instance of Square, Rectangle and Shape. And hence the
object is polymorphic. This has a lot of good properties. For example, we can create a function
that works with a Shape object, and it will work with any of the derived classes (Square,
Circle, Rectangle, etc.) by making use of Polymorphism.
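As an illustration, the sketch below defines one function, describe (a hypothetical helper, not part of the original classes), that only relies on the Shape interface and therefore works for every derived class:

```python
import math


class Shape:
    def __init__(self, name):
        self.name = name

    def area(self):
        pass

    def getName(self):
        return self.name


class Rectangle(Shape):
    def __init__(self, name, length, breadth):
        super().__init__(name)
        self.length = length
        self.breadth = breadth

    def area(self):
        return self.length * self.breadth


class Square(Rectangle):
    def __init__(self, name, side):
        super().__init__(name, side, side)


class Circle(Shape):
    def __init__(self, name, radius):
        super().__init__(name)
        self.radius = radius

    def area(self):
        return math.pi * self.radius ** 2


def describe(shape):
    # Works for any Shape: each class supplies its own area().
    return f"{shape.getName()} has area {shape.area():.2f}"


shapes = [Rectangle("rect", 2, 3), Square("sq", 4), Circle("circ", 1)]
for s in shapes:
    print(describe(s))
# rect has area 6.00
# sq has area 16.00
# circ has area 3.14
```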

Some More Info:


Why do we see function names or attribute names starting with Single and Double Underscores?
Sometimes we want to make our attributes and functions in classes private and not allow the
user to see them. This is a part of Encapsulation where we want to “restrict the direct access to
some of an object’s components”. For instance, let’s say, we don’t want to allow the user to see
the memory(RAM) of our iPhone once it is created. In such cases, we create an attribute using
underscores in variable names.
So when we create the iPhone Class in the below way, you won’t be able to access your phone memory
or the privatefunc using Tab in your ipython notebooks because the attribute is made private now
using _.
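A sketch of what such a class could look like; the attribute name _memory and the body of _privatefunc are assumptions based on the surrounding text. Keep in mind that the single underscore is a naming convention: Python itself does not block access.

```python
class iPhone:
    def __init__(self, memory, user_id):
        self._memory = memory       # leading underscore: "treat as private"
        self.mobile_id = user_id

    def _privatefunc(self):
        # Placeholder body for an internal helper.
        return "for internal use only"


myphone = iPhone(64, "user_001")

# Nothing stops us from reading these if we type the full name.
print(myphone._memory)          # 64
print(myphone._privatefunc())   # for internal use only
```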

But you would still be able to change the variable value directly (though this is not recommended), for example with myphone._memory = 1.

You would also be able to use the method _privatefunc by calling myphone._privatefunc(). If you want
to avoid that, you can use double underscores in front of the variable name. For example, below, the
call to print(myphone.__memory) throws an error. Also, you are not able to change the internal data
of an object by using myphone.__memory = 1: that assignment only creates a new, unrelated attribute,
because Python stores the real one under the mangled name _iPhone__memory.

But, as you see you can access and modify these self.__memory values in your class definition in
the function setMemory for instance.
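The double-underscore behaviour can be sketched as follows. The class layout is an assumption reconstructed from the text, and getMemory is a hypothetical helper added only so we can inspect the internal value.

```python
class iPhone:
    def __init__(self, memory, user_id):
        self.__memory = memory      # stored as _iPhone__memory (name mangling)
        self.mobile_id = user_id

    def setMemory(self, memory):
        # Inside the class, self.__memory refers to the mangled attribute.
        self.__memory = memory

    def getMemory(self):
        return self.__memory


myphone = iPhone(64, "user_001")

raised = False
try:
    print(myphone.__memory)         # not visible under this name from outside
except AttributeError:
    raised = True
print(raised)                       # True

myphone.__memory = 1                # creates a new attribute, internal data untouched
print(myphone.getMemory())          # 64

myphone.setMemory(128)              # the class's own method can modify it
print(myphone.getMemory())          # 128
```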

Conclusion
I hope this has been useful for you to understand classes. So, to summarize, in this chapter, we
learned about OOP and creating classes along with the various fundamentals of OOP:

• Encapsulation: Object contains all the data for itself.
• Inheritance: We can create a class hierarchy where methods from parent classes pass on to
child classes.
• Polymorphism: A function takes many forms, or the object might have multiple types.

To end this chapter, here is an exercise for you to implement, as I think it might clear up
some concepts for you. Create a class that lets you manage 3D objects (sphere and cube) with
volumes and surface areas. The basic boilerplate code is given below:

    import math

    class Shape3d:
        def __init__(self, name):
            self.name = name

        def surfaceArea(self):
            pass

        def volume(self):
            pass

        def getName(self):
            return self.name

    class Cuboid():
        pass

    class Cube():
        pass

    class Sphere():
        pass

The Answer is:

    import math

    class Shape3d:
        def __init__(self, name):
            self.name = name

        def surfaceArea(self):
            pass

        def volume(self):
            pass

        def getName(self):
            return self.name

    class Cuboid(Shape3d):
        def __init__(self, name, length, breadth, height):
            super().__init__(name)
            self.length = length
            self.breadth = breadth
            self.height = height

        def surfaceArea(self):
            return 2*(self.length*self.breadth + self.breadth*self.height
                      + self.length*self.height)

        def volume(self):
            return self.length*self.breadth*self.height

    class Cube(Cuboid):
        def __init__(self, name, side):
            super().__init__(name,side,side,side)

    class Sphere(Shape3d):
        def __init__(self, name, radius):
            super().__init__(name)
            self.radius = radius

        def surfaceArea(self):
            # pi lives in the math module
            return 4*math.pi*self.radius**2

        def volume(self):
            return 4/3*math.pi*self.radius**3
Chapter 7: Dunder Methods
In the last chapter, I talked about Object-Oriented Programming (OOP). And I specifically talked
about a single magic method, __init__, which is also called a constructor method in OOP
terminology.
The magic part of __init__ is that it gets called automatically whenever an object is created. But it
is by no means the only one. Python provides us with many other magic methods that we end
up using without even knowing about them. Ever used len(), print() or the [] operator on a list?
You have been using dunder methods.
In this chapter, I will talk about five of the most used magic functions, or “dunder” methods.

1. Operator Magic methods: __add__, __sub__, __mul__, __truediv__, __lt__, __gt__

In the last chapter, I talked about how everything in Python, from int and str to float, is an object.
We also learned how we could call methods on an object, like:

    Fname = "Rahul"

    # Now, we can use various methods defined in the string class using the below syntax

    Fname.lower()

But, as we know, we can also use the + operator to concatenate multiple strings.

    Lname = "Agarwal"
    print(Fname + Lname)
    ------------------------------------------------------------------
    RahulAgarwal

So, why does the addition operator work? I mean, how does the str object know what to do
when it encounters the plus sign? How do you write it in the str class? And the same + operation
happens differently in the case of integer objects. Thus the operator + behaves differently in the
case of strings and integers (fancy people call this operator overloading).

So, can we add any two objects? Let’s try to add two objects from our elementary Account class.
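A sketch of that attempt, using a trimmed Account class and made-up account values:

```python
class Account:
    def __init__(self, account_name, balance=0):
        self.account_name = account_name
        self.balance = balance


acc1 = Account("Rahul", 100)
acc2 = Account("Andy", 50)

error_message = None
try:
    acc1 + acc2
except TypeError as err:
    # Python does not know how to add two Account objects.
    error_message = str(err)

print(error_message)
# unsupported operand type(s) for +: 'Account' and 'Account'
```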

It fails as expected, since the operator + is not supported for objects of type Account. But we
can add support for + to our Account class using the magic method __add__:

    class Account:
        def __init__(self, account_name, balance=0):
            self.account_name = account_name
            self.balance = balance

        def __add__(self,acc):
            if isinstance(acc,Account):
                return self.balance + acc.balance
            raise Exception(f"{acc} is not of class Account")

Here we add a magic method __add__ to our class, which takes two arguments: self and acc. We
first check if acc is of class Account, and if it is, we return the sum of the balances when we add these
accounts. If we add anything other than an object of the Account class to an account, we
get a descriptive error. Let us try it:
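A sketch of both cases, with made-up account values:

```python
class Account:
    def __init__(self, account_name, balance=0):
        self.account_name = account_name
        self.balance = balance

    def __add__(self, acc):
        if isinstance(acc, Account):
            return self.balance + acc.balance
        raise Exception(f"{acc} is not of class Account")


acc1 = Account("Rahul", 100)
acc2 = Account("Andy", 50)

total = acc1 + acc2        # Python calls acc1.__add__(acc2)
print(total)               # 150

error_message = None
try:
    acc1 + 5               # 5 is not an Account
except Exception as err:
    error_message = str(err)
print(error_message)       # 5 is not of class Account
```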

So, we can add any two objects. In fact, we also have different magic methods for a variety of other
operators.

• __sub__ for subtraction (-)
• __mul__ for multiplication (*)
• __truediv__ for division (/)
• __eq__ for equality (==)
• __lt__ for less than (<)
• __gt__ for greater than (>)
• __le__ for less than or equal to (<=)
• __ge__ for greater than or equal to (>=)

As a running example, I will try to explain all these concepts by creating a class called Complex
to handle complex numbers. Don’t worry Complex would just be the class’s name, and I would try
to keep it as simple as possible.
So below, I create a simple method __add__ that adds two Complex numbers, or a complex number
and an int/float. It first checks if the number being added is of type int or float or Complex. Based
on the type of number being added, we do the required addition. We also use the isinstance function
to check the type of the other object. Do read the comments.

    import math

    class Complex:
        def __init__(self, re=0, im=0):
            self.re = re
            self.im = im

        def __add__(self, other):
            # If Int or Float Added, return a Complex number where float/int is added to
            # the real part
            if isinstance(other, int) or isinstance(other, float):
                return Complex(self.re + other,self.im)
            # If Complex Number added return a new complex number having a real and
            # complex part
            elif isinstance(other, Complex):
                return Complex(self.re + other.re , self.im + other.im)
            else:
                raise TypeError

It can be used as:

    a = Complex(3,4)
    b = Complex(4,5)
    print(a+b)

You should now be able to understand the code below, which allows us to add, subtract, multiply and
divide complex numbers with themselves as well as with scalars (float, int, etc.). See how these methods,
in turn, return a complex number. The code also provides the functionality to compare two
complex numbers using __eq__, __lt__ and __gt__.
You really don’t need to understand all of the complex-number maths, but I have tried to use most
of these magic methods in this particular class. Maybe just read through the __add__ and __eq__ methods.

    import math

    class Complex:
        def __init__(self, re=0, im=0):
            self.re = re
            self.im = im

        def __add__(self, other):
            if isinstance(other, int) or isinstance(other, float):
                return Complex(self.re + other,self.im)
            elif isinstance(other, Complex):
                return Complex(self.re + other.re , self.im + other.im)
            else:
                raise TypeError

        def __sub__(self, other):
            if isinstance(other, int) or isinstance(other, float):
                return Complex(self.re - other,self.im)
            elif isinstance(other, Complex):
                return Complex(self.re - other.re, self.im - other.im)
            else:
                raise TypeError

        def __mul__(self, other):
            if isinstance(other, int) or isinstance(other, float):
                return Complex(self.re * other, self.im * other)
            elif isinstance(other, Complex):
                # (a+bi)*(c+di) = ac + adi +bic -bd
                return Complex(self.re * other.re - self.im * other.im,
                               self.re * other.im + self.im * other.re)
            else:
                raise TypeError

        def __truediv__(self, other):
            if isinstance(other, int) or isinstance(other, float):
                return Complex(self.re / other, self.im / other)
            elif isinstance(other, Complex):
                x = other.re
                y = other.im
                u = self.re
                v = self.im
                repart = 1/(x**2+y**2)*(u*x + v*y)
                impart = 1/(x**2+y**2)*(v*x - u*y)
                return Complex(repart,impart)
            else:
                raise TypeError

        def value(self):
            return math.sqrt(self.re**2 + self.im**2)

        def __eq__(self, other):
            if isinstance(other, int) or isinstance(other, float):
                return self.value() == other
            elif isinstance(other, Complex):
                return self.value() == other.value()
            else:
                raise TypeError

        def __lt__(self, other):
            if isinstance(other, int) or isinstance(other, float):
                return self.value() < other
            elif isinstance(other, Complex):
                return self.value() < other.value()
            else:
                raise TypeError

        def __gt__(self, other):
            if isinstance(other, int) or isinstance(other, float):
                return self.value() > other
            elif isinstance(other, Complex):
                return self.value() > other.value()
            else:
                raise TypeError

Now we can use our Complex class as:
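A short usage sketch follows. To keep it self-contained it repeats a trimmed copy of the class with only __add__, value and __lt__; the other operators behave analogously.

```python
import math


class Complex:
    def __init__(self, re=0, im=0):
        self.re = re
        self.im = im

    def __add__(self, other):
        if isinstance(other, (int, float)):
            return Complex(self.re + other, self.im)
        elif isinstance(other, Complex):
            return Complex(self.re + other.re, self.im + other.im)
        raise TypeError

    def value(self):
        return math.sqrt(self.re**2 + self.im**2)

    def __lt__(self, other):
        if isinstance(other, (int, float)):
            return self.value() < other
        elif isinstance(other, Complex):
            return self.value() < other.value()
        raise TypeError


a = Complex(3, 4)
b = Complex(4, 5)

c = a + b
print(c.re, c.im)   # 7 9
print(a < b)        # True: |a| is 5.0 and |b| is about 6.4
print(a < 6)        # True: comparison with a plain number
print(c)            # something like <__main__.Complex object at 0x...>
```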

2. But why does the complex number print as this random string? __str__ and __repr__
Ahh! You got me. This brings us to another dunder method, __str__, which lets us use the print function
on our object. The main idea again is that when we call print(object), it calls the
__str__ method of the object. Here is how we can use that with our Complex class.

    class Complex:
        def __init__(self, re=0, im=0):
            self.re = re
            self.im = im

        # ... other methods as before ...

        def __str__(self):
            if self.im>=0:
                return f"{self.re}+{self.im}i"
            else:
                return f"{self.re}{self.im}i"

We can now recheck the output:
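A sketch of the recheck, with a trimmed copy of the class that keeps only __add__ and __str__:

```python
class Complex:
    def __init__(self, re=0, im=0):
        self.re = re
        self.im = im

    def __add__(self, other):
        return Complex(self.re + other.re, self.im + other.im)

    def __str__(self):
        if self.im >= 0:
            return f"{self.re}+{self.im}i"
        else:
            return f"{self.re}{self.im}i"


a = Complex(3, 4)
b = Complex(4, 5)

print(a + b)            # 7+9i
print(Complex(3, -4))   # 3-4i (the minus sign comes from the negative number itself)
```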

So now, our object gets printed in a better way. But still, if we try to do the below in our notebook,
the __str__ method is not called:

This is because we are not printing in the above code, and thus the __str__ method doesn’t get
called. In this case, another magic method called __repr__ gets called. We can just add this to our
class to get the same result as a print (we could also have implemented it differently). It’s a dunder
method calling a dunder method. Pretty nice!!!

    def __repr__(self):
        return self.__str__()
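A sketch of the effect, using a trimmed class with just __str__ and __repr__. Outside a notebook, repr() and container printing are convenient ways to see __repr__ in action.

```python
class Complex:
    def __init__(self, re=0, im=0):
        self.re = re
        self.im = im

    def __str__(self):
        if self.im >= 0:
            return f"{self.re}+{self.im}i"
        else:
            return f"{self.re}{self.im}i"

    def __repr__(self):
        return self.__str__()


a = Complex(3, 4)

print(repr(a))                 # 3+4i
# Containers use __repr__ for their elements.
print([a, Complex(1, -2)])     # [3+4i, 1-2i]
```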

3. I am getting you. So, is that how the len method works too? __len__
len() is another function that works with strings, lists, matrices, and whatnot. To
use it with our Complex class, we can implement the __len__ magic method. This is
not really a valid use case for complex numbers, though, since the return value of __len__ needs to
be an int as per the documentation.

    import math

    class Complex:
        def __init__(self, re=0, im=0):
            self.re = re
            self.im = im

        # ... other methods as before ...

        def __len__(self):
            # This function return type needs to be an int
            return int(math.sqrt(self.re**2 + self.im**2))

Here is its usage:
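A minimal sketch, with the class reduced to the constructor and __len__:

```python
import math


class Complex:
    def __init__(self, re=0, im=0):
        self.re = re
        self.im = im

    def __len__(self):
        # The return value must be a non-negative int.
        return int(math.sqrt(self.re**2 + self.im**2))


a = Complex(3, 4)
print(len(a))   # 5, i.e. sqrt(3**2 + 4**2) truncated to an int
```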

4. But what about the Assignment operations?


We know how the + operator works with an object. But have you thought about why the += below
works? For example:

    myStr = "This blog"
    otherStr = " is awesome"

    myStr+=otherStr
    print(myStr)

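This brings us to another set of dunder methods, called assignment dunder methods, which include
__iadd__, __isub__, __imul__, __itruediv__ and many others. (If a class does not define __iadd__,
Python falls back to __add__ for +=, which is what happens for immutable types such as str.)

So if we just add the following method __iadd__ to our class, we will be able to make assignment-based
additions too.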

    class Complex:
        # ... other methods as before ...

        def __iadd__(self, other):
            if isinstance(other, int) or isinstance(other, float):
                return Complex(self.re + other,self.im)
            elif isinstance(other, Complex):
                return Complex(self.re + other.re , self.im + other.im)
            else:
                raise TypeError

And use it as:
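A usage sketch with a trimmed copy of the class. Note that this __iadd__ returns a new Complex object, so after a += b the name a simply points at that new object.

```python
class Complex:
    def __init__(self, re=0, im=0):
        self.re = re
        self.im = im

    def __iadd__(self, other):
        if isinstance(other, (int, float)):
            return Complex(self.re + other, self.im)
        elif isinstance(other, Complex):
            return Complex(self.re + other.re, self.im + other.im)
        raise TypeError


a = Complex(3, 4)
b = Complex(4, 5)

a += b                  # calls a.__iadd__(b) and rebinds a to the result
print(a.re, a.im)       # 7 9

a += 1                  # a scalar is added to the real part only
print(a.re, a.im)       # 8 9
```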

5. Can your class object support indexing?


Sometimes objects might contain lists, and we might need to index the object to get the value from
the list. To understand this, let’s take a different example. Imagine you are a company that helps
users trade stock. Each user will have a Daily Transaction Book that will contain information about
the user’s trades/transactions over the course of the day. We can implement such a class by:

    class TransactionBook:
        def __init__(self, user_id, shares=None):
            self.user_id = user_id
            # Avoid a mutable default argument: a shared [] would leak trades across users
            self.shares = shares if shares is not None else []

        def add_trade(self, name , quantity, buySell):
            self.shares.append([name,quantity,buySell])

        def __getitem__(self, i):
            return self.shares[i]

Do you notice the __getitem__ here? This actually allows us to use indexing on objects of this
particular class using:
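A sketch of that indexing. The class is repeated here (spelled TransactionBook, with a None default for shares) so the snippet runs on its own, and the user id and trades are made-up sample data.

```python
class TransactionBook:
    def __init__(self, user_id, shares=None):
        self.user_id = user_id
        self.shares = shares if shares is not None else []

    def add_trade(self, name, quantity, buySell):
        self.shares.append([name, quantity, buySell])

    def __getitem__(self, i):
        # Delegate indexing to the underlying list of trades.
        return self.shares[i]


book = TransactionBook("user_001")
book.add_trade("AAPL", 10, "buy")
book.add_trade("MSFT", 5, "sell")

print(book[0])    # ['AAPL', 10, 'buy']
print(book[1])    # ['MSFT', 5, 'sell']
print(book[-1])   # ['MSFT', 5, 'sell'] (negative indices work because lists support them)
```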

We can get the first trade done by the user or the second one based on the index we use. This is
just a simple example, but you can set your object up to get a lot more information when you use
indexing.

Conclusion
Python is a magical language, and there are many constructs in Python that even advanced users
may not know about. Dunder methods might very well be one of them. I hope that with this chapter
you get a good glimpse of the various dunder methods that Python offers and also understand how to
implement them yourself. If you want to know about even more dunder methods, do take a look at
this blog* from Rafe Kettler.
*https://rszalski.github.io/magicmethods/#comparisons
Chapter 8: Regexes Everywhere!!!
One of the main tasks while working with text data is to create a lot of text-based features. One might
want to find certain patterns in the text, such as emails or phone numbers in a
large body of text.
While such functionality may sound fairly trivial to implement by hand, it is much simpler if we use the
power of Python’s regex module.
For example, let’s say you are tasked with finding the number of punctuation marks in a particular piece
of text. Using text from Dickens here. How do you normally go about it?
A simple enough way is to do something like:

    target = [';','.',',','–']

    string = ("It was the best of times, it was the worst of times, it was the age of "
              "wisdom, it was the age of foolishness, it was the epoch of belief, it was "
              "the epoch of incredulity, it was the season of Light, it was the season of "
              "Darkness, it was the spring of hope, it was the winter of despair, we had "
              "everything before us, we had nothing before us, we were all going direct "
              "to Heaven, we were all going direct the other way – in short, the period "
              "was so far like the present period, that some of its noisiest authorities "
              "insisted on its being received, for good or for evil, in the superlative "
              "degree of comparison only.")

    num_puncts = 0
    for punct in target:
        if punct in string:
            num_puncts+=string.count(punct)

    print(num_puncts)

    19

And that would be perfectly fine if we didn’t have the re module at our disposal. With re, it is simply 2 lines
of code:

    import re
    pattern = r"[;.,–]"
    print(len(re.findall(pattern,string)))

    19

This chapter is about some of the most commonly used regex patterns and some regex
functions I end up using regularly. If you would like to see a more interactive version of
this chapter, you can check it out on my blog*.

What is regex?
In simpler terms, a regular expression(regex) is used to find patterns in a given string.
The pattern we want to find could be anything.
We can create patterns that resemble an email or a mobile number. We can create patterns that find
words that start with a and end with z in a string.
In the above example:

import re

pattern = r'[;.,–]'
print(len(re.findall(pattern, string)))

The pattern we wanted to find was [;.,–]. This pattern matches any one of the 4 characters we
wanted to count. I find regex101† a great tool for testing patterns. This is how the pattern looks
when applied to the target string.
*https://mlwhiz.com/blog/2019/09/01/regex/
†https://regex101.com/

As we can see, we are able to find all the occurrences of ; . , and – in the target string as required.
I use the above tool whenever I need to test a regex. It is much faster than running a Python program
again and again, and much easier to debug.
So now we know that we can find patterns in a target string but how do we really create these
patterns?

Creating Patterns

The first thing we need to learn while using regex is how to create patterns.
I will go through some of the most commonly used patterns one by one.
As you would think, the simplest pattern is a simple string.

pattern = r'times'
string = "It was the best of times, it was the worst of times."
print(len(re.findall(pattern, string)))

But that is not very useful. To help with creating more complex patterns, regex provides us with
special characters/operators.

1. The [] operator
This is the one we used in our first example. It matches a single instance of any one of the
characters inside the square brackets.
[abc] - will find all occurrences of a or b or c.

[a-z] - will find all occurrences of a to z.

[a-z0-9A-Z] - will find all occurrences of a to z, A to Z and 0 to 9.



We can easily use this pattern as below in Python:

pattern = r'[a-zA-Z]'
string = "It was the best of times, it was the worst of times."
print(len(re.findall(pattern, string)))

There are other functions in the re module apart from findall, but we will get to them a little later.

2. The dot Operator


The dot operator (.) matches a single instance of any character except the newline
character.
The best part about the operators is that we can use them in conjunction with one another.
For example, suppose we want to find the substrings that start with a small d or a capital D,
end with e, and have a length of 6. Combining the [] operator with four dots gives the pattern
[dD]....e.
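As a minimal sketch of the idea above, the snippet below combines the [] operator with four dots. The sample strings are made up for illustration and are not from the Dickens passage.

```python
import re

# Hypothetical sample text, chosen so the matches are easy to check by eye
text = "decide, Double, dance"

# d or D, then any four characters, then e: six characters in total
pattern = r"[dD]....e"
matches = re.findall(pattern, text)
print(matches)  # ['decide', 'Double']

# The dot also matches a space, so a match can straddle two words
spanning = re.findall(pattern, "did he")
print(spanning)  # ['did he']
```

Note that "dance" is not matched because it is only five characters long, and that the second call shows why the dot alone is often too permissive.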

3. Some Meta Sequences


There are some patterns that we end up using again and again while using regex, so regex
provides a few shortcuts for them. The most useful shortcuts are:
\w, Matches any letter, digit or underscore. Equivalent to [a-zA-Z0-9_] for ASCII text.

\W, Matches anything other than a letter, digit or underscore.

\d, Matches any decimal digit. Equivalent to [0-9] for ASCII text.

\D, Matches anything other than a decimal digit.
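The four shortcuts can be tried side by side on a short string. This is only an illustrative sketch; the sample text is invented.

```python
import re

# Hypothetical sample text containing letters, digits, an underscore,
# a space and a punctuation mark
text = "Room 42_b!"

word_chars = re.findall(r"\w", text)
non_word_chars = re.findall(r"\W", text)
digits = re.findall(r"\d", text)
non_digits = re.findall(r"\D", text)

print(word_chars)      # ['R', 'o', 'o', 'm', '4', '2', '_', 'b']
print(non_word_chars)  # [' ', '!']
print(digits)          # ['4', '2']
print(non_digits)      # ['R', 'o', 'o', 'm', ' ', '_', 'b', '!']
```

Each uppercase shortcut is simply the complement of its lowercase counterpart.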



4. The Plus and Star operator

The dot character is used to get a single instance of any character. What if we want to find more?
The Plus character, +, signifies 1 or more instances of the character (or group) just before it.
The Star character, *, signifies 0 or more instances of the character (or group) just before it.
For example, if we want to find all substrings that start with d and end with e, with zero
or more characters between d and e, we can use: d\w*e

If we want to find out all substrings that start with d and end with e with at least one character
between d and e, we can use: d\w+e
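The difference between the two patterns shows up on a string that contains the bare substring "de". The following is a small sketch with an invented sample string.

```python
import re

# Hypothetical sample text
text = "de dye dome"

# Star: zero or more word characters are allowed between d and e
star_matches = re.findall(r"d\w*e", text)
print(star_matches)  # ['de', 'dye', 'dome']

# Plus: at least one word character is required between d and e
plus_matches = re.findall(r"d\w+e", text)
print(plus_matches)  # ['dye', 'dome']
```

Only the star version accepts "de", because nothing sits between the d and the e.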

We could also have used a more generic approach using {}


\w{n} - Repeat \w exactly n number of times.

\w{n,} - Repeat \w at least n times.

\w{n1,n2} - Repeat \w at least n1 times but no more than n2 times. (Write the range without a space after the comma.)
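To see the three forms of {} in action, the sketch below checks a handful of made-up strings against each one. It uses re.fullmatch, which succeeds only when the entire string matches the pattern; that choice is mine, made so the counts are easy to verify.

```python
import re

# Hypothetical candidate strings of increasing length
words = ["a", "ab", "abc", "abcd", "abcde"]

# Exactly three word characters
exactly_3 = [w for w in words if re.fullmatch(r"\w{3}", w)]

# Three or more word characters
at_least_3 = [w for w in words if re.fullmatch(r"\w{3,}", w)]

# Between two and four word characters (no space inside the braces)
between_2_and_4 = [w for w in words if re.fullmatch(r"\w{2,4}", w)]

print(exactly_3)        # ['abc']
print(at_least_3)       # ['abc', 'abcd', 'abcde']
print(between_2_and_4)  # ['ab', 'abc', 'abcd']
```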

5. ^ Caret Operator and $ Dollar operator.


^ matches the start of a string, and $ matches the end of the string.
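A short sketch of how the two anchors change what gets matched, reusing the sentence from the earlier examples:

```python
import re

text = "It was the best of times, it was the worst of times."

# Without anchors, both occurrences of "times" are found
all_times = re.findall(r"times", text)
print(all_times)  # ['times', 'times']

# ^ only allows a match at the very start of the string
starts_with_it = re.findall(r"^It", text)
print(starts_with_it)  # ['It']

# $ only allows a match at the very end of the string
ends_with_times = re.findall(r"times\.$", text)
print(ends_with_times)  # ['times.']
```

The dot before $ is escaped as \. so that it matches a literal full stop rather than any character.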

6. Word Boundary
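This is an important concept. Did you notice how I always matched substrings and never whole
words in the above examples?
So, what if we want to find all words that start with d? Can we use d\w* as the pattern? Not quite:
it also matches from a d in the middle of a word, so in a word like "ended" it picks up "ded".
Instead, we use the word boundary \b, which marks the position where a word starts or ends.
The pattern \bd\w* matches only the words that begin with d.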
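The effect of the word boundary can be checked directly in Python. This is a small sketch on an invented sentence.

```python
import re

# Hypothetical sample text: two words start with d, two contain d elsewhere
text = "a daring dive ended sadly"

# Without a word boundary, matches can begin in the middle of a word
without_boundary = re.findall(r"d\w*", text)
print(without_boundary)  # ['daring', 'dive', 'ded', 'dly']

# With \b, a match must begin at the start of a word
with_boundary = re.findall(r"\bd\w*", text)
print(with_boundary)  # ['daring', 'dive']
```

The pattern is written as a raw string (r"...") on purpose: in a normal string literal, "\b" would be read as a backspace character.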

Regex Functions
So far we have only used the findall function from the re module, but it supports many more
functions. Let us look at them one by one.

1. findall
We already have used findall. It is one of the regex functions I end up using most often. Let us
understand it a little more formally.
Input: Pattern and test string
Output: List of strings.

# USAGE:
import re

pattern = r'[iI]t'
string = "It was the best of times, it was the worst of times."

matches = re.findall(pattern, string)

for match in matches:
    print(match)

It
it

2. Search

Input: Pattern and test string


Output: A match object describing the first match, or None if there is no match.

# USAGE:

pattern = r'[iI]t'
string = "It was the best of times, it was the worst of times."

location = re.search(pattern, string)
print(location)

<re.Match object; span=(0, 2), match='It'>

We can get the matched text from this match object using:

print(location.group())

It
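Besides group(), the match object returned by re.search carries the position of the match. The sketch below shows a few of its methods and the None result for a pattern that is not found; the variable names are only illustrative.

```python
import re

string = "It was the best of times, it was the worst of times."

location = re.search(r"[iI]t", string)
matched_text = location.group()  # the text that matched
position = location.span()       # (start, end) indices of the match
start, end = location.start(), location.end()
print(matched_text, position, start, end)  # It (0, 2) 0 2

# search returns None when nothing matches, so check before calling group()
missing = re.search(r"xyz", string)
if missing is None:
    print("no match")
```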

3. Substitute
This is another great piece of functionality. When you work with NLP, you sometimes need to
substitute integers with X’s, or you might need to redact parts of a document. It is the regex
version of the basic find and replace found in any text editor.
Input: search pattern, replacement pattern, and the target string
Output: Substituted string

string = "It was the best of times, it was the worst of times."
string = re.sub(r'times', r'life', string)
print(string)

It was the best of life, it was the worst of life.
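The digit-masking use case mentioned above can be sketched with re.sub and the \d shortcut. The sample text is made up for illustration.

```python
import re

# Hypothetical sample text containing numbers
text = "Call 555-0199 before 10 am"

# Replace every single digit with an X
masked_digits = re.sub(r"\d", "X", text)
print(masked_digits)  # Call XXX-XXXX before XX am

# Replace each run of digits with a single X
masked_numbers = re.sub(r"\d+", "X", text)
print(masked_numbers)  # Call X-X before X am
```

Which variant is appropriate depends on whether the length of each number should stay visible after masking.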

Some Case Studies:


Regex is used in many cases when validation is required. You might have seen prompts on websites
like “This is not a valid email address”. While such a prompt could be written using multiple if and
else conditions, regex is probably the best for such use cases.

1. PAN Numbers

In India, PAN numbers are used for tax identification, much like SSNs in the US. The basic
validation criterion for a PAN is that all its letters must be uppercase and its characters must
appear in the following order:

<char><char><char><char><char><digit><digit><digit><digit><char>

So the question is:


Is ‘ABcDE1234L’ a valid PAN?
How would you normally attempt to solve this without regex? You will most probably write a for
loop and keep an index going through the string. With regex it is as simple as below:

# fullmatch succeeds only if the whole string fits the pattern
match = re.fullmatch(r'[A-Z]{5}[0-9]{4}[A-Z]', 'ABcDE1234L')
if match:
    print(True)
else:
    print(False)

False
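A check like this is often wrapped in a small helper. The sketch below assumes only the simplified format described in this section (five uppercase letters, four digits, one uppercase letter); real PAN rules may impose further constraints. The names PAN_PATTERN and is_valid_pan are hypothetical. It uses fullmatch so that extra characters around an otherwise valid number are rejected, which re.search would not do.

```python
import re

# Simplified PAN format from this section: 5 letters, 4 digits, 1 letter
PAN_PATTERN = re.compile(r"[A-Z]{5}[0-9]{4}[A-Z]")


def is_valid_pan(candidate):
    """Return True if the whole string fits the simplified PAN format."""
    return PAN_PATTERN.fullmatch(candidate) is not None


print(is_valid_pan("ABCDE1234L"))      # True
print(is_valid_pan("ABcDE1234L"))      # False: lowercase letter
print(is_valid_pan("XXABCDE1234LXX"))  # False: extra characters

# search only looks for the pattern somewhere inside the string
found_inside = PAN_PATTERN.search("XXABCDE1234LXX") is not None
print(found_inside)  # True
```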

2. Find Domain Names

Sometimes we have a large text document and need to find instances of telephone
numbers, email IDs or domain names in it.
For example, suppose you have this text:

<div class="reflist" style="list-style-type: decimal;">
<ol class="references">
<li id="cite_note-1"><span class="mw-cite-backlink"><b>^ ["Train (noun)"](http://www.askoxford.com/concise_oed/train?view=uk). <i>(definition – Compact OED)</i>. Oxford University Press<span class="reference-accessdate">. Retrieved 2008-03-18</span>.</span><span title="ctx_ver=Z39.88-2004&rfr_id=info%3Asid%2Fen.wikipedia.org%3ATrain&rft.atitle=Train+%28noun%29&rft.genre=article&rft_id=http%3A%2F%2Fwww.askoxford.com%2Fconcise_oed%2Ftrain%3Fview%3Duk&rft.jtitle=%28definition+%E2%80%93+Compact+OED%29&rft.pub=Oxford+University+Press&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal" class="Z3988"><span style="display:none;"> </span></span></span></li>
<li id="cite_note-2"><span class="mw-cite-backlink"><b>^</b></span> <span class="reference-text"><span class="citation book">Atchison, Topeka and Santa Fe Railway (1948). <i>Rules: Operating Department</i>. p. 7.</span><span title="ctx_ver=Z39.88-2004&rfr_id=info%3Asid%2Fen.wikipedia.org%3ATrain&rft.au=Atchison%2C+Topeka+and+Santa+Fe+Railway&rft.aulast=Atchison%2C+Topeka+and+Santa+Fe+Railway&rft.btitle=Rules%3A+Operating+Department&rft.date=1948&rft.genre=book&rft.pages=7&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook" class="Z3988"><span style="display:none;"> </span></span></span></li>
<li id="cite_note-3"><span class="mw-cite-backlink"><b>^ [Hydrogen trains](http://www.hydrogencarsnow.com/blog2/index.php/hydrogen-vehicles/i-hear-the-hydrogen-train-a-comin-its-rolling-round-the-bend/)</span></li>
<li id="cite_note-4"><span class="mw-cite-backlink"><b>^ [Vehicle Projects Inc. Fuel cell locomotive](http://www.bnsf.com/media/news/articles/2008/01/2008-01-09a.html)</span></li>
<li id="cite_note-5"><span class="mw-cite-backlink"><b>^</b></span> <span class="reference-text"><span class="citation book">Central Japan Railway (2006). <i>Central Japan Railway Data Book 2006</i>. p. 16.</span><span title="ctx_ver=Z39.88-2004&rfr_id=info%3Asid%2Fen.wikipedia.org%3ATrain&rft.au=Central+Japan+Railway&rft.aulast=Central+Japan+Railway&rft.btitle=Central+Japan+Railway+Data+Book+2006&rft.date=2006&rft.genre=book&rft.pages=16&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook" class="Z3988"><span style="display:none;"> </span></span></span></li>
<li id="cite_note-6"><span class="mw-cite-backlink"><b>^ ["Overview Of the existing Mumbai Suburban Railway"](http://web.archive.org/web/20080620033027/http://www.mrvc.indianrail.gov.in/overview.htm). _Official webpage of Mumbai Railway Vikas Corporation_. Archived from [the original](http://www.mrvc.indianrail.gov.in/overview.htm) on 2008-06-20<span class="reference-accessdate">. Retrieved 2008-12-11</span>.</span><span title="ctx_ver=Z39.88-2004&rfr_id=info%3Asid%2Fen.wikipedia.org%3ATrain&rft.atitle=Overview+Of+the+existing+Mumbai+Suburban+Railway&rft.genre=article&rft_id=http%3A%2F%2Fwww.mrvc.indianrail.gov.in%2Foverview.htm&rft.jtitle=Official+webpage+of+Mumbai+Railway+Vikas+Corporation&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal" class="Z3988"><span style="display:none;"> </span></span></span></li>
</ol>
</div>

And you need to find all the primary domains in this text: askoxford.com; bnsf.com;
hydrogencarsnow.com; mrvc.indianrail.gov.in; web.archive.org

How would you do this?

# string holds the text shown above
match = re.findall(r'http(s:|:)//(www\.|ww2\.|)([0-9a-z.A-Z-]*\.\w{2,3})', string)
for elem in match:
    print(elem)

(':', 'www.', 'askoxford.com')
(':', 'www.', 'hydrogencarsnow.com')
(':', 'www.', 'bnsf.com')
(':', '', 'web.archive.org')
(':', 'www.', 'mrvc.indianrail.gov.in')
(':', 'www.', 'mrvc.indianrail.gov.in')

Here | is the OR operator, and findall returns one tuple per match, containing the parts of the pattern enclosed in ().
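How the parentheses shape the result of findall is easier to see on a tiny input. The following sketch uses an invented string and simplified patterns, not the domain regex above.

```python
import re

# Hypothetical sample text with two URLs
text = "http://www.bnsf.com and https://example.org"

# No groups: findall returns the full text of each match
no_groups = re.findall(r"https?://[\w.-]+", text)
print(no_groups)  # ['http://www.bnsf.com', 'https://example.org']

# One group: findall returns only the part inside ()
one_group = re.findall(r"https?://([\w.-]+)", text)
print(one_group)  # ['www.bnsf.com', 'example.org']

# Two groups: findall returns a tuple with one entry per group
two_groups = re.findall(r"(https?)://([\w.-]+)", text)
print(two_groups)  # [('http', 'www.bnsf.com'), ('https', 'example.org')]
```

This is why the domain example prints tuples of three items: its pattern contains three pairs of parentheses.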

3. Find Email Addresses:

Below is a regex to find email addresses in a long text.

match = re.findall(r'([\w0-9-._]+@[\w0-9-.]+[\w0-9]{2,3})', string)

These are advanced examples, but if you work through them yourself, the information provided
above should be enough to understand them.
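For experimenting with email extraction, here is a small self-contained sketch. The pattern is a simplified variant I am assuming for illustration, not the one above, and it is far from a complete email validator; the addresses are invented.

```python
import re

# Hypothetical sample text
text = "Contact alice@example.com or bob.smith@mail.example.org for details."

# Local part, an @ sign, then a domain made of dot-separated labels.
# (?:...) groups without capturing, so findall returns whole matches.
email_pattern = r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"

emails = re.findall(email_pattern, text)
print(emails)  # ['alice@example.com', 'bob.smith@mail.example.org']
```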

Conclusion

While it might look a little daunting at first, regex provides a great degree of flexibility when it
comes to data manipulation, creating features and finding patterns.
I use it quite regularly when I work with text data and it can also be included while working on data
validation tasks.
I am also a fan of the regex101 tool and use it frequently to check my regexes. I wonder if I would
be using regexes as much if not for this awesome tool.
Chapter 9: Type Annotations with Python
So far, we have been using Python without specifying the types of variables. For example, we would
set a variable as a = 10 and not int a = 10. In more technical terms, Python is a dynamically
typed language, meaning that the type of a variable is determined at runtime based on the
value assigned to it. So, if a variable is assigned the value 10, Python will treat it as an integer,
while if it is assigned the value 10.5, Python will treat it as a float, and so on.
For most purposes, it works great and allows for great flexibility and simplicity in writing Python
programs. But, it can make it challenging to understand the behavior of a program, especially when
dealing with complex or large codebases.
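To address this issue, Python introduced type hints in version 3.5 (PEP 484), building on the function
annotation syntax available since Python 3.0; annotations for variables followed in version 3.6. They
allow developers to specify the type of a variable, function, or class attribute in the source code.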
In this chapter, we will take a closer look at type annotations in Python and explain how they are
used to improve the readability and reliability of your Python programs.

How do I use them?


First of all, let’s try to understand how to use them. Type annotations in Python can be specified
simply by using a colon (:) followed by the type of the variable or function being annotated. For
instance, to annotate the type of a variable, you can write:

x: int = 5

In this example, the x variable is annotated with the int type, which indicates that it should contain
an integer value. The type annotation comes after the variable name and before the assignment
operator (=), and it does not affect the value assigned to the variable. Please note that the variable
type is still determined at runtime based on the value assigned to it, but the type annotation
provides additional information about the expected type of the variable.
Similarly, to annotate the types of a function, you can write:

def add(x: int, y: int) -> int:
    return x + y

Here, we specify the types of the parameters (x and y) and the type of the return value of the add
function. The return type is specified using the arrow symbol (->).
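A short sketch can confirm that annotations are stored on the function but are not enforced by Python itself when the code runs:

```python
def add(x: int, y: int) -> int:
    return x + y


# The annotations are kept in a dictionary on the function object
hints = add.__annotations__
print(hints)  # {'x': <class 'int'>, 'y': <class 'int'>, 'return': <class 'int'>}

# Python does not check them at runtime: strings are accepted without complaint
result = add("a", "b")
print(result)  # ab
```

Catching a call like add("a", "b") is the job of a separate tool, such as a static type checker.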
There are several other ways to use type annotations in Python:

1. Class attributes
You can specify the type of a class attribute using a type annotation. For example:

class Person:
    name: str
    age: int

In this example, the name attribute is a string, and the age attribute is an integer.

2. List, tuple, and dictionary elements


You can use type annotations to specify the expected type of elements in a list, tuple, or dictionary
using the typing module.

from typing import List, Tuple, Dict

names: List[str] = ['Alice', 'Bob', 'Charlie']
coordinates: Tuple[float, float] = (0.0, 0.0)
scores: Dict[str, int] = {'Alice': 85, 'Bob': 90, 'Charlie': 95}
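On Python 3.9 and later, the built-in list, tuple and dict types can be subscripted directly, so the same annotations can be written without importing anything from typing. A minimal sketch, assuming Python 3.9 or newer:

```python
# Requires Python 3.9+: built-in collection types accept type parameters
names: list[str] = ['Alice', 'Bob', 'Charlie']
coordinates: tuple[float, float] = (0.0, 0.0)
scores: dict[str, int] = {'Alice': 85, 'Bob': 90, 'Charlie': 95}

print(names[0], coordinates, scores['Bob'])
```

The typing versions shown above continue to work and are the option to use on older Python versions.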

But why even use Type Annotations?


There are a few reasons why you might want to use type annotations in your Python code:

1. Improved readability
Type annotations can improve the readability of your code by providing explicit information about
the expected type of a variable or function. This can make it easier for other developers (or even
yourself) to understand the behavior of a program, especially when dealing with complex or large
codebases. For instance, consider the following code without type annotations:

def add(x, y):
    return x + y

In this example, the add function takes two arguments, x and y, and returns their sum. However,
without type annotations, it is not clear what type of values the x and y arguments should have, or
what type of value the add function will return. This can lead to confusion or errors, especially if
the function is called with arguments of a different type than the ones it expects.
To address this issue, we can add type annotations to the add function as follows:

def add(x: int, y: int) -> int:
    return x + y

In this example, the add function is annotated with the types of its parameters (x and y) and its return
value (int). This makes it clear that the add function expects two integer arguments and returns an
integer value.

2. Static type checking


Static type checking is the process of verifying the types of variables and expressions in a program
without actually running the code. This can be useful for finding type errors before the code is even
run, saving you time and effort in the development process.
To use static type checking in Python, you can use mypy, a static type checker that analyzes your
code and reports type errors without running it.
To use mypy, you will need to install it (for example, with pip install mypy) and then run it on your
code. mypy reports any type errors that it finds. For example, if you have a function that expects an
integer argument but is called with a string, mypy will report a type error.

def add(x: int, y: int) -> int:
    return x + y

add(1, 2)    # okay
add(1, '2')  # type error

You can then fix these type errors and re-run mypy until all of the type errors have been resolved.
Using static type checking with mypy can help you catch type errors early in the development process
and ensure that your code is correct and stable.

Advanced techniques for using type annotations in Python

Use Union to specify multiple possible types:


You can use the Union type from the typing module to indicate that a variable, argument, or return
value can be one of several possible types. For example:

from typing import Union

def add(x: Union[int, float], y: Union[int, float]) -> Union[int, float]:
    return x + y

In this example, the add function expects two arguments (x and y) that can be either integers or floats
and returns a value that can be either an integer or a float. The Union type allows you to specify
multiple possible types, which can be helpful in situations where a value can be one of several
different types.

Use Optional to specify an optional argument or return value:


You can use the Optional type from the typing module to specify that an argument or return value
may be None. Optional[X] is shorthand for Union[X, None]. For example:

from typing import Optional

def greet(name: str, greeting: Optional[str] = 'Hello') -> str:
    return f'{greeting}, {name}!'

In this example, the greet function expects a string argument (name) and a second argument
(greeting) annotated with Optional[str], which means it may be a string or None. Note that it is
the default value of 'Hello', not the Optional annotation, that allows greeting to be left out when
the function is called. Optional is most useful when None is a legitimate value, for instance as a
default meaning "not provided" or as a return value meaning "nothing found".
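A common way to use Optional is to pair it with None, either as a default value or as a "nothing found" result. The sketch below illustrates both; the function find_user and its data are hypothetical.

```python
from typing import Optional


def find_user(user_id: int) -> Optional[str]:
    """Return the user's name, or None if the id is unknown."""
    users = {1: "Alice", 2: "Bob"}  # hypothetical data
    return users.get(user_id)


def greet(name: str, greeting: Optional[str] = None) -> str:
    # None means "no greeting supplied", so fall back to a default
    if greeting is None:
        greeting = "Hello"
    return f"{greeting}, {name}!"


print(find_user(1))          # Alice
print(find_user(99))         # None
print(greet("Alice"))        # Hello, Alice!
print(greet("Alice", "Hi"))  # Hi, Alice!
```

A static type checker can then flag code that uses the result of find_user as a string without first handling the None case.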

Use Any to specify that a value can be any type:


You can use the Any type from the typing module to specify that a variable, argument, or return
value can be any type. For example:

from typing import Any

def process(x: Any):
    # Do something with x
    pass

In this example, the process function expects an argument (x) that can be any type. The Any type
is often used when a value can be any type, and it is not necessary or desirable to specify the exact
type.

Summary
• Type annotations in Python are optional and do not affect the runtime behavior of a program.
• They provide information about the type of a variable or function for static analysis tools and
for libraries that check types at runtime.
• They can improve the readability and reliability of Python programs by providing explicit
information about the expected type of a variable or function and by enabling static analysis
tools to detect potential errors.
• Type-checking tools can be used to enforce the type annotations in your code: mypy checks them
statically, before the program runs, while typeguard validates the types of your variables and
functions at runtime.

Further Reading
• The Python documentation* provides detailed information about type annotations in Python,
including the syntax, semantics, and use cases.
• The mypy documentation† explains how to use mypy for static type checking in Python.
• The typeguard documentation‡ explains how to use typeguard to validate the types of your
variables and functions at runtime.
• PEP 484§ describes the design and motivation of type annotations in Python.

That concludes this chapter on Python’s type annotations. I hope you have learned something new
and valuable about type annotations in Python.
*https://docs.python.org/3/library/typing.html
†http://mypy-lang.org/index.html
‡https://typeguard.readthedocs.io/en/latest/
§https://peps.python.org/pep-0484/