Say Goodbye To Loops in Python, and Welcome Vectorization! - by Anmol Tomar - CodeX - Medium
Published in CodeX
https://medium.com/codex/say-goodbye-to-loops-in-python-and-welcome-vectorization-e4df66615a52 1/10
7/5/23, 1:38 Say Goodbye to Loops in Python, and Welcome Vectorization! | by Anmol Tomar | CodeX | Medium
Introduction
Loops come naturally to us; we learn about them in almost every programming
language. So, by default, we reach for a loop whenever there is a repetitive
operation. But when we work with a large number of iterations (millions or billions of
rows), using loops is a crime. You might be stuck for hours, only to realize later that it
won't work. This is where implementing vectorization in Python becomes crucial.
What is Vectorization?
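Vectorization means applying an operation to an entire array or Series at once
through NumPy (or pandas) array operations, instead of handling one element at a
time in a Python for loop.
In this blog, we will look at some use cases where Python loops can easily be
replaced with vectorization. This will help you save time and become more skillful
at coding.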
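As a minimal sketch of the idea (the array `values` and the helper `double_with_loop` are illustrative names, not part of the original post), the snippet below doubles every element once with an explicit loop and once with a single array expression:

```python
import numpy as np

# hypothetical sample data, for illustration only
values = np.array([1, 2, 3, 4, 5])

def double_with_loop(arr):
    """Double each element one at a time with an explicit Python loop."""
    out = []
    for v in arr:
        out.append(v * 2)
    return np.array(out)

# vectorized: one expression applied to the whole array at once
doubled_vectorized = values * 2
doubled_loop = double_with_loop(values)

print(doubled_vectorized.tolist())  # [2, 4, 6, 8, 10]
```

Both versions give the same result; the vectorized one simply hands the per-element work to NumPy's compiled code instead of the Python interpreter.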
Use Case 1: Sum of Numbers
Using Loops
import time
start = time.time()
# iterative sum
total = 0
# iterating through 1.5 million numbers
for item in range(0, 1500000):
    total = total + item
print(total)
end = time.time()
print(end - start)
#1124999250000
#0.14 Seconds
Using Vectorization
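import numpy as np
start = time.time()
# vectorized sum over the same 1.5 million numbers
# (int64 avoids overflow where the default integer is 32-bit)
print(np.sum(np.arange(1500000, dtype=np.int64)))
end = time.time()
print(end - start)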
##1124999250000
##0.008 Seconds
Vectorization took about 18x less time to execute than the iteration using the
range function. This difference becomes more significant when working with a
pandas DataFrame.
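A single `time.time()` measurement can be noisy, so the exact speedup will vary from machine to machine. As a rough sketch, assuming the standard-library `timeit` module is acceptable for this purpose (the function names `loop_sum` and `vectorized_sum` are made up for this illustration), the comparison could be repeated a few times and the best run kept:

```python
import timeit
import numpy as np

N = 1_500_000  # same count as in the example above

def loop_sum(n):
    """Sum 0..n-1 with a plain Python loop."""
    total = 0
    for item in range(n):
        total += item
    return total

def vectorized_sum(n):
    """Sum 0..n-1 with NumPy; int64 avoids overflow on 32-bit default ints."""
    return int(np.sum(np.arange(n, dtype=np.int64)))

# run each version 3 times and keep the fastest run
loop_time = min(timeit.repeat(lambda: loop_sum(N), number=1, repeat=3))
vec_time = min(timeit.repeat(lambda: vectorized_sum(N), number=1, repeat=3))
print(f"loop: {loop_time:.4f}s, vectorized: {vec_time:.4f}s")
```

The absolute numbers will differ from the ones quoted above, but the ordering should be the same.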
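Use Case 2: Mathematical Operations on a DataFrame
In the following example, we can see how easily loops can be replaced with
vectorization when a new column is derived through a mathematical operation.
We create a pandas DataFrame with 5 million rows and 4 columns filled with
random integers from 0 to 49 (the upper bound of 50 is exclusive).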
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randint(0, 50, size=(5000000, 4)), columns=('a','b','c','d'))
df.shape
# (5000000, 4)
df.head()
We will create a new column 'ratio' holding the ratio of column 'd' to column 'c', expressed as a percentage.
Using Loops
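import time
start = time.time()
# assumed loop form: visit each row with iterrows and apply the same formula
for idx, row in df.iterrows():
    df.at[idx, 'ratio'] = 100 * (row["d"] / row["c"])
end = time.time()
print(end - start)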
Using Vectorization
start = time.time()
df["ratio"] = 100 * (df["d"] / df["c"])
end = time.time()
print(end - start)
### 0.12 seconds
We can see a significant improvement with the DataFrame: the vectorized
operation is almost 1000x faster than the Python loop.
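One thing to keep in mind: since the random integers start at 0, column 'c' can contain zeros, and the vectorized division then produces inf (or NaN for 0/0) rather than raising an error. A small sketch of one possible way to handle that, using a tiny hypothetical frame named `small` instead of the 5 million row DataFrame:

```python
import numpy as np
import pandas as pd

# small hypothetical frame with zeros in 'c', for illustration only
small = pd.DataFrame({"c": [10, 0, 0], "d": [5, 7, 0]})

# vectorized ratio: 5/10 -> 50.0, 7/0 -> inf, 0/0 -> NaN
ratio = 100 * (small["d"] / small["c"])

# one option: treat infinite ratios as missing values
small["ratio"] = ratio.replace([np.inf, -np.inf], np.nan)
print(small)
```

Whether missing values, zeros, or something else is the right replacement depends on what the ratio is used for downstream.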
Use Case 3: If-Else Statements on a DataFrame
Many operations need if-else style logic. Let's look at the following example
(we will be using the DataFrame that we created in use case 2):
Imagine we want to create a new column 'e' based on some conditions on the
existing column 'a'.
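Using Loops
import time
start = time.time()
# assumed loop form, mirroring the rules of the vectorized version
for idx, row in df.iterrows():
    if row.a == 0:
        df.at[idx, 'e'] = row.d
    elif row.a <= 25:
        df.at[idx, 'e'] = row.b - row.c
    else:
        df.at[idx, 'e'] = row.b + row.c
end = time.time()
print(end - start)
### Time taken: 177 seconds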
Using Vectorization
# using vectorization
start = time.time()
df['e'] = df['b'] + df['c']
df.loc[df['a'] <= 25, 'e'] = df['b'] - df['c']
df.loc[df['a'] == 0, 'e'] = df['d']
end = time.time()
print(end - start)
## 0.28007707595825195 sec
The vectorized operation is about 600x faster than the Python loop with
if-else statements.
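The same if-else logic can also be written in one expression. The sketch below uses NumPy's `np.select` as an alternative to the chained `.loc` assignments; it is not the approach used in this post, and the small frame `demo` is a hypothetical stand-in for the 5 million row DataFrame:

```python
import numpy as np
import pandas as pd

# small hypothetical frame with the same column names, for illustration only
demo = pd.DataFrame({"a": [0, 10, 40], "b": [5, 6, 7], "c": [1, 2, 3], "d": [9, 9, 9]})

# conditions are checked in order; the first one that matches wins
conditions = [demo["a"] == 0, demo["a"] <= 25]
choices = [demo["d"], demo["b"] - demo["c"]]

# default covers the remaining rows (a > 25)
demo["e"] = np.select(conditions, choices, default=demo["b"] + demo["c"])
print(demo["e"].tolist())  # [9, 4, 10]
```

Listing the conditions in priority order keeps the rules readable when there are more than two or three branches.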
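Use Case 4: Machine Learning Computations
Machine learning models often require evaluating the same equation over millions
of rows. For example, consider calculating the value of y for millions of rows in
the following multi-linear regression equation (c is the intercept):
y = m1*x1 + m2*x2 + m3*x3 + ... + c
The values of m1, m2, m3, ... are determined by solving the above equation using
millions of values corresponding to x1, x2, x3, ... (for simplicity, we will just
look at a simple multiplication step).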
import numpy as np
# setting initial values of m
m = np.random.rand(1,5)
Using Loops
import time
import numpy as np
m = np.random.rand(1,5)
x = np.random.rand(5000000,5)
# array to hold one result per row
zer = np.zeros(5000000)
tic = time.process_time()
for i in range(0,5000000):
    total = 0
    for j in range(0,5):
        total = total + x[i][j]*m[0][j]
    zer[i] = total
toc = time.process_time()
print("Computation time = " + str(toc - tic) + " seconds")
Using Vectorization
tic = time.process_time()
#dot product
np.dot(x,m.T)
toc = time.process_time()
print("Computation time = " + str(toc - tic) + " seconds")
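To check that the nested loop and the dot product really compute the same thing, here is a small sketch on a much smaller input (1,000 rows instead of 5 million, with a seeded generator so the run is repeatable; the names `loop_result` and `vectorized_result` are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded for repeatability
m = rng.random((1, 5))
x = rng.random((1000, 5))       # far fewer rows than the 5 million above

# nested-loop version: one multiply-add at a time
loop_result = np.zeros(x.shape[0])
for i in range(x.shape[0]):
    total = 0.0
    for j in range(x.shape[1]):
        total += x[i][j] * m[0][j]
    loop_result[i] = total

# vectorized version: a single dot product, result has shape (1000, 1)
vectorized_result = np.dot(x, m.T)

# compare after flattening the column vector
print(np.allclose(loop_result, vectorized_result.ravel()))  # True
```

Note that `np.dot(x, m.T)` returns a column of shape (rows, 1), so it needs flattening before it can be compared with the one-dimensional loop output.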
Conclusion
Vectorization in Python is very fast and should be preferred over loops whenever we
are working with very large datasets.
Start applying it, and over time you will become comfortable thinking in terms of
vectorizing your code.