The Basics of Python For Loops: A Tutorial - Learn Data Science With Dataquest
This tutorial is for Python beginners, but if you’ve never written a line of code
before, you may want to start out by working through the beginning of our free-to-
start Python Fundamentals course, as we won’t be covering basic syntax here.
(An iterable object, by the way, is any Python object we can iterate through, or
“loop” through, and return a single element at a time. Lists, for example, are
iterable and return a single list entry at a time, in the order entries are listed.
Strings are iterable and return one character at a time, in the order the characters
appear. Etc.)
You create a for loop by first defining the iterable object you’d like to loop through,
and then defining the actions you’d like to perform on each item in that iterable
object. For example, when iterating through a list, you first specify the list you’d
like to iterate through, and then specify what action you’d like to perform on each
list item.
Let’s look at a quick example: if we had a list of names stored in Python, we could
use a for loop to iterate through that list, printing each name until it reached the
end. Below, we’ll create our list of names, and then write a for loop that iterates
through it, printing each entry on the list in sequence.
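A minimal sketch of that list and loop, assuming the list is called our_list and the loop variable is called name (the names used in the rest of this tutorial), might look like this:

```python
# Create the list of names described above
our_list = ['Lily', 'Brad', 'Fatima', 'Zining']

# Loop through the list, printing each entry in sequence
for name in our_list:
    print(name)
```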
Lily
Brad
Fatima
Zining
The code in this simple loop raises a question, though: where did the variable
name come from? We haven’t defined it previously in our code! But because for
loops iterate through lists, tuples, etc. in sequence, this variable can actually be
called almost anything. Python will interpret any variable name we put in that
spot as referring to each list entry in sequence as the loop executes.
This will be the case regardless of what we call that variable. So if, for example, we
rewrite our code to replace name with x, we’ll get the exact same result:
for x in our_list:
    print(x)
Lily
Brad
Fatima
Zining
Note that this technique works with any iterable object. For example, strings are
iterable, and we can use the same sort of for loop to iterate through each
character in a string:
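As a rough sketch, looping over the string 'Lily' might look like the following. The variable name letter is only an illustrative choice; as noted above, any name would work.

```python
# Strings are iterable, so the loop visits one character at a time
for letter in 'Lily':
    print(letter)
```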
L
i
l
y
To see how for loops help when working with data, let’s take a look at a more
realistic scenario and explore a small data table that contains some US prices and
US EPA range estimates for several electric cars.
We can express this same data set as a list of lists, like so:
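A sketch of that list of lists is below. The header row and the Tesla and Hyundai rows match the output printed later in this tutorial. The Chevy Bolt numbers are not fully shown there, so the values used here are an assumption, chosen to be consistent with the averages reported further down (224.0 for range and 38945.0 for price).

```python
# Each inner list is one row of the table; every value starts out as a string
ev_data = [
    ['vehicle', 'range', 'price'],           # header row (column names)
    ['Tesla Model 3 LR', '310', '49900'],
    ['Hyundai Ioniq EV', '124', '30315'],
    ['Chevy Bolt', '238', '36620'],          # assumed values, inferred from the averages
]
```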
You may have noticed that in the list above, our range and price numbers are
actually stored as strings rather than integers. It’s not uncommon to receive
data stored this way, but for analysis, we’d want to convert those strings into
integers so we can do some calculations with them. Let’s use a for loop to iterate
through our list of lists, selecting the range entry in each list and changing it from
a string to an integer.
To do that, we need to do a few things. First, we need to skip the first row in our
table, since those are the column names and we will get an error if we attempt to
convert a non-numerical string like 'range' into an integer. We can do this using
list slicing to select each row after the first row using ev_data[1:]. (If you need to
brush up on this, or any other aspects of lists, check out our interactive course on
Python programming fundamentals.)
Then, we’ll loop through the list of lists, and for each iteration we’ll select the
element in the range column, which is the second column in our table. We’ll
assign the value found in this column to a variable called ev_range. To do this,
we’ll use the index number 1 (in Python, the first entry in an iterable is at index 0,
the second entry is at index 1, etc.).
Finally, we’ll convert the range numbers to integers using Python’s built-in int()
function, and replace the original strings with these integers in our data set.
for row in ev_data[1:]:         # loop through each row in ev_data starting with row 2 (index 1)
    ev_range = row[1]           # each car's range is found in column 2 (index 1)
    ev_range = int(ev_range)    # convert each range number from a string to an integer
    row[1] = ev_range           # assign range, which is now an integer, back to index 1 in each row
print(ev_data)
[['vehicle', 'range', 'price'], ['Tesla Model 3 LR', 310, '49900'], ['Hyundai Ioniq EV', 124, '30315'], ['Chevy Bolt'
Now that we’ve got those values stored as integers, we can also use a for loop to
do some calculations. Let’s say, for example, that we want to figure out the
average range of an EV on this list. We’d need to add the range numbers together,
and then divide them by the total number of cars in our list.
Again, we can use a for loop to select the specific column we need within our data
set. We’ll start by creating a variable called total_range where we can store the
sum of the ranges. Then we’ll write another for loop, again skipping the header
row, and again identifying the second column (index 1) as the range value.
After that, all we need to do is add this value to total_range within our for loop,
and then calculate the average by dividing total_range by the number of cars
after the loop has completed.
(Note that we’ll calculate the number of cars by counting the length of our list,
minus the header row, in the code below. With a list as short as ours, we could also
simply divide by 3, since the number of cars is very easy to count, but that would
break our calculation if additional car data was added to the list. For that reason,
it’s better to use len() to calculate the length of our car list in code so that if
additional entries are added to our data set in the future, we can re-run this code
and it will still produce the correct answer.)
total_range = 0                         # create a variable to store the sum of the ranges
for row in ev_data[1:]:                 # loop through each row in ev_data starting with row 2 (index 1)
    ev_range = row[1]                   # each car's range is found in column 2 (index 1)
    total_range += ev_range             # add this number to the number stored in total_range
number_of_cars = len(ev_data[1:])       # calculate the length of our list, minus the header row
print(total_range / number_of_cars)     # print the average range
224.0
Python for loops are powerful, and you can nest more complex instructions inside
of them. To demonstrate this, let’s repeat the above two steps for our 'price'
column, this time within a single for loop.
total_price = 0                         # create a variable to store the sum of the prices
for row in ev_data[1:]:                 # loop through each row in ev_data starting with row 2 (index 1)
    price = row[2]                      # each car's price is found in column 3 (index 2)
    price = int(price)                  # convert each price number from a string to an integer
    row[2] = price                      # assign price, which is now an integer, back to index 2 in each row
    total_price += price                # add each car's price to total_price
number_of_cars = len(ev_data[1:])       # calculate the length of our list, minus the header row
print(total_price / number_of_cars)     # print the average price
38945.0
We can also nest other elements, like if-else statements and even other for loops,
within for loops.
For example, imagine we wanted to find every car with a range of greater than 200
miles in our list. We can start by creating a new empty list to hold our long-range
car data. Then, we’ll use a for loop to iterate through ev_data, the list of lists
containing car data we created earlier, appending a car’s row to our long-range list
only if its range value is above 200:
long_range_car_list = []        # creating a new list to store our long range car data
for row in ev_data[1:]:         # iterate through ev_data, skipping the header row
    ev_range = row[1]           # assign the range number, which is at index 1 in the row, to the range variable
    if ev_range > 200:          # append the whole row to long-range list if range is higher than 200
        long_range_car_list.append(row)
print(long_range_car_list)
These operations would also be simple to perform by hand with such a tiny data
set, of course. But the same techniques work on data sets with thousands
and thousands of rows, where they can make quick work of cleaning, sorting, and
analyzing the data.
Range
For loops can be used in tandem with Python’s range() function to iterate
through each number in a specified range. For example:
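A minimal sketch of such a loop, using the range(5, 9) call that the note after the output refers to:

```python
# Count from 5 up to, but not including, 9
for x in range(5, 9):
    print(x)
```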
5
6
7
8
Note that Python doesn’t include the maximum value of a range in the range
count, which is why the number 9 doesn’t appear above. If we wanted this code to
count from 5 to 9 including 9, we’d need to change range(5, 9) to range(5,
10):
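A sketch of the adjusted loop, with the end value raised to 10 so that 9 is included:

```python
# Count from 5 up to, but not including, 10
for x in range(5, 10):
    print(x)
```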
5
6
7
8
9
If you only specify a single number in your range() function, Python will treat
that as the maximum value, and assign a default minimum value of zero:
for x in range(3):
    print(x)
0
1
2
You can even add a third argument to the range() function to specify that you’d
like to count in increments of a specific number. As you can see above, the default
value is 1, but if you add a third argument of 3, for example, you can use range()
with a for loop to count up in threes:
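A sketch of a loop that counts up in threes is below. The start value of 0 and the step of 3 follow from the text and the output; the end value of 9 is an assumption, since any end value from 7 to 9 would produce the same result.

```python
# range(start, stop, step): start at 0, stop before 9, count in steps of 3
for x in range(0, 9, 3):
    print(x)
```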
0
3
6
Break
By default, a Python for loop will loop through each possible iteration of the
iterable object you’ve assigned it. Normally when we’re using a for loop, that’s
fine, because we want to perform the same action on each item in our list (for
example).
Sometimes, though, we may want to stop our loop if a certain condition is met. In
that circumstance, the break statement is useful. When used with an if statement
inside of a for loop, break allows us to break out of that loop before its
conclusion.
Let’s take a look at a quick example first, using the list of names we created earlier
(called our_list):
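A minimal sketch of that example, assuming our_list holds the four names used earlier, with break placed before print(name) as the explanation that follows describes:

```python
our_list = ['Lily', 'Brad', 'Fatima', 'Zining']

for name in our_list:
    break          # the loop stops here on its very first iteration
    print(name)    # never runs, because break comes first
```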
When we run this code, nothing is printed. That’s because the break statement
comes before print(name) in our for loop. When Python sees break, it stops
executing the for loop, and code that appears after break in the loop doesn’t get
run.
Let’s add an if statement to this loop, so that we break out of the loop when
Python gets to the name Zining:
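A sketch of the loop with that if statement added, again assuming our_list holds the four names used earlier:

```python
our_list = ['Lily', 'Brad', 'Fatima', 'Zining']

for name in our_list:
    if name == 'Zining':   # stop the loop as soon as we reach Zining
        break
    print(name)            # only runs for names that come before Zining
```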
Lily
Brad
Fatima
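Here, we can see that the name Zining wasn’t printed. Here’s what’s happening
with each loop iteration: for Lily, Brad, and Fatima, the if condition is false, so
Python skips break and prints the name. When the loop reaches Zining, the
condition is true, so break runs and the loop ends before print(name) is reached.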
Let’s return to the code we wrote for collecting long-range EV car data and work
through one more example. We’ll insert a break statement that stops the loop as
soon as it encounters the string 'Tesla':
long_range_car_list = []        # start again with an empty list
for row in ev_data[1:]:         # iterate through ev_data as before looking for cars with a range > 200
    ev_range = row[1]
    if ev_range > 200:
        long_range_car_list.append(row)
    if 'Tesla' in row[0]:       # but if 'Tesla' appears in the vehicle column, end the loop
        break
print(long_range_car_list)
In the code above, we can see that the Tesla was still added to
long_range_car_list, because we appended it to that list before the if
statement where we used break. The Chevy Bolt was not added to our list,
because although it does have a range of more than 200 miles, break ended the
loop before Python reached the Chevy Bolt row.
(Remember, for loops execute in sequential order. If the Bolt had been listed before
the Tesla in our original data set, it would have been included in
long_range_car_list.)
Continue
When we’re looping through an iterable object like a list, we might also encounter
situations where we’d like to skip a particular row or rows. For simple situations
like skipping a header row, we can use list slicing, but if we want to skip rows
based on more complex conditions, this quickly becomes impractical. Instead, we
can use the continue statement to skip a single iteration (“loop”) of a for loop
and move to the next.
When Python sees continue while executing a for loop on a list, for example, it
will stop at that point and move on to the next item on the list. Any code that
comes below the continue will not be executed.
Let’s go back to our list of names (our_list) and use continue with an if
statement to end a loop iteration before printing if the name is 'Brad':
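A minimal sketch of that loop, assuming our_list holds the four names used earlier:

```python
our_list = ['Lily', 'Brad', 'Fatima', 'Zining']

for name in our_list:
    if name == 'Brad':   # skip the rest of this iteration for Brad
        continue
    print(name)          # runs for every name except Brad
```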
Lily
Fatima
Zining
Above, we can see that Brad’s name was skipped, and the rest of the names in our
list were printed in sequence. That illustrates the difference between break and
continue in a nutshell:
break ends the loop entirely. When Python executes break, the for loop is
over.
continue ends a specific iteration of the loop and moves to the next item in
the list. When Python executes continue it moves immediately to the next
loop iteration, but it does not end the loop entirely.
To get some more practice with continue, let’s make a list of short-range EVs,
using continue to take a slightly different approach. Instead of identifying the EVs
with less than 200 miles of range, we’ll write a for loop that adds every EV to our
short-range list, but with a continue statement before we append to the new list
that runs if the range is greater than 200:
print(short_range_car_list)
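The loop described above can be sketched as follows. To keep the block self-contained, ev_data is redefined here with ranges and prices already converted to integers. The Chevy Bolt values are an assumption, inferred from the averages computed earlier (224.0 and 38945.0).

```python
ev_data = [
    ['vehicle', 'range', 'price'],
    ['Tesla Model 3 LR', 310, 49900],
    ['Hyundai Ioniq EV', 124, 30315],
    ['Chevy Bolt', 238, 36620],          # assumed values, inferred from the averages
]

short_range_car_list = []                # new list to store short-range car data
for row in ev_data[1:]:                  # iterate through ev_data, skipping the header row
    ev_range = row[1]                    # the range is at index 1 in each row
    if ev_range > 200:                   # long-range car: skip the rest of this iteration
        continue
    short_range_car_list.append(row)     # only reached when the range is 200 or less

print(short_range_car_list)
```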
That’s probably not the most efficient and readable way to create our short-range
car list, but it does demonstrate how continue works, so let’s walk through
precisely what’s happening here.
On its first loop, Python is looking at the Tesla row. That car does have an EV range
of more than 200 miles, so Python sees the if statement is true, and executes the
continue nested inside that if statement, which makes it immediately jump to the
next row of ev_data to begin its next loop.
On the second loop, Python is looking at the next row, which is the Hyundai row.
That car has a range of under 200 miles, so Python sees that the conditional if
statement is not met, and executes the rest of the code in the for loop, appending
the Hyundai row to short_range_car_list.
On the third and final loop, Python is looking at the Chevy row. That car has a
range of more than 200 miles, which means the conditional if statement is true.
Thus, Python once again executes the nested continue, which ends that iteration.
Since there are no more rows of data in our data set, the for loop then ends
entirely.
Additional Resources
Hopefully at this point, you’re feeling comfortable with for loops in Python, and
you have an idea of how they can be useful for common data science tasks like
data cleaning, data preparation, and data analysis.
Ready to take the next step? Here are some additional resources to check out:
Advanced Python For Loops Tutorial – Learn to use for loops with NumPy,
Pandas, and other more advanced techniques in this “sequel” to this tutorial.
Python tutorials — Our ever-expanding list of Python tutorials for data
science.
Data Science Courses — Take your studies to the next level with fully
interactive programming, data science, and stats courses, right in your
browser.
Python’s official documentation on For Loops – The official documentation
doesn’t go into as much depth as this tutorial, but it does review the basics of
For Loops and explains some related concepts like While Loops.
Dataquest’s Python Fundamentals for Data Science course – Our Python
fundamentals course offers a from-scratch introduction to coding in Python
for data science. It covers lists, loops, and a whole lot more, and you can code
interactively right from within your browser.
Dataquest’s Intermediate Python for Data Science course – When you feel like
you’ve mastered For Loops and other core Python concepts, this is another
interactive course that’ll help you take your Python skills to the next level.
Free Data Sets for Practice – Practice For Loops on your own by grabbing a free
data set from one of these sources and applying your new skills to large, real-
world data sets. The data sets in the first section (for data visualization)
should work particularly well for practice projects since they should already
be relatively clean.