A Guide To Excel Spreadsheets in Python With Openpyxl
realpython.com
49-63 minutes
Watch Now This tutorial has a related video course created by the
Real Python team. Watch it together with the written tutorial to
deepen your understanding: Editing Excel Spreadsheets in
Python With openpyxl
Excel spreadsheets are one of those things you might have to deal
with at some point. Whether it’s because your boss loves them or
because marketing needs them, you might have to learn how to
work with spreadsheets, and that’s when knowing openpyxl
comes in handy!
If you ever get asked to extract some data from a database or log
file into an Excel spreadsheet, or if you often have to convert an
Excel spreadsheet into some more usable programmatic form, then
this tutorial is perfect for you. Let’s jump into the openpyxl
caravan!
First things first, when would you need to use a package like
openpyxl in a real-world scenario? You’ll see a few examples
below, but really, there are hundreds of possible scenarios where
this knowledge could come in handy.
You are responsible for tech in an online store company, and your
1 of 40 2/28/2022, 3:50 AM
A Guide to Excel Spreadsheets in Python With openpyxl about:reader?url=https%3A%2F%2Ffanyv88.com%3A443%2Fhttps%2Frealpython.com%2Fopenpyxl-excel...
boss doesn’t want to pay for a cool and expensive CMS system.
Every time they want to add new products to the online store, they
come to you with an Excel spreadsheet with a few hundred rows
and, for each of them, you have the product name, description,
price, and so forth.
Say you have a Database table where you record all your users’
information, including name, phone number, email address, and so
forth.
Now, the Marketing team wants to contact all users to give them
some discounted offer or promotion. However, they don’t have
access to the Database, or they don’t know how to use SQL to
extract that information easily.
What can you do to help? Well, you can make a quick script using
openpyxl that iterates over every single User record and puts all
the essential information into an Excel spreadsheet.
For example, using the online store scenario again, say you get an
Excel spreadsheet with a list of users and you need to append to
each row the total amount they’ve spent in your store.
Here’s a quick list of basic terms you’ll see when you’re working
with Excel spreadsheets:
Term Explanation
Now that you’re aware of the benefits of a tool like openpyxl, let’s
get down to it and start by installing the package. For this tutorial,
you should use Python 3.7 and openpyxl 2.6.2. To install the
package, you can do the following:
$ pip install openpyxl
After you install the package, you should be able to create a super
simple spreadsheet with the following code:
from openpyxl import Workbook

workbook = Workbook()
sheet = workbook.active

sheet["A1"] = "hello"
sheet["B1"] = "world!"

workbook.save(filename="hello_world.xlsx")
Let’s start with the most essential thing one can do with a
spreadsheet: read it.
Before you dive deep into some code examples, you should
download this sample dataset and store it somewhere as
sample.xlsx:
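>>> from openpyxl import load_workbook
>>> workbook = load_workbook(filename="sample.xlsx")
>>> sheet = workbook.active
>>> sheet.title
'Sheet 1'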
Now, after opening a spreadsheet, you can easily retrieve data from
it like this:
>>> sheet["A1"]
<Cell 'Sheet 1'.A1>
>>> sheet["A1"].value
'marketplace'
>>> sheet["F10"].value
"G-Shock Men's Grey Sport Watch"
You can see that the results returned are the same, no matter
which approach you choose. However, in this tutorial, you’ll
mostly be using the first approach: ["A1"].
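The other approach is index notation with .cell(). The sketch below is a minimal illustration, assuming a small in-memory workbook stands in for sample.xlsx so that it runs on its own. The variable names by_coordinate and by_index are hypothetical.

```python
from openpyxl import Workbook

# Build a tiny in-memory sheet so the example runs without sample.xlsx
workbook = Workbook()
sheet = workbook.active
sheet["F10"] = "G-Shock Men's Grey Sport Watch"

# Coordinate notation and index notation address the same cell.
# Rows and columns are 1-indexed, so column 6 is column F.
by_coordinate = sheet["F10"]
by_index = sheet.cell(row=10, column=6)

print(by_coordinate.value)
print(by_index.value)
```

Index notation tends to be handier inside loops, where the row and column numbers are already available as integers.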
This section is where you’ll learn all the different ways you can
iterate through the data, but also how to convert that data into
something usable and, more importantly, how to do it in a Pythonic
way.
There are a few different ways you can iterate through the data
depending on your needs.
You can slice the data with a combination of columns and rows:
>>> sheet["A1:C2"]
((<Cell 'Sheet 1'.A1>, <Cell 'Sheet 1'.B1>, <Cell 'Sheet 1'.C1>),
 (<Cell 'Sheet 1'.A2>, <Cell 'Sheet 1'.B2>, <Cell 'Sheet 1'.C2>))
You’ll notice that all of the above examples return a tuple. If you
want to refresh your memory on how to handle tuples in Python,
check out the article on Lists and Tuples in Python.
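Whole rows and columns can be selected with the same bracket notation. The following is a minimal sketch, assuming a small generated grid in place of sample.xlsx. The cell contents and variable names are made up for illustration.

```python
from openpyxl import Workbook

workbook = Workbook()
sheet = workbook.active

# Fill a 6x3 grid (A1:C6) so that whole rows and columns can be selected
for row_number in range(1, 7):
    sheet.append([f"A{row_number}", f"B{row_number}", f"C{row_number}"])

column_a = sheet["A"]        # every cell in column A
columns_ab = sheet["A:B"]    # one tuple of cells per column
row_5 = sheet[5]             # every cell in row 5
rows_5_6 = sheet[5:6]        # one tuple of cells per row

print(len(column_a), len(columns_ab), len(row_5), len(rows_5_6))
```

Selecting a single row or column gives a flat tuple of cells, while selecting several gives a tuple of tuples.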
You can also go through the data with Python generators. The main
methods for this are:
.iter_rows()
.iter_cols()
Both methods accept the following arguments:
min_row
max_row
min_col
max_col
You’ll notice that when iterating through the rows using
.iter_rows(), you get one tuple per row selected. When using
.iter_cols() and iterating through columns, you’ll get one tuple
per column instead.
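To make the difference concrete, here is a minimal sketch of both iterators, assuming a two-row in-memory sheet with made-up values instead of sample.xlsx. It also shows the values_only argument, which returns plain values rather than Cell objects.

```python
from openpyxl import Workbook

workbook = Workbook()
sheet = workbook.active
# Illustrative stand-in data, not taken from sample.xlsx
sheet.append(["marketplace", "customer_id", "review_id"])
sheet.append(["US", 1001, "R0001"])

# One tuple of Cell objects per row in the selected window
rows = list(sheet.iter_rows(min_row=1, max_row=2, min_col=1, max_col=3))

# One tuple of Cell objects per column in the same window
cols = list(sheet.iter_cols(min_row=1, max_row=2, min_col=1, max_col=3))

# values_only=True yields plain values instead of Cell objects
values = list(sheet.iter_rows(min_row=1, max_row=2, values_only=True))
print(values)
```

The same window yields two tuples of three cells from .iter_rows() and three tuples of two cells from .iter_cols().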
If you want to iterate through the whole dataset, then you can also
use the attributes .rows or .columns directly, which are shortcuts
to using .iter_rows() and .iter_cols() without any
arguments:
>>> for row in sheet.rows:
...     print(row)
...
<Cell 'Sheet 1'.M100>, <Cell 'Sheet 1'.N100>, <Cell 'Sheet 1'.O100>)
These shortcuts are very useful when you’re iterating through the
whole dataset.
Now that you know the basics of iterating through the data in a
workbook, let’s look at smart ways of converting that data into
Python structures.
As you saw earlier, the result from all iterations comes in the form
of tuples. However, since a tuple is nothing more than an
immutable list, you can easily access its data and transform it
into other structures.
For example, say you want to extract product information from the
sample.xlsx spreadsheet and into a dictionary where each key is
a product ID.
First of all, have a look at the headers and see what information
you care most about:
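One way to list the headers is to restrict the iteration to the first row. This is a sketch under the assumption that the header sits in row 1, as it does in sample.xlsx. The shortened header and data row below are stand-ins.

```python
from openpyxl import Workbook

workbook = Workbook()
sheet = workbook.active
# Stand-in header row; the real sample.xlsx has more columns
sheet.append(["marketplace", "customer_id", "review_id", "product_id"])
sheet.append(["US", 1001, "R0001", "B0001"])

# Restrict iteration to row 1 and take the single tuple it yields
header = next(sheet.iter_rows(min_row=1, max_row=1, values_only=True))
print(list(header))
```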
This code returns a list of all the column names you have in the
spreadsheet. To start, grab the columns with names:
product_id
product_parent
product_title
product_category
Lucky for you, the columns you need are all next to each other so
you can use the min_col and max_col arguments to easily get the
data you want:
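A minimal sketch of that selection follows, assuming the four product columns sit in positions 4 to 7 as the column mapping later in this tutorial suggests. The data row is invented for illustration.

```python
from openpyxl import Workbook

workbook = Workbook()
sheet = workbook.active
sheet.append(["marketplace", "customer_id", "review_id", "product_id",
              "product_parent", "product_title", "product_category"])
# Invented row used only to make the example runnable
sheet.append(["US", 1001, "R0001", "B0001", 42, "Example Watch", "Watches"])

# Columns 4 to 7 (D to G) hold the product fields; min_row=2 skips the header
product_rows = list(
    sheet.iter_rows(min_row=2, min_col=4, max_col=7, values_only=True)
)
print(product_rows)
```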
Nice! Now that you know how to get all the important product
information you need, let’s put that data into a dictionary:
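import json
from openpyxl import load_workbook

workbook = load_workbook(filename="sample.xlsx")
sheet = workbook.active

products = {}

# values_only=True returns plain cell values instead of Cell objects
for row in sheet.iter_rows(min_row=2, min_col=4, max_col=7, values_only=True):
    product_id = row[0]
    products[product_id] = {
        "parent": row[1],
        "title": row[2],
        "category": row[3],
    }

# json is used here only to format the output for display
print(json.dumps(products))

{
    "B00FALQ1ZC": {
        "parent": 937001370,
        "title": "Invicta Women's 15150 ...",
        "category": "Watches"
    },
    "B00D3RGO20": {
        "parent": 484010722,
        "title": "Kenneth Cole New York ...",
        "category": "Watches"
    }
}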
Here you can see that the output is trimmed to 2 products only, but
if you run the script as it is, then you should get 98 products.
To finalize the reading section of this tutorial, let’s dive into Python
classes and see how you could improve on the example above and
better structure the data.
For this, you’ll be using the new Python Data Classes that are
available from Python 3.7. If you’re using an older version of
Python, then you can use the default Classes instead.
So, first things first, let’s look at the data you have and decide what
you want to store and how you want to store it.
As you saw right at the start, this data comes from Amazon, and it’s
a list of product reviews. You can check the list of all the columns
and their meaning on Amazon.
There are two significant elements you can extract from the data
available:
1. Products
2. Reviews
A Product has:
ID
Title
Parent
Category
A Review has:
ID
Customer ID
Stars
Headline
Body
Date
You can ignore a few of the review fields to make things a bit
simpler.
import datetime
from dataclasses import dataclass

@dataclass
class Product:
    id: str
    parent: str
    title: str
    category: str

@dataclass
class Review:
    id: str
    customer_id: str
    stars: int
    headline: str
    body: str
    date: datetime.datetime
After defining your data classes, you need to convert the data from
the spreadsheet into these new structures.
Before doing the conversion, it’s worth looking at our header again
and creating a mapping between columns and the fields you need:
>>> # Print the name of each column in the header row
>>> for cell in sheet[1]:
...     print(cell.value)
marketplace
customer_id
review_id
product_id
product_parent
...
Let’s create a file mapping.py where you have a list of all the field
names and their column location (zero-indexed) on the
spreadsheet:
# Product fields
PRODUCT_ID = 3
PRODUCT_PARENT = 4
PRODUCT_TITLE = 5
PRODUCT_CATEGORY = 6
# Review fields
REVIEW_ID = 2
REVIEW_CUSTOMER = 1
REVIEW_STARS = 7
REVIEW_HEADLINE = 12
REVIEW_BODY = 13
REVIEW_DATE = 14
You don’t necessarily have to do the mapping above. It’s more for
readability when parsing the row data, so you don’t end up with a
lot of magic numbers lying around.
Finally, let’s look at the code needed to parse the spreadsheet data
into a list of product and review objects:
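from datetime import datetime
from openpyxl import load_workbook
from classes import Product, Review  # assumes the data classes above live in classes.py
from mapping import (PRODUCT_ID, PRODUCT_PARENT, PRODUCT_TITLE,
                     PRODUCT_CATEGORY, REVIEW_ID, REVIEW_CUSTOMER, REVIEW_STARS,
                     REVIEW_HEADLINE, REVIEW_BODY, REVIEW_DATE)

workbook = load_workbook(filename="sample.xlsx", read_only=True)
sheet = workbook.active

products = []
reviews = []

# values_only=True yields plain values, so each row can be indexed by position
for row in sheet.iter_rows(min_row=2, values_only=True):
    product = Product(id=row[PRODUCT_ID],
                      parent=row[PRODUCT_PARENT],
                      title=row[PRODUCT_TITLE],
                      category=row[PRODUCT_CATEGORY])
    products.append(product)

    # Assumes review_date is stored as a "YYYY-MM-DD" string
    parsed_date = datetime.strptime(row[REVIEW_DATE], "%Y-%m-%d")
    review = Review(id=row[REVIEW_ID],
                    customer_id=row[REVIEW_CUSTOMER],
                    stars=row[REVIEW_STARS],
                    headline=row[REVIEW_HEADLINE],
                    body=row[REVIEW_BODY],
                    date=parsed_date)
    reviews.append(review)

print(products[0])
print(reviews[0])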
After you run the code above, you should get some output like this:
That’s it! Now you should have the data in a very simple and
digestible class format, and you can start thinking of storing this in a
database or any other type of data storage you like.
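The most important lines for writing are the ones that create the
workbook, assign values to cells, and save the file. In the
hello_world.xlsx example from earlier, you can see that:
Workbook() creates a new, empty workbook.
sheet["A1"] = "hello" and sheet["B1"] = "world!" write values to cells.
workbook.save() shows you how to save the spreadsheet when you’re done.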
Even though these lines above can be straightforward, it’s still good
to know them well for when things get a bit more complicated.
One thing you can do to help with coming code examples is add the
following method to your Python file or console:
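A sketch of such a helper is shown below. It assumes the name print_rows(), which matches how the helper is called later in this tutorial, and that it reads from a module-level variable named sheet. The body is one plausible implementation, not necessarily the original one.

```python
from openpyxl import Workbook

workbook = Workbook()
sheet = workbook.active
sheet["A1"] = "hello"
sheet["B1"] = "world!"

def print_rows():
    """Print each row of the module-level `sheet` as a tuple of values."""
    for row in sheet.iter_rows(values_only=True):
        print(row)

print_rows()
```

Because the helper looks up sheet at call time, it keeps working after you modify the sheet or point the variable at a different worksheet.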
Before you get into the more advanced topics, it’s good for you to
know how to manage the most simple elements of a spreadsheet.
There’s another way you can do this, by first selecting a cell and
then changing its value:
>>> cell = sheet["A1"]
>>> cell.value
'hello'
>>> cell.value = "hey"
The new value is only stored into the spreadsheet once you call
workbook.save().
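The two ways of setting a value, followed by a save and reload, could look like the sketch below. The temporary directory is an assumption made so the example does not write into your working folder.

```python
import os
import tempfile
from openpyxl import Workbook, load_workbook

workbook = Workbook()
sheet = workbook.active

# Option 1: assign straight to the coordinate
sheet["A1"] = "hello"

# Option 2: select the cell first, then change its value
cell = sheet["B1"]
cell.value = "world!"

# Changes live in memory until the workbook is saved
path = os.path.join(tempfile.mkdtemp(), "hello_world.xlsx")
workbook.save(filename=path)

# Loading the file again confirms both values were written to disk
reloaded = load_workbook(filename=path).active
print(reloaded["A1"].value, reloaded["B1"].value)
```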
openpyxl creates a cell when you add a value to it, if that cell
didn’t exist before:
As you can see, when trying to add a value to cell B10, you end up
with a tuple with 10 rows, just so you can have that test value.
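The effect can be reproduced with a short sketch, assuming the same two-cell hello/world sheet used earlier. The variable names are hypothetical.

```python
from openpyxl import Workbook

workbook = Workbook()
sheet = workbook.active
sheet["A1"] = "hello"
sheet["B1"] = "world!"
rows_before = sheet.max_row

# Writing to a cell that does not exist yet creates it
sheet["B10"] = "test"
rows_after = sheet.max_row

# Iterating now covers rows 1 to 10, with None for the empty cells
all_rows = tuple(sheet.iter_rows(values_only=True))
print(rows_before, rows_after, all_rows[-1])
```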
You can insert and delete rows and columns with the following methods:
.insert_rows()
.delete_rows()
.insert_cols()
.delete_cols()
Each of these methods can receive two arguments:
1. idx
2. amount
>>> print_rows()
('hello', 'world!')
The only thing you need to remember is that when inserting new
data (rows or columns), the insertion happens before the idx
parameter.
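The sketch below walks through the four methods on the hello/world sheet. The rows() helper and the after_* variable names are hypothetical and exist only to capture the sheet contents after each step.

```python
from openpyxl import Workbook

workbook = Workbook()
sheet = workbook.active
sheet["A1"] = "hello"
sheet["B1"] = "world!"

def rows():
    """Return the current sheet contents as a list of value tuples."""
    return list(sheet.iter_rows(values_only=True))

# Insert one column before the existing column 1 (A)
sheet.insert_cols(idx=1)
after_insert_col = rows()

# Delete that column again
sheet.delete_cols(idx=1)
after_delete_col = rows()

# Insert three rows before the existing row 1
sheet.insert_rows(idx=1, amount=3)
after_insert_rows = rows()

# Delete those three rows again
sheet.delete_rows(idx=1, amount=3)
after_delete_rows = rows()

print(after_insert_col, after_insert_rows)
```

Since amount defaults to 1, you only need to pass it when inserting or deleting more than one row or column.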
Managing Sheets
If you look back at the code examples from this tutorial, you’ll notice
the following recurring piece of code:
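sheet = workbook.active
This is the way to select the default sheet from a workbook. When a
workbook has several sheets, you can also select one by its title and
rename it:
>>> workbook.sheetnames
['Products', 'Company Sales']
>>> products_sheet = workbook["Products"]
>>> products_sheet.title = "New Products"
>>> workbook.sheetnames
['New Products', 'Company Sales']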
If you want to create or delete sheets, then you can also do that
with .create_sheet() and .remove():
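>>> workbook.sheetnames
['Products', 'Company Sales']
>>> operations_sheet = workbook.create_sheet("Operations")
>>> workbook.sheetnames
['Products', 'Company Sales', 'Operations']
>>> # You can also define the position to create the sheet at
>>> hr_sheet = workbook.create_sheet("HR", 0)
>>> workbook.sheetnames
['HR', 'Products', 'Company Sales', 'Operations']
>>> # To remove a sheet, pass it as an argument to .remove()
>>> workbook.remove(operations_sheet)
>>> workbook.sheetnames
['HR', 'Products', 'Company Sales']
>>> workbook.remove(hr_sheet)
>>> workbook.sheetnames
['Products', 'Company Sales']
You can also duplicate a sheet with .copy_worksheet():
>>> workbook.sheetnames
['Products', 'Company Sales']
>>> products_sheet = workbook["Products"]
>>> workbook.copy_worksheet(products_sheet)
<Worksheet "Products Copy">
>>> workbook.sheetnames
['Products', 'Company Sales', 'Products Copy']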
If you open your spreadsheet after saving the above code, you’ll
notice that the sheet Products Copy is a duplicate of the sheet
Products.
>>> workbook = load_workbook(filename="sample.xlsx")
>>> sheet = workbook.active
>>> sheet.freeze_panes = "C2"
>>> workbook.save("sample_frozen.xlsx")
Notice how you’re at the end of the spreadsheet, and yet, you can
see both row 1 and columns A and B.
Adding Filters
At first, this might seem like a pretty useless feature, but when
you’re programmatically creating a spreadsheet that is going to be
sent to and used by somebody else, it’s still nice to at least create the
filters and allow people to use them afterward.
The code below is an example of how you would add some filters to
our existing sample.xlsx spreadsheet:
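A minimal sketch of adding filters follows. It assumes the filter should cover the whole used range, which .dimensions reports, and it uses a small in-memory sheet with made-up rows in place of sample.xlsx.

```python
from openpyxl import Workbook

workbook = Workbook()
sheet = workbook.active
# Made-up rows standing in for the sample dataset
sheet.append(["product_id", "star_rating"])
sheet.append(["B0001", 5])
sheet.append(["B0002", 2])

# .dimensions reports the used range, a convenient range for the filter
used_range = sheet.dimensions
sheet.auto_filter.ref = used_range
print(used_range)
```

With the real sample.xlsx, the same two lines would apply the filter to whatever range the data occupies before you save the workbook.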
You should now see the filters created when opening the
spreadsheet in your editor:
Adding Formulas
Starting with something easy, let’s check the average star rating for
the 99 reviews within the spreadsheet:
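Writing a formula amounts to assigning a string that starts with an equals sign. The sketch below assumes a tiny in-memory column of ratings, so the cell references differ from the H2:H100 range in sample.xlsx.

```python
from openpyxl import Workbook
from openpyxl.utils import FORMULAE

workbook = Workbook()
sheet = workbook.active
sheet.append(["star_rating"])
for stars in (5, 4, 2):
    sheet.append([stars])

# FORMULAE holds the formula names that openpyxl recognises
print("AVERAGE" in FORMULAE)

# A formula is written as a plain string starting with "="
sheet["B2"] = "=AVERAGE(A2:A4)"
print(sheet["B2"].value)
```

openpyxl stores the formula text as is. The result is calculated by the spreadsheet application when the file is opened.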
If you open the spreadsheet now and go to cell P2, you should see
that its value is: 4.18181818181818. Have a look in the editor:
You can use the same methodology to add any formulas to your
spreadsheet. For example, let’s count the number of reviews that
had helpful votes:
>>> sheet["P3"] = '=COUNTIF(I2:I100, ">0")'
You should get the number 21 on your P3 spreadsheet cell like so:
You’ll have to make sure that the strings within a formula are
always in double quotes, so you either have to use single quotes
around the formula like in the example above or you’ll have to
escape the double quotes inside the formula:
"=COUNTIF(I2:I100, \">0\")".
There are a ton of other formulas you can add to your spreadsheet
using the same procedure you tried above. Give it a go yourself!
Adding Styles
>>> from openpyxl.styles import Border, Side
>>> double_border_side = Side(border_style="double")
>>> square_border = Border(top=double_border_side,
...                        right=double_border_side,
...                        bottom=double_border_side,
...                        left=double_border_side)
If you open your spreadsheet now, you should see quite a few
different styles on the first 5 cells of column A:
You can also combine styles by simply adding them to the cell at
the same time:
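The sketch below applies one style per cell and then several styles to a single cell. Apart from double_border_side and square_border, the style names and the cells chosen are assumptions made for illustration.

```python
from openpyxl import Workbook
from openpyxl.styles import Alignment, Border, Font, Side

workbook = Workbook()
sheet = workbook.active
for row_number in range(1, 7):
    sheet.cell(row=row_number, column=1, value=f"row {row_number}")

# Hypothetical style objects for the demonstration
bold_font = Font(bold=True)
big_red_text = Font(color="00FF0000", size=20)
center_aligned_text = Alignment(horizontal="center")
double_border_side = Side(border_style="double")
square_border = Border(top=double_border_side, right=double_border_side,
                       bottom=double_border_side, left=double_border_side)

# One style per cell
sheet["A2"].font = bold_font
sheet["A3"].font = big_red_text
sheet["A4"].alignment = center_aligned_text
sheet["A5"].border = square_border

# Several styles combined on the same cell
sheet["A6"].alignment = center_aligned_text
sheet["A6"].font = big_red_text
sheet["A6"].border = square_border
```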
When you want to apply multiple styles to one or several cells, you
can use a NamedStyle class instead, which is like a style template
that you can use over and over again. Have a look at the example
below:
>>> workbook.save(filename="sample_styles.xlsx")
If you open the spreadsheet now, you should see that its first row is
bold, the text is aligned to the center, and there’s a small bottom
border! Have a look below:
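A header style like the one described could be sketched as follows. The style name "header" and the in-memory sample rows are assumptions, and the exact attribute values are one plausible choice for a bold, centered header with a thin bottom border.

```python
from openpyxl import Workbook
from openpyxl.styles import Alignment, Border, Font, NamedStyle, Side

workbook = Workbook()
sheet = workbook.active
# Made-up rows standing in for sample.xlsx
sheet.append(["product_id", "product_title", "star_rating"])
sheet.append(["B0001", "Example Watch", 5])

# Bundle font, border, and alignment into one reusable template
header = NamedStyle(name="header")
header.font = Font(bold=True)
header.border = Border(bottom=Side(border_style="thin"))
header.alignment = Alignment(horizontal="center", vertical="center")

# sheet[1] is the first row; apply the template to each of its cells
for cell in sheet[1]:
    cell.style = header

print(sheet["A1"].style, sheet["A2"].style)
```

Assigning the NamedStyle object to a cell registers it with the workbook, so the same template can be reused across sheets.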
Conditional Formatting
You can start by adding a simple one that adds a red background to
all reviews with less than 3 stars:
>>> from openpyxl.styles import PatternFill
>>> from openpyxl.styles.differential import DifferentialStyle
>>> from openpyxl.formatting.rule import Rule
>>> red_background = PatternFill(fgColor="00FF0000")
>>> diff_style = DifferentialStyle(fill=red_background)
>>> rule = Rule(type="expression", dxf=diff_style)
>>> rule.formula = ["$H1<3"]
>>> sheet.conditional_formatting.add("A1:O100", rule)
>>> workbook.save("sample_conditional_formatting.xlsx")
Now you’ll see all the reviews with a star rating below 3 marked
with a red background:
Code-wise, the only things that are new here are the objects
DifferentialStyle and Rule:
DifferentialStyle groups the styles, such as the fill used here, that
are applied when a rule matches.
Rule is responsible for selecting the cells and applying the styles if
the cells match the rule’s logic.
openpyxl also offers three built-in formats for conditional formatting:
ColorScale
IconSet
DataBar
You can also add a third color and make two gradients instead:
This time, you’ll notice that star ratings between 1 and 3 have a
gradient from red to yellow, and star ratings between 3 and 5 have
a gradient from yellow to green:
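Both color scale variants can be sketched with ColorScaleRule. The example assumes a five-row in-memory column of ratings instead of column H in sample.xlsx, and the color codes are illustrative choices for red, yellow, and green.

```python
from openpyxl import Workbook
from openpyxl.formatting.rule import ColorScaleRule

workbook = Workbook()
sheet = workbook.active
sheet.append(["star_rating"])
for stars in (1, 2, 3, 4, 5):
    sheet.append([stars])

# Two colors: a single gradient from red (lowest) to green (highest)
two_color_rule = ColorScaleRule(start_type="min", start_color="00FF0000",
                                end_type="max", end_color="0000FF00")

# Three colors: red to yellow up to 3 stars, yellow to green above that
three_color_rule = ColorScaleRule(
    start_type="num", start_value=1, start_color="00FF0000",
    mid_type="num", mid_value=3, mid_color="00FFFF00",
    end_type="num", end_value=5, end_color="0000FF00",
)

# Only the three-color rule is attached here, to keep the result unambiguous
sheet.conditional_formatting.add("A2:A6", three_color_rule)
```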
The IconSet allows you to add an icon to the cell according to its
value:
>>> from openpyxl.formatting.rule import IconSetRule
>>> icon_set_rule = IconSetRule("5Arrows", "num", [1, 2, 3, 4, 5])
>>> sheet.conditional_formatting.add("H2:H100", icon_set_rule)
>>> workbook.save("sample_conditional_formatting_icon_set.xlsx")
You’ll see a colored arrow next to the star rating. This arrow is red
and points down when the value of the cell is 1 and, as the rating
gets better, the arrow starts pointing up and becomes green:
The openpyxl package has a full list of other icons you can use,
besides the arrow.
You’ll now see a green progress bar that gets fuller the closer the
star rating is to the number 5:
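A data bar like this can be sketched with DataBarRule. The example assumes ratings from 1 to 5 in a small in-memory column, and the green color code is an illustrative choice.

```python
from openpyxl import Workbook
from openpyxl.formatting.rule import DataBarRule

workbook = Workbook()
sheet = workbook.active
sheet.append(["star_rating"])
for stars in (1, 2, 3, 4, 5):
    sheet.append([stars])

# The bar is empty at 1 star and full at 5 stars
data_bar_rule = DataBarRule(start_type="num", start_value=1,
                            end_type="num", end_value=5,
                            color="0000FF00")
sheet.conditional_formatting.add("A2:A6", data_bar_rule)
```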
As you can see, there are a lot of cool things you can do with
conditional formatting.
Here, you saw only a few examples of what you can achieve with it,
but check the openpyxl documentation to see a bunch of other
options.
Adding Images
Even though images are not something that you’ll often see in a
spreadsheet, it’s quite cool to be able to add them. Maybe you can
use it for branding purposes or to make spreadsheets more
personal.
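To load images into a spreadsheet, openpyxl relies on the Pillow
library, so install it first:
$ pip install Pillow
Apart from that, you’ll also need an image. For this example, you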
can grab the Real Python logo below and convert it from .webp to
.png using an online converter such as cloudconvert.com, save
the final file as logo.png, and copy it to the root folder where
you’re running your examples:
Afterward, this is the code you need to import that image into the
hello_world.xlsx spreadsheet:
from openpyxl import load_workbook
from openpyxl.drawing.image import Image

workbook = load_workbook(filename="hello_world.xlsx")
sheet = workbook.active

logo = Image("logo.png")
sheet.add_image(logo, "A3")
workbook.save(filename="hello_world_logo.xlsx")
The image’s top left corner is on the cell you chose, in this case,
A3.
For any chart you want to build, you’ll need to define the chart type:
BarChart, LineChart, and so forth, plus the data to be used for
the chart, which is called Reference.
Before you can build your chart, you need to define what data you
want to see represented in it. Sometimes, you can use the dataset
as is, but other times you need to massage the data a bit to get
additional information.
from openpyxl import Workbook
from openpyxl.chart import BarChart, Reference

workbook = Workbook()
sheet = workbook.active

# Let's create some sample sales data
rows = [
    ["Product", "Online", "Store"],
    [1, 30, 45],
    [2, 40, 30],
    [3, 40, 25],
    [4, 50, 30],
    [5, 30, 25],
    [6, 25, 35],
    [7, 20, 40],
]

for row in rows:
    sheet.append(row)
Now you’re going to start by creating a bar chart that displays the
total number of sales per product:
chart = BarChart()
data = Reference(worksheet=sheet,
                 min_row=1,
                 max_row=8,
                 min_col=2,
                 max_col=3)

chart.add_data(data, titles_from_data=True)
sheet.add_chart(chart, "E2")

workbook.save("chart.xlsx")
There you have it. Below, you can see a very straightforward bar
chart showing the difference between online product sales and
in-store product sales:
Like with images, the top left corner of the chart is on the cell you
added the chart to. In your case, it was on cell E2.
import random
from openpyxl import Workbook
from openpyxl.chart import LineChart, Reference

workbook = Workbook()
sheet = workbook.active

# Let's create some sample sales data
rows = [
    ["", "January", "February", "March", "April",
     "May", "June", "July", "August", "September",
     "October", "November", "December"],
    [1, ],
    [2, ],
    [3, ],
]

for row in rows:
    sheet.append(row)

for row in sheet.iter_rows(min_row=2,
                           max_row=4,
                           min_col=2,
                           max_col=13):
    for cell in row:
        cell.value = random.randrange(5, 100)
With the above code, you’ll be able to generate some random data
regarding the sales of 3 different products across a whole year.
Once that’s done, you can very easily create a line chart with the
following code:
chart = LineChart()
data = Reference(worksheet=sheet,
                 min_row=2,
                 max_row=4,
                 min_col=1,
                 max_col=13)

chart.add_data(data, from_rows=True, titles_from_data=True)
sheet.add_chart(chart, "C6")

workbook.save("line_chart.xlsx")
One thing to keep in mind here is the fact that you’re using
from_rows=True when adding the data. This argument makes
the chart plot row by row instead of column by column.
In your sample data, you see that each product has a row with 12
values (1 column per month). That’s why you use from_rows. If
you don’t pass that argument, by default, the chart tries to plot by
column, and you’ll get a month-by-month comparison of sales.
There are a couple of other things you can also change regarding
the style of the chart. For example, you can add specific categories
to the chart:
cats = Reference(worksheet=sheet,
                 min_row=1,
                 max_row=1,
                 min_col=2,
                 max_col=13)
chart.set_categories(cats)
Add this piece of code before saving the workbook, and you should
see the month names appearing instead of numbers.
You can also improve readability by giving the axes titles through
the x_axis and y_axis attributes:
chart.x_axis.title = "Months"
chart.y_axis.title = "Sales (per unit)"
As you can see, small changes like the above make reading your
chart a much easier and quicker task.
Charts also have a style attribute that selects a preset look. With
the preset chosen for this tutorial, all lines have some shade of orange:
Here’s the full code used to generate the line chart with categories,
axis titles, and style:
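import random
from openpyxl import Workbook
from openpyxl.chart import LineChart, Reference

workbook = Workbook()
sheet = workbook.active

# Let's create some sample sales data
rows = [
    ["", "January", "February", "March", "April",
     "May", "June", "July", "August", "September",
     "October", "November", "December"],
    [1, ],
    [2, ],
    [3, ],
]

for row in rows:
    sheet.append(row)

for row in sheet.iter_rows(min_row=2, max_row=4, min_col=2, max_col=13):
    for cell in row:
        cell.value = random.randrange(5, 100)

# Create a LineChart and add the main data
chart = LineChart()
data = Reference(worksheet=sheet, min_row=2, max_row=4, min_col=1, max_col=13)
chart.add_data(data, titles_from_data=True, from_rows=True)

# Add categories to the chart
cats = Reference(worksheet=sheet, min_row=1, max_row=1, min_col=2, max_col=13)
chart.set_categories(cats)

# Rename the X and Y axes
chart.x_axis.title = "Months"
chart.y_axis.title = "Sales (per unit)"

# Pick a preset style (the number used here is an assumption)
chart.style = 24

# Save!
sheet.add_chart(chart, "C6")
workbook.save("line_chart.xlsx")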
There are a lot more chart types and customization you can apply,
so be sure to check out the package documentation on this if you
need some specific formatting.
Let’s imagine you have a database and are using some
Object-Relational Mapping (ORM) to map database objects into Python
classes. Let’s assume the following data classes, saved in
db_classes.py, represent that data:
from dataclasses import dataclass
from typing import List

@dataclass
class Sale:
    quantity: int

@dataclass
class Product:
    id: str
    name: str
    sales: List[Sale]
import random

# Ignore these for now. You'll use them in a sec ;)
from openpyxl import Workbook
from openpyxl.chart import LineChart, Reference

from db_classes import Product, Sale

products = []

# Let's create 5 products
for idx in range(1, 6):
    sales = []

    # Create 5 months of sales
    for _ in range(5):
        sale = Sale(quantity=random.randrange(5, 100))
        sales.append(sale)

    product = Product(id=str(idx),
                      name="Product %s" % idx,
                      sales=sales)
    products.append(product)
Now, to convert this into a spreadsheet, you need to iterate over the
data and append it to the spreadsheet:
workbook = Workbook()
sheet = workbook.active

# Append column names first
sheet.append(["Product ID", "Product Name", "Month 1",
              "Month 2", "Month 3", "Month 4", "Month 5"])

# Append the data
for product in products:
    data = [product.id, product.name]
    for sale in product.sales:
        data.append(sale.quantity)
    sheet.append(data)
That’s it. That should allow you to create a spreadsheet with some
data coming from your database.
However, why not use some of that cool knowledge you gained
recently to add a chart as well to display that data more visually?
chart = LineChart()
data = Reference(worksheet=sheet,
                 min_row=2,
                 max_row=6,
                 min_col=2,
                 max_col=7)

chart.add_data(data, titles_from_data=True, from_rows=True)
sheet.add_chart(chart, "B8")

cats = Reference(worksheet=sheet,
                 min_row=1,
                 max_row=1,
                 min_col=3,
                 max_col=7)
chart.set_categories(cats)

chart.x_axis.title = "Months"
chart.y_axis.title = "Sales (per unit)"

workbook.save(filename="oop_sample.xlsx")
Even though you can use Pandas to handle Excel files, there are a
few things that you either can’t accomplish with Pandas or that
you’d be better off just using openpyxl directly.
But guess what, you don’t have to worry about picking. In fact,
openpyxl has support for both converting data from a Pandas
DataFrame into a workbook and the opposite, converting an
openpyxl workbook into a Pandas DataFrame.
import pandas as pd

data = {
    "Product Name": ["Product 1", "Product 2"],
    "Sales Month 1": [10, 20],
    "Sales Month 2": [5, 35],
}
df = pd.DataFrame(data)
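One way to move such a DataFrame into a workbook is the dataframe_to_rows() helper from openpyxl.utils.dataframe. The sketch below assumes you want the column names as the first row and no index column.

```python
import pandas as pd
from openpyxl import Workbook
from openpyxl.utils.dataframe import dataframe_to_rows

df = pd.DataFrame({
    "Product Name": ["Product 1", "Product 2"],
    "Sales Month 1": [10, 20],
    "Sales Month 2": [5, 35],
})

workbook = Workbook()
sheet = workbook.active

# dataframe_to_rows yields the header first, then one list per DataFrame row
for row in dataframe_to_rows(df, index=False, header=True):
    sheet.append(row)

rows = list(sheet.iter_rows(values_only=True))
print(rows)
```

From here, the workbook can be styled or charted like any other and then saved with workbook.save().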
import pandas as pd
from openpyxl import load_workbook
workbook = load_workbook(filename="sample.xlsx")
sheet = workbook.active
values = sheet.values
df = pd.DataFrame(values)
Alternatively, if you want to add the correct headers and use the
review ID as the index, for example, then you can also do it like this
instead:
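import pandas as pd
from openpyxl import load_workbook
from mapping import REVIEW_ID

workbook = load_workbook(filename="sample.xlsx")
sheet = workbook.active

data = sheet.values

# Use the first row as the column names of the DataFrame
cols = next(data)
data = list(data)

# Use the review_id field as the index of each row
idx = [row[REVIEW_ID] for row in data]

df = pd.DataFrame(data, index=idx, columns=cols)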
Using indexes and columns allows you to access data from your
DataFrame easily:
>>> df.columns
Index(['marketplace', 'customer_id', 'review_id', 'product_id',
       'product_parent', 'product_title', 'product_category', 'star_rating',
       'helpful_votes', 'total_votes', 'vine', 'verified_purchase',
       'review_headline', 'review_body', 'review_date'],
      dtype='object')
There you go, whether you want to use openpyxl to prettify your
Pandas dataset or use Pandas to do some hardcore algebra, you
now know how to switch between both packages.
Conclusion
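Phew, after that long read, you now know how to work with
spreadsheets in Python! You can rely on openpyxl, your
trustworthy companion, to read data from existing spreadsheets,
create and edit your own, and add features such as formulas, styles,
conditional formatting, images, and charts.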
There are a few other things you can do with openpyxl that might
not have been covered in this tutorial, but you can always check the
package’s official documentation website to learn more about it.
You can even venture into checking its source code and improving
the package further.
Feel free to leave any comments below if you have any questions,
or if there’s any section you’d love to hear more about.