Report For Project Python
Love it or loathe it, PowerPoint is widely used in most business settings. This
article will not debate the merits of PowerPoint but will show you how to use
Python to remove some of the drudgery by automating the creation of PowerPoint
slides.
Fortunately for us, there is an excellent Python library for creating and updating
PowerPoint files: python-pptx. The API is very well documented so it is pretty easy
to use. The only tricky part is understanding the PowerPoint document structure
including the various master layouts and elements. Once you understand the basics,
it is relatively simple to automate the creation of your own PowerPoint slides.
This article will walk through an example of reading in and analyzing some Excel
data with pandas, creating tables and building a graph that can be embedded in a
PowerPoint file.
Before diving into some code samples, there are two key components you need to
understand: Slide Layouts and Placeholders. In the images below you can see an
example of two different layouts as well as the template’s placeholders where you
can populate your content.
In the image below, you can see that we are using Layout 0 and there is one
placeholder on the slide at index 1.
PowerPoint Layout 0
In this image, we use Layout 1 for a completely different look.
PowerPoint Layout 1
In order to make your life easier with your own templates, I created a simple
standalone script that takes a template and marks it up with the various elements.
I won’t explain all the code line by line, but you can see analyze_ppt.py on GitHub.
Here is the function that does the bulk of the work:
Let’s get things started with the inputs and basic shell of the program:
import pandas as pd

# Functions go here

if __name__ == "__main__":
    args = parse_args()
    df = pd.read_excel(args.report.name)
    report_data = create_pivot(df)
    create_chart(df, "report-image.png")
    create_ppt(args.infile.name, args.outfile.name, report_data, "report-image.png")
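The shell calls parse_args() and then reads the .name attribute of args.infile, args.report and args.outfile. One way to get objects with a .name attribute is argparse.FileType, so a sketch of parse_args could look like the following. The order of the arguments, the help text and the optional argv parameter are assumptions, not something the shell dictates.

```python
import argparse


def parse_args(argv=None):
    """Parse the template, report and output file arguments."""
    parser = argparse.ArgumentParser(
        description="Create a PowerPoint report from Excel data"
    )
    # FileType opens each file up front, so a bad path fails early, and the
    # resulting file objects carry the .name attribute used by the shell
    parser.add_argument("infile", type=argparse.FileType("r"),
                        help="PowerPoint file used as the template")
    parser.add_argument("report", type=argparse.FileType("r"),
                        help="Excel file containing the raw report data")
    parser.add_argument("outfile", type=argparse.FileType("w"),
                        help="Output PowerPoint report file")
    # argv=None means "use sys.argv"; passing a list makes the function testable
    return parser.parse_args(argv)
```

Only the file names are used afterwards, so plain string arguments would work just as well if the shell were changed to drop the .name lookups.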
After we create our command line args, we read the source Excel file into a pandas
DataFrame. Next, we use that DataFrame as an input to create the pivot table
summary of the data:
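A minimal sketch of create_pivot using pandas.pivot_table is shown here. The column names (Manager, Rep, Product, Price, Quantity) are assumptions about the Excel file; adjust them to match your own data.

```python
import pandas as pd


def create_pivot(df, index_list=("Manager", "Rep", "Product"),
                 value_list=("Price", "Quantity")):
    """Summarize the raw rows as sums and means for each index combination."""
    # The column names in the defaults are assumed, not taken from a real file
    return pd.pivot_table(
        df,
        index=list(index_list),
        values=list(value_list),
        aggfunc=["sum", "mean"],
        fill_value=0,
    )
```

Putting Manager first in the index means the result can later be split into one block of rows per manager.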
The next piece of the analysis is creating a simple bar chart of sales performance
by account:
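One possible create_chart, sketched with pandas plotting on top of matplotlib. It assumes the account name lives in a Name column and that sales are Quantity times Price; both are guesses about the data, and the figure size and dpi are arbitrary choices.

```python
import matplotlib

matplotlib.use("Agg")  # render to a file without needing a display
import matplotlib.pyplot as plt
import pandas as pd


def create_chart(df, filename):
    """Save a horizontal bar chart of total sales per account as an image."""
    # Assumed columns: Name (account), Quantity and Price
    totals = (df["Quantity"] * df["Price"]).groupby(df["Name"]).sum().sort_values()
    ax = totals.plot(kind="barh")
    ax.set_xlabel("Total sales")
    fig = ax.get_figure()
    fig.set_size_inches(6, 4.5)
    fig.savefig(filename, bbox_inches="tight", dpi=300)
    plt.close(fig)
    return totals
```

Saving the chart to an image file keeps the plotting step independent of PowerPoint; the presentation code only needs the file name.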
PowerPoint Graph
We have a chart and a pivot table completed. Now we are going to embed that
information into a new PowerPoint file based on a given PowerPoint template file.
Before I go any further, there are a couple of things to note. You need to know
which layout you would like to use as well as where you want to populate your
content. Looking at the output of analyze_ppt.py, we know that the title slide is
layout 0 and that it has a title attribute and a subtitle at placeholder 1.
Here is the start of the function that we use to create our output PowerPoint:
def create_ppt(input, output, report_data, chart):
    """Take the input PowerPoint file and use it as the template for the output
    file.
    """
    prs = Presentation(input)
    # Use the output from analyze_ppt to understand which layouts and placeholders
    # to use
    # Create a title slide first
    title_slide_layout = prs.slide_layouts[0]
    slide = prs.slides.add_slide(title_slide_layout)
    title = slide.shapes.title
    subtitle = slide.placeholders[1]
    title.text = "Quarterly Report"
    subtitle.text = "Generated on {:%m-%d-%Y}".format(date.today())
This code creates a new presentation based on our input file, adds a single slide
and populates the title and subtitle on the slide. It looks like this:
From our previous analysis, we know that the graph slide we want to use is layout
index 8, so we create a new slide, add a title then add a picture into placeholder
1. The final step adds a subtitle at placeholder 2.
PowerPoint Chart
For the final portion of the presentation, we will create a table for each manager
with their sales performance.
PowerPoint Table
Creating tables in PowerPoint is a good news / bad news story. The good news is
that there is an API to create one. The bad news is that you can’t easily convert a
pandas DataFrame to a table using the built-in API. However, we are very fortunate
that someone has already done all the hard work for us and created
PandasToPowerPoint.
If you want to run this on your own, the full code would look something like this:
Conclusion
One of the things I really enjoy about using Python to solve real-world business
problems is that I am frequently pleasantly surprised by the rich ecosystem of
well-thought-out Python tools already available to help with my problems. In this
specific case, PowerPoint is rarely a joy to use but it is a necessity in many
environments.
After reading this article, you should know that there is some hope for you the
next time you are asked to create a bunch of reports in PowerPoint. Keep this
article in mind and see if you can find a way to automate away some of the tedium!