Use Python To Fill PDF Files! - AKDux
OCTOBER 31
PDFs are hard to work with. Over the years I've tried several approaches to filling them out in an
automated way. It's amazing my job has so many manual tasks that require filling out PDFs. It's
fairly routine for me to be manually filling out PDF files to process transactions. Needless to say I've
https://akdux.com/python/2020/10/31/python-fill-pdf-files/ 1/16
29/3/24, 0:16 Use Python To Fill PDF Files! - AKDux
either created or borrowed several solutions. First let me say I'm no VBA expert but I have
experimented with solutions here as well.
Then came my experience with Python, PyPDF2 and reportlab. I won't go into too much detail about
exactly how I did this. In short, you create your PDF template, create a blank PDF containing just
your data fields, and paste the new PDF as a watermark on top of your PDF template. Again, this is
painstaking because you're using grid coordinates to position where text should be placed on the
page. This worked and it was fast, but it wasn't great if the PDF template changed or if you wanted
to manipulate the PDF file afterward.
It was great when I found you could fill PDF form fields with Python using PyPDF2 or pdfrw. Both
of these libraries appear able to do similar tasks, but I chose pdfrw because it appeared to be
better maintained. At the time of writing, PyPDF2 was no longer maintained. There is a PyPDF3 and
a PyPDF4; however, I had already settled on pdfrw. The only issue I ran into is that you could fill
in the fields but those values wouldn't show until you refreshed the field in Acrobat. I found two
ways around this. One was to click into every field and hit Enter, which isn't doable if you have
several PDFs. The other was to open the PDFs in a web browser, which causes a refresh of the fields.
Because of these challenges I gave up for a while... However, while digging into Python and PDFs
again I found the solution that refreshes the fields!
So now I have a working solution I can pass around the office easily. A basic macro references a
Python exe file located on a shared network drive, meaning there is no Python install! And we can
populate PDF forms with a simple Excel macro while still getting all the flexibility and
functionality of Python. The rest of this post goes through an example of how to fill out a PDF
using Python.
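For the macro-calls-an-exe setup described above, the Python side needs some way to receive the field values from Excel. The post doesn't show that part, so the following is only a sketch under assumptions: the function name parse_cli, the flag names (--template, --output, --data) and the idea of passing the values as a JSON string are hypothetical choices for illustration, not the interface actually used.

```python
import argparse
import json


def parse_cli(argv=None):
    """Parse the arguments a macro could pass to a packaged script.

    All flag names here are hypothetical, chosen for illustration only.
    """
    parser = argparse.ArgumentParser(description="Fill a PDF form from JSON data.")
    parser.add_argument("--template", required=True, help="path to the fillable PDF")
    parser.add_argument("--output", required=True, help="path for the filled PDF")
    parser.add_argument("--data", required=True,
                        help='field values as a JSON object, e.g. {"name": "Jane"}')
    args = parser.parse_args(argv)
    # Decode the JSON payload into a dictionary of field names to values.
    data = json.loads(args.data)
    if not isinstance(data, dict):
        parser.error("--data must be a JSON object of field names to values")
    return args.template, args.output, data


# Simulate the argument list a macro might build.
template, output, data = parse_cli(
    ["--template", "template.pdf", "--output", "output.pdf",
     "--data", '{"name": "Jane", "cb_1": true}']
)
print(template, output, data)
```

A JSON true or false decodes to a Python bool, so checkbox values would keep their type on the way in.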
PDF Setup
I’m using Adobe Acrobat DC. I’m going to create a sample PDF file for this example. If you have an
existing PDF you want to use, just open it and click Tools > Prepare Form. This action will create a
fillable PDF form.
Now let’s create a simple PDF for this example. We have the following fields.
name
phone
date
account_number
Now that we have a sample PDF we will get started with a little Python.
pdfrw Setup
The first thing to do is install pdfrw by running pip install pdfrw (prefix it with ! if you are in a Jupyter notebook).
Python
import pdfrw
pdfrw.__version__
'0.4'
Python
# Let's first set some variables to reference our PDF template and output.pdf
pdf_template = "template.pdf"
pdf_output = "output.pdf"
Python
template_pdf = pdfrw.PdfReader(pdf_template)  # create a pdfrw object from our template.pdf
# template_pdf  # uncomment to see all the data captured from this PDF.
You should print out template_pdf to see everything available in the PDF. There is a lot of output,
so for ease of reading I've commented that line out.
For now let’s just try to get the form fields of the PDF we created. To do this we will define
constants for the PDF keys we care about. I grabbed this code from a random snippet online, but you
can find several similar setups on Stack Overflow.
Python
ANNOT_KEY = '/Annots'
ANNOT_FIELD_KEY = '/T'
ANNOT_VAL_KEY = '/V'
ANNOT_RECT_KEY = '/Rect'
SUBTYPE_KEY = '/Subtype'
WIDGET_SUBTYPE_KEY = '/Widget'
Next, we can loop through the page(s). Here we only have one, but it’s a good idea to prepare
for future functionality. We grab each page's annotations and pull out just the form field names.
Python
for page in template_pdf.pages:
    annotations = page[ANNOT_KEY] or []  # a page with no fields has no /Annots entry
    for annotation in annotations:
        if annotation[SUBTYPE_KEY] == WIDGET_SUBTYPE_KEY:
            if annotation[ANNOT_FIELD_KEY]:
                key = annotation[ANNOT_FIELD_KEY][1:-1]  # strip the surrounding parentheses
                print(key)
name
phone
date
account_number
cb_1
cb_2
There you can see we were able to grab our form field names!
Filling a PDF
To fill a PDF we can create a dictionary of the values we want to populate the PDF with. The
dictionary keys will be the form field names and the values will be what we want to fill into the PDF.
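The dictionary for this step isn't shown, so here is a sketch of what it could look like. The values are borrowed from the sample data used in the full script later in this post; treating cb_1 and cb_2 as checkboxes with boolean values matches how the fill function handles them.

```python
from datetime import date

# Keys must match the form field names in the PDF exactly.
data_dict = {
    'name': 'Andrew Krcatovich',
    'phone': '(123) 123-1234',
    'date': date.today(),       # becomes YYYY-MM-DD once formatted as text
    'account_number': '123123123',
    'cb_1': True,               # booleans are treated as checkboxes
    'cb_2': False,
}
```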
Let’s set up a function to handle grabbing the keys, populating the values, and saving out
the output.pdf file.
Python
def fill_pdf(input_pdf_path, output_pdf_path, data_dict):
    template_pdf = pdfrw.PdfReader(input_pdf_path)
    for page in template_pdf.pages:
        annotations = page[ANNOT_KEY] or []  # a page with no fields has no /Annots entry
        for annotation in annotations:
            if annotation[SUBTYPE_KEY] == WIDGET_SUBTYPE_KEY:
                if annotation[ANNOT_FIELD_KEY]:
                    key = annotation[ANNOT_FIELD_KEY][1:-1]  # strip the surrounding parentheses
                    if key in data_dict:
                        if isinstance(data_dict[key], bool):
                            # Checkbox: switch it on when the value is True.
                            if data_dict[key]:
                                annotation.update(pdfrw.PdfDict(
                                    AS=pdfrw.PdfName('Yes')))
                        else:
                            # Text field: write the value and clear the old appearance.
                            annotation.update(
                                pdfrw.PdfDict(V='{}'.format(data_dict[key]))
                            )
                            annotation.update(pdfrw.PdfDict(AP=''))
    pdfrw.PdfWriter().write(output_pdf_path, template_pdf)
Python
fill_pdf(pdf_template, pdf_output, data_dict)
Okay! That just filled out a PDF. Opening it in Preview on my Mac shows the filled-in fields.
However, opening the very same PDF in Acrobat doesn’t show the values of the form fields. If you
click into the field you can see it did fill but for some reason the field isn’t refreshed to show the
value. Printing the PDF here won’t help either as it will print blank. After a long while searching for
an answer I found the following solution. Worked like a charm and the form fields are now showing
in Acrobat as well.
Honestly, I don’t know why this isn’t the default setting. It seems like everyone online runs into
the same issue, and this solution is hidden away well enough that several laborious workarounds are
being used instead. Either way, just add the line marked NEW to the fill_pdf function like so.
Python
def fill_pdf(input_pdf_path, output_pdf_path, data_dict):
    template_pdf = pdfrw.PdfReader(input_pdf_path)
    for page in template_pdf.pages:
        annotations = page[ANNOT_KEY] or []  # a page with no fields has no /Annots entry
        for annotation in annotations:
            if annotation[SUBTYPE_KEY] == WIDGET_SUBTYPE_KEY:
                if annotation[ANNOT_FIELD_KEY]:
                    key = annotation[ANNOT_FIELD_KEY][1:-1]  # strip the surrounding parentheses
                    if key in data_dict:
                        if isinstance(data_dict[key], bool):
                            # Checkbox: switch it on when the value is True.
                            if data_dict[key]:
                                annotation.update(pdfrw.PdfDict(
                                    AS=pdfrw.PdfName('Yes')))
                        else:
                            # Text field: write the value and clear the old appearance.
                            annotation.update(
                                pdfrw.PdfDict(V='{}'.format(data_dict[key]))
                            )
                            annotation.update(pdfrw.PdfDict(AP=''))
    template_pdf.Root.AcroForm.update(pdfrw.PdfDict(NeedAppearances=pdfrw.PdfObject('true')))  # NEW
    pdfrw.PdfWriter().write(output_pdf_path, template_pdf)
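One assumption buried in the function above is that a checked box is called Yes. In a PDF, the name of the checked state is chosen by whatever tool built the form, and it appears as a key (any key other than Off) under the widget's /AP /N dictionary. The helper below is a hedged sketch, with the hypothetical name checkbox_on_state, that looks the name up instead of hard-coding it. It sticks to single-argument get calls, which should behave the same on pdfrw dictionaries as on the plain dicts used in the demo.

```python
def checkbox_on_state(annotation, default='/Yes'):
    """Return the name of a checkbox widget's checked state, e.g. '/Yes'.

    Falls back to `default` when the widget carries no appearance dictionary.
    """
    appearance = annotation.get('/AP')
    normal = appearance.get('/N') if appearance else None
    if normal:
        for state in normal.keys():
            if state != '/Off':
                return state
    return default


# Demo with plain dicts standing in for a parsed widget annotation.
widget = {'/AP': {'/N': {'/Off': None, '/On': None}}}
print(checkbox_on_state(widget))   # /On
print(checkbox_on_state({}))       # /Yes
```

Inside fill_pdf the result could stand in for the hard-coded name, for instance AS=pdfrw.PdfName(state[1:]), since PdfName adds the leading slash itself.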
I added one additional function, fill_simple_pdf_file, as I found it very useful to manipulate a data
dictionary, especially if it came from an Excel file, before populating the data. This way you can
create many fillable forms from the same data source, apply formatting to the fields, and set default
values if nothing was supplied.
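As a rough illustration of that kind of preprocessing, the sketch below normalizes one row of spreadsheet data before it would be handed to fill_pdf. Everything in it is an assumption made for the example: the helper names format_phone and prepare_row, the ten-digit phone format, and the use of a CSV export read with the standard library in place of a real Excel file.

```python
import csv
import io
import re
from datetime import date


def format_phone(raw):
    """Format a 10-digit number as (123) 123-1234; leave anything else alone."""
    digits = re.sub(r'\D', '', raw or '')
    if len(digits) != 10:
        return raw or ''
    return '({}) {}-{}'.format(digits[:3], digits[3:6], digits[6:])


def prepare_row(row):
    """Build a fill-ready dictionary with defaults for missing columns."""
    return {
        'name': (row.get('name') or '').strip(),
        'phone': format_phone(row.get('phone')),
        'date': row.get('date') or date.today(),
        'account_number': (row.get('account_number') or '').strip(),
        # Spreadsheet cells arrive as text, so map them to real booleans.
        'cb_1': (row.get('cb_1') or '').strip().lower() in ('true', 'yes', '1'),
        'cb_2': (row.get('cb_2') or '').strip().lower() in ('true', 'yes', '1'),
    }


# Stand-in for a sheet exported to CSV.
sheet = io.StringIO("name,phone,account_number,cb_1\n"
                    " Jane Doe ,1231231234,123123123,yes\n")
for row in csv.DictReader(sheet):
    print(prepare_row(row))
```

Keeping the cleanup separate from the PDF code means the same prepared dictionary can feed several different templates.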
Python
import pdfrw
from datetime import date

ANNOT_KEY = '/Annots'
ANNOT_FIELD_KEY = '/T'
ANNOT_VAL_KEY = '/V'
ANNOT_RECT_KEY = '/Rect'
SUBTYPE_KEY = '/Subtype'
WIDGET_SUBTYPE_KEY = '/Widget'


def fill_pdf(input_pdf_path, output_pdf_path, data_dict):
    template_pdf = pdfrw.PdfReader(input_pdf_path)
    for page in template_pdf.pages:
        annotations = page[ANNOT_KEY] or []  # a page with no fields has no /Annots entry
        for annotation in annotations:
            if annotation[SUBTYPE_KEY] == WIDGET_SUBTYPE_KEY:
                if annotation[ANNOT_FIELD_KEY]:
                    key = annotation[ANNOT_FIELD_KEY][1:-1]  # strip the surrounding parentheses
                    if key in data_dict:
                        if isinstance(data_dict[key], bool):
                            # Checkbox: switch it on when the value is True.
                            if data_dict[key]:
                                annotation.update(pdfrw.PdfDict(
                                    AS=pdfrw.PdfName('Yes')))
                        else:
                            # Text field: write the value and clear the old appearance.
                            annotation.update(
                                pdfrw.PdfDict(V='{}'.format(data_dict[key]))
                            )
                            annotation.update(pdfrw.PdfDict(AP=''))
    template_pdf.Root.AcroForm.update(pdfrw.PdfDict(NeedAppearances=pdfrw.PdfObject('true')))
    pdfrw.PdfWriter().write(output_pdf_path, template_pdf)


# NEW
def fill_simple_pdf_file(data, template_input, template_output):
    some_date = date.today()
    data_dict = {
        'name': data.get('name', ''),
        'phone': data.get('phone', ''),
        'date': some_date,
        'account_number': data.get('account_number', ''),
        'cb_1': data.get('cb_1', False),
        'cb_2': data.get('cb_2', False),
    }
    return fill_pdf(template_input, template_output, data_dict)


if __name__ == '__main__':
    pdf_template = "template.pdf"
    pdf_output = "output.pdf"

    sample_data_dict = {
        'name': 'Andrew Krcatovich',
        'phone': '(123) 123-1234',
        # 'date': date.today(),  # Removed date so we can dynamically set in python.
        'account_number': '123123123',
        'cb_1': True,
        'cb_2': False,
    }
    fill_simple_pdf_file(sample_data_dict, pdf_template, pdf_output)
Python
from datetime import date
from pdfrw import PdfReader, PdfDict, PdfName, PdfObject, PdfWriter

ANNOT_KEY = '/Annots'
ANNOT_FIELD_KEY = '/T'
ANNOT_VAL_KEY = '/V'
ANNOT_RECT_KEY = '/Rect'
SUBTYPE_KEY = '/Subtype'
WIDGET_SUBTYPE_KEY = '/Widget'

data_dict = {
    'account_number': '12312312',
    'trade_date': date.today(),
}

template_pdf = PdfReader("test.pdf")
for page in template_pdf.pages:
    annotations = page[ANNOT_KEY] or []  # a page with no fields has no /Annots entry
    for annotation in annotations:
        if annotation[SUBTYPE_KEY] == WIDGET_SUBTYPE_KEY:
            # CHANGED: for example purposes
            if not annotation[ANNOT_FIELD_KEY]:
                if annotation['/Parent']:  # note the '/Parent' widget
                    key = annotation['/Parent'][ANNOT_FIELD_KEY][1:-1]  # so '/T' is inside the '/Pare
                    if key in data_dict.keys():
                        annotation['/Parent'].update(
                            PdfDict(V='{}'.format(data_dict[key]))
                        )
                        annotation['/Parent'].update(PdfDict(AP=''))
template_pdf.Root.AcroForm.update(PdfDict(NeedAppearances=PdfObject('true')))
PdfWriter().write("output.pdf", template_pdf)
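The two lookups (a name directly on the widget, or on its /Parent) can be folded into one helper so a single loop handles both kinds of template. This is a sketch, not part of the original script: the name resolve_field is hypothetical, and the demo uses plain dicts in place of pdfrw objects.

```python
def resolve_field(annotation):
    """Return (owner, name) for a widget, checking '/Parent' when '/T' is absent.

    `owner` is the dictionary that should receive the '/V' update.
    Returns (None, None) when no field name can be found.
    """
    owner = annotation
    raw_name = annotation.get('/T')
    if not raw_name:
        owner = annotation.get('/Parent')
        raw_name = owner.get('/T') if owner else None
    if not raw_name:
        return None, None
    # Field names are stored with surrounding parentheses: '(name)' -> 'name'.
    return owner, raw_name[1:-1]


parent = {'/T': '(account_number)'}
print(resolve_field({'/T': '(name)'})[1])        # name
print(resolve_field({'/Parent': parent})[1])     # account_number
print(resolve_field({}))                         # (None, None)
```

Returning the owner alongside the name lets the caller update the widget or its parent without repeating the lookup.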