How to count the number of pages in a PDF file in Python
Last Updated :
18 Mar, 2024
In this article, we will see how to count the total number of pages in a PDF file in Python.
There are no prerequisites for this article. We will use the PyPDF2 library, a free and open-source pure-Python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files. It can also add custom data, viewing options, and passwords to PDF files, and it can retrieve text and metadata from PDFs. Refer to "Working with PDF files in Python" to explore PyPDF2 further.
Installing required library
Run the following command in the command prompt or terminal to install the PyPDF2 library.
pip install PyPDF2
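One of the methods covered later in this article has been deprecated since PyPDF2 1.28.0, so it can help to confirm which release pip actually installed. The sketch below uses only the standard library (importlib.metadata, available from Python 3.8). The helper name installed_version is made up for this illustration and is not part of PyPDF2.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Prints the installed version string, or None if PyPDF2 is not installed.
print(installed_version("PyPDF2"))
```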
Steps to count the number of pages in a PDF file
Step 1: Import PyPDF2 library into the Python program
import PyPDF2
Step 2: Open the PDF file in read binary format using file handling
file = open('your pdf file path', 'rb')
Step 3: Read the PDF by passing the open file to the PdfReader class of the PyPDF2 library
pdfReader = PyPDF2.PdfReader(file)
Note: These three steps are the same for every method shown in the examples below.
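The three steps above can be wrapped in one small helper. This is a minimal sketch rather than anything PyPDF2 provides: the function name count_pages is hypothetical, and it assumes a PyPDF2 release that offers the PdfReader class. It opens the file in a with block so the handle is closed automatically once the count has been read.

```python
def count_pages(pdf_path):
    """Return the number of pages in the PDF stored at pdf_path."""
    # Step 1: import the library. It is imported inside the function so this
    # snippet can be loaded even where PyPDF2 has not been installed yet.
    import PyPDF2

    # Step 2: open the file in read-binary mode; "with" closes it for us.
    with open(pdf_path, "rb") as file:
        # Step 3: hand the open file to PdfReader.
        pdf_reader = PyPDF2.PdfReader(file)
        # Count the pages while the file is still open.
        return len(pdf_reader.pages)
```

Calling count_pages('your pdf file path') would then return the page count as an integer.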
Methods to count PDF pages
We are going to look at three ways to count the number of pages in a PDF file:
- By using len(pdfReader.pages).
- By using the getNumPages() method (deprecated).
- By using the pages property and the len() function.
Note that the first and third entries rely on the same expression, len(pdfReader.pages); Method 3 explains how that expression works.
Method 1: Using len(pdfReader.pages)
pages is a property of the PdfReader class. Passing it to Python's built-in len() function returns the total number of pages in the PDF file.
totalPages1 = len(pdfReader.pages)
For Example:
Python3
# import the PyPDF2 library
import PyPDF2
# open the file for reading (r) in binary (b) mode
file = open('/home/hardik/GFG_Temp/dbmsFile.pdf', 'rb')
# create a PdfReader for the open file
pdfReader = PyPDF2.PdfReader(file)
# count the number of pages
totalPages = len(pdfReader.pages)
# close the file now that the count has been read
file.close()
# print the number of pages
print(f"Total Pages: {totalPages}")
Output:
Total Pages: 10
In the above example, we imported the PyPDF2 module and opened the file in read binary mode. We then passed the open file to PdfReader() to read the PDF and called len() on the reader's pages property to count the pages. The count is stored in the variable "totalPages" for further use and, finally, printed.
Method 2: Using getNumPages() method
getNumPages() is a method of the PdfReader class that takes no arguments and returns an integer giving the total number of pages. This method has been deprecated since version 1.28.0, and PyPDF2 3.0.0 and later raise an error when it is called, so it is usable only with older releases. Its replacement, len(pdfReader.pages), is explained in the next method.
totalPages2 = pdfReader.getNumPages()
Python3
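# import the PyPDF2 library
import PyPDF2
# open the file for reading (r) in binary (b) mode
file = open('/home/hardik/GFG_Temp/dbmsFile.pdf', 'rb')
# create a PdfReader for the open file
pdfReader = PyPDF2.PdfReader(file)
# count the number of pages
# (deprecated call; expected to work only on PyPDF2 releases before 3.0.0)
totalPages = pdfReader.getNumPages()
# close the file now that the count has been read
file.close()
# print the number of pages
print(f"Total Pages: {totalPages}")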
Output:
Total Pages: 10
In the above example, we imported the PyPDF2 module and opened the file in read binary mode. We then passed the open file to PdfReader() to read the PDF and called the reader's getNumPages() method to count the pages. The count is stored in the variable "totalPages" for further use and, finally, printed.
Method 3: Using pages property and len() function
pages is a read-only property that emulates a list of Page objects. Passing it to len(), Python's built-in function for the length of a sequence, gives the total number of pages in the PDF.
totalPages3 = len(pdfReader.pages)
Python3
# import the PyPDF2 library
import PyPDF2
# open the file for reading (r) in binary (b) mode
file = open('/home/hardik/GFG_Temp/dbmsFile.pdf', 'rb')
# create a PdfReader for the open file
pdfReader = PyPDF2.PdfReader(file)
# count the number of pages
totalPages = len(pdfReader.pages)
# close the file now that the count has been read
file.close()
# print the number of pages
print(f"Total Pages: {totalPages}")
Output:
Total Pages: 10
In the above example, we imported the PyPDF2 module and opened the file in read binary mode. We then passed the open file to PdfReader() to read the PDF. The reader's pages property gives a list-like view of all the pages in the file, and len() counts the entries in that view. The count is stored in the variable "totalPages" for further use and, finally, printed.
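To make "emulates a list" more concrete: len() works on any object that defines __len__, and indexing works on any object that defines __getitem__. The class below is a simplified, hypothetical stand-in written for this illustration. It is not PyPDF2's actual implementation, and the names LazyPages, count_func, and get_func are assumptions.

```python
class LazyPages:
    """List-like view that looks pages up on demand instead of storing them."""

    def __init__(self, count_func, get_func):
        self._count_func = count_func  # returns the total number of pages
        self._get_func = get_func      # returns the page at a given index

    def __len__(self):
        # Called by the built-in len().
        return self._count_func()

    def __getitem__(self, index):
        # Called by pages[index]; only non-negative integer indexes
        # inside the valid range are accepted in this sketch.
        if not 0 <= index < len(self):
            raise IndexError("page index out of range")
        return self._get_func(index)


# Pretend document with 10 pages, each represented by a short label.
pages = LazyPages(lambda: 10, lambda i: f"page {i + 1}")
print(len(pages))  # 10
print(pages[0])    # page 1
```

Because the count comes from a function call, an object like this can report its length without building every page up front.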