Extract File Extension Using Python



In a few scenarios, we need to extract the extension of a file to perform specific operations based on its type, such as validating image formats or filtering document files. Python provides different ways to achieve this using the os and pathlib modules. In this article, we'll explore how to get a file's extension with different approaches. 

Using os.path.splitext()

The os.path module simplifies working with file paths. It provides functions for operating on path strings, such as joining, splitting, and normalizing them, and for querying the filesystem, for example checking whether a path exists.

The os.path.splitext() function splits a path into a pair (root, ext), where ext is the extension including the leading dot, or an empty string if there is none. This makes it easy to extract the extension from a file name.

Example

In this example, we are using the os.path.splitext() method to get the extension of the given file -

import os
filename = "report.pdf"
extension = os.path.splitext(filename)[1]
print(extension)  # Output: .pdf

Here is the output of the above program -

.pdf
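
A few edge cases are worth knowing. The sketch below, which uses made-up sample names, collects what os.path.splitext() returns for a multi-dot name, a hidden file, a name with no extension, and a name ending in a dot -

```python
import os

# Made-up sample names chosen to exercise edge cases
samples = ["report.pdf", "archive.tar.gz", ".bashrc", "README", "notes."]

# Map each name to the (root, ext) pair returned by splitext()
results = {name: os.path.splitext(name) for name in samples}

for name, (root, ext) in results.items():
    print(f"{name!r}: root={root!r}, extension={ext!r}")

# 'report.pdf': root='report', extension='.pdf'
# 'archive.tar.gz': root='archive.tar', extension='.gz'
# '.bashrc': root='.bashrc', extension=''
# 'README': root='README', extension=''
# 'notes.': root='notes', extension='.'
```

Only the part after the last dot is treated as the extension, a leading dot is not treated as the start of an extension, and a name without a dot yields an empty string.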

Using pathlib.Path.suffix

The pathlib module provides a more object-oriented way to work with filesystem paths.

The Path.suffix attribute returns the file's extension, including the dot. A Path object also exposes the parent and name attributes, which give the parent directory and the final file name of the path, in addition to root.
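
The following sketch illustrates these attributes on a hypothetical path. It uses PurePosixPath only so that the printed result looks the same on every operating system; Path objects expose the same parent, name, and suffix attributes, as well as stem (the file name without its suffix), which is not covered above.

```python
from pathlib import PurePosixPath

# Hypothetical path used only for illustration
path = PurePosixPath("/home/user/photos/image.jpeg")

print(path.parent)  # /home/user/photos  (directory containing the file)
print(path.name)    # image.jpeg         (final path component)
print(path.stem)    # image              (name without the suffix)
print(path.suffix)  # .jpeg              (extension, including the dot)
```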


Example

Here is an example using pathlib.Path.suffix to extract the extension of a file -

from pathlib import Path
file = Path("image.jpeg")
extension = file.suffix
print(extension)  # Output: .jpeg

Following is the output of the above program -

.jpeg
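
For names with more than one dot, suffix returns only the last part. If the full compound extension is needed, the suffixes attribute of Path returns every dot-separated suffix as a list. A small sketch, with archive.tar.gz as an assumed sample name -

```python
from pathlib import Path

# Assumed sample name with a compound extension
archive = Path("archive.tar.gz")

# Join all suffixes to rebuild the compound extension
full_extension = "".join(archive.suffixes)

print(archive.suffix)    # .gz
print(archive.suffixes)  # ['.tar', '.gz']
print(full_extension)    # .tar.gz
```

Note that suffixes treats every dot as a separator, so a name such as my.report.pdf would yield ['.report', '.pdf']. Whether joining the suffixes is appropriate depends on how the files are named.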

Using the String split() Method

We can also use the string split() method to extract the extension manually. This approach is simple, but it does not handle every edge case, such as filenames with multiple dots, filenames without any dot, or paths whose directory names contain dots.

Example

Below is an example where we extract the extension using the split() method -

filename = "archive.tar.gz"
extension = filename.split(".")[-1]
print(extension)  # Output: gz

Here is the output of the above program -

gz

This method only gives the part after the last dot and does not include the dot in the extension.
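
The sketch below shows two pitfalls of this approach and one possible guard. The helper name extension_via_split is hypothetical and exists only for this illustration -

```python
# Pitfall 1: with no dot, split() returns the whole name as the "extension"
no_dot = "README".split(".")[-1]

# Pitfall 2: a dot in a directory name is picked up when the file has none
dotted_dir = "my.folder/README".split(".")[-1]


def extension_via_split(filename):
    """Hypothetical helper: text after the last dot, or '' if there is no dot."""
    if "." not in filename:
        return ""
    return filename.rsplit(".", 1)[-1]


print(no_dot)                                 # README
print(dotted_dir)                             # folder/README
print(extension_via_split("archive.tar.gz"))  # gz
print(extension_via_split("README"))          # prints an empty line
```

The helper guards only against the first pitfall. For full paths, os.path.splitext() or pathlib is usually the safer choice because both look only at the final path component.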

Extracting just the extension suffix (without dot)

To get just the extension without the leading dot, such as py, txt, or docx, slice off the first character with "[1:]". With os.path.splitext(), apply the slice to the second element of the result, result[1] -

print('Extension:', result[1][1:])

Similarly, with pathlib, append "[1:]" to path.suffix -

print('Extension:', path.suffix[1:])

Example

The following program demonstrates how to print just the suffixes using both the methods discussed above -

# importing the modules
import os
import pathlib
path = 'D:/test.txt'
result = os.path.splitext(path)
print('Extension:', result[1][1:])
print('Extension:', pathlib.Path('D:/test.txt').suffix[1:])

Output

Extension: txt
Extension: txt
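
As a closing sketch, the dot-free extension can be used for the kind of image-format validation mentioned in the introduction. The function name is_allowed_image and the set of allowed extensions are assumptions made for this example, not part of any library -

```python
from pathlib import Path

# Assumed whitelist of image extensions (without the dot, lowercase)
ALLOWED_IMAGE_EXTENSIONS = {"jpg", "jpeg", "png", "gif"}


def is_allowed_image(filename):
    """Hypothetical check: True if the extension is in the whitelist."""
    # Drop the leading dot and lowercase so that "JPG" matches "jpg"
    extension = Path(filename).suffix[1:].lower()
    return extension in ALLOWED_IMAGE_EXTENSIONS


print(is_allowed_image("photo.JPG"))   # True
print(is_allowed_image("report.pdf"))  # False
print(is_allowed_image("README"))      # False
```

Lowercasing the extension makes the comparison case-insensitive. Keep in mind that this checks only the file name, not the actual contents of the file.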
Updated on: 2025-05-07T14:01:01+05:30
