
Extract File Extension Using Python
In some scenarios, we need to extract the extension of a file to perform specific operations based on its type, such as validating image formats or filtering document files. Python provides several ways to do this: the os and pathlib modules, and plain string methods. In this article, we'll explore how to get a file's extension with each of these approaches.
Using os.path.splitext()
The os.path module provides functions for working with file paths, such as joining, splitting, and inspecting path names, so that we do not have to parse path strings by hand.
The os.path.splitext() function splits a path into a (root, extension) tuple. The extension runs from the last dot to the end of the path and includes the dot. It is an empty string when the path has no extension. Indexing the result with [1] therefore gives the extension.
Example
In this example, we are using the os.path.splitext() function to get the extension of the given file -
import os

filename = "report.pdf"
extension = os.path.splitext(filename)[1]
print(extension)  # Output: .pdf
Here is the output of the above program -
.pdf
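To see how os.path.splitext() treats less obvious names, the sketch below runs it on a few sample file names. The names are made up for illustration. It shows that only the last suffix is split off, and that a leading dot (as in hidden files) is not treated as the start of an extension.

```python
import os

# Hypothetical sample names chosen to illustrate edge cases.
samples = ["report.pdf", "archive.tar.gz", ".bashrc", "README"]

# Map each name to the (root, extension) tuple returned by splitext().
results = {name: os.path.splitext(name) for name in samples}

for name, (root, ext) in results.items():
    print(f"{name!r} -> root={root!r}, ext={ext!r}")
```

Names without an extension yield an empty string, so the result can be checked with a simple truthiness test before it is used.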
Using pathlib.Path.suffix
The pathlib module provides a more object-oriented way to work with filesystem paths.
The Path.suffix attribute returns the file's extension, including the leading dot, or an empty string if the file has no extension. A Path object also exposes related attributes: parent gives the containing directory, name gives the file name with its extension, and stem gives the file name without its last extension.
Example
Here is an example using pathlib.Path.suffix to extract the extension of a file -
from pathlib import Path

file = Path("image.jpeg")
extension = file.suffix
print(extension)  # Output: .jpeg
Following is the output of the above program -
.jpeg
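The related attributes can be inspected side by side, as in the following sketch. The path /home/user/archive.tar.gz is an assumed example, and PurePosixPath is used instead of Path only so that the printed values are the same on every operating system. The suffixes attribute, which the examples above do not use, returns every suffix as a list and may help with names such as .tar.gz.

```python
from pathlib import PurePosixPath

# Assumed example path; PurePosixPath keeps the output OS-independent.
path = PurePosixPath("/home/user/archive.tar.gz")

parent = str(path.parent)   # containing directory
name = path.name            # file name with extension
stem = path.stem            # file name without the last suffix
suffix = path.suffix        # last suffix, including the dot
suffixes = path.suffixes    # all suffixes, as a list

print(parent)
print(name)
print(stem)
print(suffix)
print(suffixes)
```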
Using the String split() Method
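We can also use the split() method of Python strings to extract the extension manually. This approach is simple but fragile. With multiple dots it returns only the last part. If the filename contains no dot, split(".")[-1] returns the whole filename instead of an empty string. A hidden file such as .bashrc is also reported as having the extension bashrc.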
Example
Below is an example where we extract the extension using the split() method -
filename = "archive.tar.gz"
extension = filename.split(".")[-1]
print(extension)  # Output: gz
Here is the output of the above program -
gz
This method only gives the part after the last dot and does not include the dot in the extension.
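If a string-only approach is still preferred, one possible way to guard against names without a dot is to use str.rpartition() instead of split(). The helper name extension_via_split below is hypothetical, and the sketch assumes it receives a bare file name rather than a full path, since a dot in a directory name would confuse it.

```python
def extension_via_split(filename):
    """Return the text after the last dot, or "" if there is no usable dot."""
    stem, dot, ext = filename.rpartition(".")
    # No dot at all, or the only dot is the leading one of a hidden file.
    if not dot or not stem:
        return ""
    return ext


print(extension_via_split("archive.tar.gz"))
print(repr(extension_via_split("README")))
print(repr(extension_via_split(".bashrc")))
```

For anything beyond bare file names, os.path.splitext() or pathlib is likely the safer choice.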
Extracting just the extension suffix (without dot)
If you want to remove the dot and extract just the extension suffix, such as py, txt, or docx, add "[1:]" after result[1] when working with the splitext() function, where result is the tuple it returns, as -
print('Extension:', result[1][1:])
Similarly, when working with pathlib, add "[1:]" after path.suffix, where path is a Path object, as -
print('Extension:', path.suffix[1:])
Example
The following program demonstrates how to print just the suffixes using both the methods discussed above -
# importing the modules
import os
import pathlib

path = 'D:/test.txt'
result = os.path.splitext(path)
print('Extension:', result[1][1:])
print('Extension:', pathlib.Path('D:/test.txt').suffix[1:])
Output
Extension: txt
Extension: txt
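The dot-free suffix is convenient for the kind of type check mentioned at the start of this article, such as validating image formats. The following is a minimal sketch of that idea. The IMAGE_EXTENSIONS set and the helper names bare_extension and is_image are assumptions made for illustration, and the extension is lowercased so that photo.JPG and photo.jpg are treated alike.

```python
from pathlib import Path

# Assumed allow-list, for illustration only.
IMAGE_EXTENSIONS = {"jpg", "jpeg", "png", "gif"}


def bare_extension(path):
    """Return the lowercased extension without the leading dot."""
    return Path(path).suffix[1:].lower()


def is_image(path):
    """Check the extension against the allow-list above."""
    return bare_extension(path) in IMAGE_EXTENSIONS


files = ["photo.JPG", "notes.txt", "README", "logo.png"]
images = [f for f in files if is_image(f)]
print(images)
```

Note that this only inspects the name. It does not confirm that the file's contents actually match the extension.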