Get the File Extension from a URL in Python
Last Updated: 23 Jul, 2025
Handling URLs in Python often involves extracting information, such as the file extension, from the URL string. The task needs some care, because a URL can carry a query string or fragment that is not part of the file name. In this article, we will explore three approaches to safely get the file extension from a URL in Python.
Safely Get The File Extension From A URL in Python
Below are some of the ways by which we can safely get the file extension from a URL in Python:
Safely Get The File Extension Using os.path.splitext() Method
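The os.path.splitext method splits a path into a root and an extension. Applied to a URL, it treats the whole URL as a plain string: it does not check whether the URL points to an actual file, and anything after the last dot in the final segment, including a query string or fragment, ends up in the returned extension. The result includes the leading dot.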
Python3
import os

def get_file_extension_os(url):
    _, file_extension = os.path.splitext(url)
    return file_extension

# Example usage:
url = "https://example.com/path/to/file/document.pdf"
extension = get_file_extension_os(url)
print("File extension:", extension)
Output:
File extension: .pdf
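Because os.path.splitext knows nothing about URL structure, it can help to see how it behaves on inputs other than a clean file path. The sketch below is only an illustration: the helper name raw_splitext_extension and the sample URLs are hypothetical and not part of the original example.

```python
import os

def raw_splitext_extension(url):
    # Apply os.path.splitext directly to the URL string, with no URL parsing.
    return os.path.splitext(url)[1]

# Hypothetical URLs, chosen only to illustrate the edge cases.
print(raw_splitext_extension("https://example.com/path/to/file/document.pdf"))
# .pdf
print(raw_splitext_extension("https://example.com/path/to/file/document.pdf?version=2"))
# .pdf?version=2
print(raw_splitext_extension("https://example.com"))
# .com
```

The second and third results show why parsing the URL first is usually the safer choice: the query string leaks into the extension, and a bare domain is mistaken for a file name.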
Safely Get The File Extension by Handling Query Parameters
To ensure robustness, it's crucial to handle URLs with query parameters properly. This approach parses the URL with urlparse and extracts the extension from the path component only, so the query string and fragment cannot interfere.
Python3
from urllib.parse import urlparse
import os

def get_file_extension_query_params(url):
    # urlparse() separates the path from the query string and fragment,
    # so the path never contains '?'
    path = urlparse(url).path
    _, file_extension = os.path.splitext(path)
    return file_extension

# Example usage:
url = "https://example.com/path/to/file/document.pdf?version=2"
extension = get_file_extension_query_params(url)
print("File extension:", extension)
Output:
File extension: .pdf
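As a variation on the same idea, the path component can also be handed to pathlib instead of os.path. This is a minimal sketch, not a replacement for the method above; the function name get_file_suffixes_pathlib and the sample URLs are assumptions made for illustration. PurePosixPath is used because URL paths always use forward slashes, regardless of the operating system.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

def get_file_suffixes_pathlib(url):
    """Return (last_suffix, all_suffixes) for the path component of a URL."""
    # Only the path is inspected; query string and fragment are discarded.
    path = PurePosixPath(urlparse(url).path)
    return path.suffix, path.suffixes

# Hypothetical URLs used only for demonstration.
print(get_file_suffixes_pathlib("https://example.com/downloads/archive.tar.gz?token=abc#top"))
# ('.gz', ['.tar', '.gz'])
print(get_file_suffixes_pathlib("https://example.com/path/to/file/"))
# ('', [])
```

The suffixes list is useful for compound extensions such as .tar.gz, where the last suffix alone does not describe the file type.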
Safely Get The File Extension Using Regular Expressions
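Regular expressions give more control over what counts as an extension. The pattern used here matches a dot followed by one or more letters or digits at the very end of the string. The function returns the extension without the leading dot, or None when nothing matches.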
Python3
import re

def get_file_extension_regex(url):
    match = re.search(r'\.([a-zA-Z0-9]+)$', url)
    if match:
        return match.group(1)
    else:
        return None

# Example usage:
url = "https://example.com/path/to/file/document.pdf"
extension = get_file_extension_regex(url)
print("File extension:", extension)
Output:
File extension: pdf
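Since the pattern is anchored to the end of the string, a trailing query string prevents a match, and a bare domain such as https://example.com yields "com". One possible way around both cases, sketched here under the assumption that only the path component should be searched, is to combine the regular expression with urlparse. The name get_file_extension_regex_path and the sample URLs are hypothetical.

```python
import re
from urllib.parse import urlparse

# A dot followed by letters or digits at the very end of the path.
_EXTENSION_RE = re.compile(r'\.([a-zA-Z0-9]+)$')

def get_file_extension_regex_path(url):
    # Search only the path, so the host, query string and fragment are ignored.
    path = urlparse(url).path
    match = _EXTENSION_RE.search(path)
    return match.group(1) if match else None

# Hypothetical URLs used only for demonstration.
print(get_file_extension_regex_path("https://example.com/path/to/file/document.pdf?download=1"))
# pdf
print(get_file_extension_regex_path("https://example.com"))
# None
```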