Python Pathlib

Uploaded by

Kintakunte
Copyright
© © All Rights Reserved
Python's pathlib module

Trey Hunner
12 min. read • Python 3.9—3.13 • Nov. 18, 2024

Does your Python code need to work with file paths?

You should consider using pathlib.

I now use pathlib for nearly all file-related code in Python, especially when I need to
construct or deconstruct file paths or ask questions of file paths.

I'd like to make the case for Python's pathlib module... but first let's look at a cheat
sheet of common path operations.

A pathlib cheat sheet


Below is a cheat sheet table of common pathlib.Path operations.

The variables used in the table are defined here:

>>> from pathlib import Path
>>> path = Path("/home/trey/proj/readme.md")
>>> relative = Path("readme.md")
>>> base = Path("/home/trey/proj")
>>> new = Path("/home/trey/proj/sub")
>>> home = Path("/home/")
>>> target = path.with_suffix(".txt") # .md -> .txt
>>> pattern = "*.md"
>>> name = "sub/f.txt"
Path-related task          pathlib approach         Example
Read all file contents     path.read_text()         'Line 1\nLine 2\n'
Write file contents        path.write_text('new')   Writes 'new' to file
Get absolute path          relative.resolve()       Path('/home/trey/proj/readme.md')
Get the filename           path.name                'readme.md'
Get parent directory       path.parent              Path('/home/trey/proj')
Get file extension         path.suffix              '.md'
Get suffix-free name       path.stem                'readme'
Ancestor-relative path     path.relative_to(base)   Path('readme.md')
Verify path is a file      path.is_file()           True
Verify path is directory   path.is_dir()            False
Make new directory         new.mkdir()              Makes new directory
Get current directory      Path.cwd()               Path('/home/trey/proj')
Get home directory         Path.home()              Path('/home/trey')
Get ancestor paths         path.parents             [Path('/home/trey/proj'), ...]
List files/directories     home.iterdir()           [Path('/home/trey')]
Find files by pattern      base.glob(pattern)       [Path('/home/trey/proj/readme.md')]
Find files recursively     base.rglob(pattern)      [Path('/home/trey/proj/readme.md')]
Join path parts            base / name              Path('/home/trey/proj/sub/f.txt')
Get file size (bytes)      path.stat().st_size      14
Walk the file tree         base.walk()              Iterable of (path, subdirs, files)
Rename path                path.rename(target)      Path object for new path
Remove file                path.unlink()            Deletes the file

Note that iterdir, glob, rglob, and walk all return iterators. The examples above
show lists for convenience.

The open function accepts Path objects


What does Python's open function accept?

You might think open accepts a string representing a filename. And you'd be right.

The open function does accept a filename:

filename = "example.txt"

with open(filename) as file:
    contents = file.read()
But open will also accept pathlib.Path objects:

from pathlib import Path

path = Path("example.txt")

with open(path) as file:
    contents = file.read()
Although the specific example here could be replaced with a method call instead of
using open at all:

from pathlib import Path

path = Path("example.txt")
contents = path.read_text()
Python's pathlib.Path objects represent a file path. In my humble opinion, you
should use Path objects anywhere you work with file paths. Python's Path objects
make it easier to write cross-platform compatible code that works well with filenames in
various formats (both / and \ are handled appropriately).

Why use a pathlib.Path instead of a string?


Why use a Path object to represent a file path instead of using a string?

Well, consider these related questions:

• Why use a datetime.timedelta object instead of an integer?
• Why use a datetime.datetime object instead of a string?
• Why use True and False instead of 1 and 0?
• Why use a decimal.Decimal object instead of an integer?
• Why use a class instance instead of a tuple?

Specialized objects exist to make specialized operations easier.


Python's pathlib.Path objects make performing many common path-related
operations easy.

Which operations are easier? That's what the rest of this article is all about.

The basics: constructing paths with pathlib


You can turn a string into a pathlib.Path object by passing it to the Path class:

>>> filename = "/home/trey/.my_config.toml"
>>> path = Path(filename)
>>> path
PosixPath('/home/trey/.my_config.toml')
But you can also pass a Path object to Path:

>>> file_or_path = Path("example.txt")
>>> path = Path(file_or_path)
>>> path
PosixPath('example.txt')
So if you'd like your code to accept both strings and Path objects, you can normalize
everything to pathlib land by passing the given string/path to pathlib.Path.
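As a rough sketch of that normalization, a small helper can accept either kind of input and hand back a Path. The name as_path is hypothetical and not part of pathlib:

```python
from pathlib import Path


def as_path(file_or_path):
    """Return a Path whether given a string or an existing Path (hypothetical helper)."""
    return Path(file_or_path)


from_string = as_path("example.txt")
from_path = as_path(Path("example.txt"))

# Both inputs end up as equal Path objects
print(from_string == from_path)  # True
```

Code after the helper call can then rely on Path methods without checking the input type again.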

Note: The Path class returns an instance of either PosixPath or WindowsPath
depending on whether your code is running on Windows.

Joining paths
One of the most common path-related operations is to join path fragments together.

For example, we might want to join a directory path to a filename to get a full file path.

There are a few different ways to join paths with pathlib.

We'll look at each, by using this Path object which represents our home directory:

>>> from pathlib import Path
>>> home = Path.home()
>>> home
PosixPath('/home/trey')
The joinpath method

You can join paths together using the joinpath method:

>>> home.joinpath(".my_config.toml")
PosixPath('/home/trey/.my_config.toml')
The / operator

Path objects also override the / operator to join paths:

>>> home / ".my_config.toml"
PosixPath('/home/trey/.my_config.toml')
The Path initializer

Passing multiple arguments to the Path class will also join those paths together:

>>> Path(home, ".my_config.toml")
PosixPath('/home/trey/.my_config.toml')
This works for both strings and Path objects. So if the object I'm working with could be
either a Path or a string, I'll often join by passing all objects into Path instead:

>>> config_location = "/home/trey"
>>> config_path = Path(config_location, ".my_config.toml")
>>> config_path
PosixPath('/home/trey/.my_config.toml')
I prefer the / operator over joinpath

Personally, I really appreciate the / operator overloading. I pretty much never
use joinpath, as I find using / makes for some pretty readable code (once you get
used to it):

>>> BASE_PATH = Path.cwd()
>>> BASE_PATH / "templates"
PosixPath('/home/trey/proj/templates')
If you find the overloading of the / operator odd, stick with the joinpath method.
Which you use is a matter of personal preference.
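To confirm that the three joining styles really are interchangeable, here is a short sketch. It uses PurePosixPath (my choice, not something the examples above use) so the printed output is the same on any operating system:

```python
from pathlib import PurePosixPath

home = PurePosixPath("/home/trey")

joined_with_method = home.joinpath(".my_config.toml")
joined_with_operator = home / ".my_config.toml"
joined_with_initializer = PurePosixPath(home, ".my_config.toml")

# All three approaches produce equal path objects
print(joined_with_method == joined_with_operator == joined_with_initializer)  # True

# The / operator chains naturally for several segments
templates = home / "proj" / "templates"
print(templates)  # /home/trey/proj/templates
```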

Current working directory


Need to get the current working directory? Path() and Path('.') both work:

>>> Path()
PosixPath('.')
>>> Path(".")
PosixPath('.')
However, I prefer Path.cwd(), which is a bit more explicit and it returns an absolute
path:

>>> Path.cwd()
PosixPath('/home/trey')
Absolute paths
Many questions you might ask of a path (like getting its directory) require absolute
paths. You can make your path absolute by calling the resolve() method:

>>> path = Path("example.txt")
>>> full_path = path.resolve()
>>> full_path
PosixPath('/home/trey/example.txt')
There's also an absolute method, but it doesn't transform .. parts into references to
the parent directory or resolve symbolic links:

>>> config_up_one_dir = Path("../.editorconfig")
>>> config_up_one_dir.absolute()
PosixPath('/home/trey/../.editorconfig')
>>> config_up_one_dir.resolve()
PosixPath('/home/trey/.editorconfig')
Most of the time you find yourself using absolute(),
you probably want resolve() instead.
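The difference is easy to check without depending on any particular directory layout. This sketch only inspects the parts of each result, so it should behave the same wherever it runs:

```python
from pathlib import Path

relative = Path("..") / ".editorconfig"

absolute_path = relative.absolute()  # prepends the current directory, keeps ".."
resolved_path = relative.resolve()   # also collapses ".." and follows symlinks

print(".." in absolute_path.parts)  # True
print(".." in resolved_path.parts)  # False
print(resolved_path.is_absolute())  # True
```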

Splitting up paths with pathlib


We commonly need to split up a filepath into parts.

>>> full_path
PosixPath('/home/trey/example.txt')
Need just the filename for your filepath? Use the name attribute:
>>> full_path.name
'example.txt'
Need to get the directory a file is in? Use the parent attribute:

>>> full_path.parent
PosixPath('/home/trey')
Need a file extension? Use the suffix attribute:

>>> full_path.suffix
'.txt'
Need the part of a filename that doesn't include the extension? There's a stem attribute:

>>> full_path.stem
'example'
But if you're using stem to change the extension, use the with_suffix method instead:

>>> full_path.with_suffix(".md")
PosixPath('/home/trey/example.md')
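Here is a small sketch that gathers those attributes in one place. The split_path helper is hypothetical, and PurePosixPath is used only so the output is identical on every platform:

```python
from pathlib import PurePosixPath


def split_path(path):
    """Break a path into its commonly needed pieces (hypothetical helper)."""
    path = PurePosixPath(path)
    return {
        "directory": path.parent,
        "name": path.name,
        "stem": path.stem,
        "suffix": path.suffix,
    }


pieces = split_path("/home/trey/example.txt")
print(pieces["directory"])  # /home/trey
print(pieces["name"])       # example.txt
print(pieces["stem"])       # example
print(pieces["suffix"])     # .txt

# Only the last extension counts as the suffix
archive = PurePosixPath("/home/trey/archive.tar.gz")
print(archive.suffix)  # .gz
print(archive.stem)    # archive.tar
```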
Listing files in a directory
Need to list the files in a directory? Use the iterdir method, which returns a
lazy iterator:

>>> project = Path('/home/trey/proj')
>>> for path in project.iterdir():
...     if path.is_file():
...         # All files (not directories) in this directory
...         print(path)
...
/home/trey/proj/readme.md
/home/trey/proj/app.py
Need to find files in a directory that match a particular pattern (e.g. all .py files)? Use
the glob method:

>>> taxes = Path('/home/trey/Documents/taxes')
>>> for path in taxes.glob("*.py"):
...     print(path)
...
/home/trey/Documents/taxes/reconcile.py
/home/trey/Documents/taxes/quarters.py
Need to look for files in deeply-nested subdirectories as well?
Use rglob to recursively find files/directories that match a pattern:

>>> for path in taxes.rglob("*.csv"):
...     print(path)
...
/home/trey/Documents/taxes/2030/raw/bank_dividends.csv
/home/trey/Documents/taxes/2031/raw/bank_dividends.csv
/home/trey/Documents/taxes/2031/raw/stocks.csv
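To try iterdir, glob, and rglob side by side without touching real files, here is a sketch that builds a throwaway directory tree. The file names are made up for illustration:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "raw").mkdir()
    (root / "app.py").write_text("print('hi')\n")
    (root / "notes.md").write_text("# Notes\n")
    (root / "raw" / "data.py").write_text("DATA = []\n")

    # iterdir: everything directly inside root (filtered to files here)
    top_level_files = sorted(p.name for p in root.iterdir() if p.is_file())
    # glob: pattern match in root only
    top_level_py = sorted(p.name for p in root.glob("*.py"))
    # rglob: pattern match in root and every subdirectory
    all_py = sorted(p.relative_to(root).as_posix() for p in root.rglob("*.py"))

print(top_level_files)  # ['app.py', 'notes.md']
print(top_level_py)     # ['app.py']
print(all_py)           # ['app.py', 'raw/data.py']
```

The results are sorted because none of these methods promise a particular order.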
Want to look at every single file in a directory and all of its subdirectories?
The walk method (which mirrors the use of os.walk and requires Python 3.12 or
later) works for that. The file names it yields are plain strings, so wrap each
one in Path before asking for its stem:

>>> for path, sub_directories, files in taxes.walk():
...     if any(Path(p).stem.lower() == "readme" for p in files):
...         print("Has readme file:", path.relative_to(taxes))
...
Has readme file: organizer
Has readme file: archived/2017/summarizer
Navigating up a file tree, instead of down? Use the parents attribute of
your pathlib.Path object:

>>> for directory in taxes.parents:
...     print("Possible .editorconfig file:", directory / ".editorconfig")
...
Possible .editorconfig file: /home/trey/Documents/.editorconfig
Possible .editorconfig file: /home/trey/.editorconfig
Possible .editorconfig file: /home/.editorconfig
Possible .editorconfig file: /.editorconfig
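Building on that loop, here is a sketch of a search that stops at the nearest match. The find_upwards function is hypothetical, and unlike the loop above it also checks the starting directory itself:

```python
import tempfile
from pathlib import Path


def find_upwards(start, filename):
    """Return the nearest `filename` in `start` or an ancestor, else None (hypothetical helper)."""
    start = Path(start).resolve()
    for directory in [start, *start.parents]:
        candidate = directory / filename
        if candidate.is_file():
            return candidate
    return None


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp).resolve()
    nested = root / "taxes" / "2031"
    nested.mkdir(parents=True)
    (root / ".editorconfig").write_text("root = true\n")

    found = find_upwards(nested, ".editorconfig")
    found_in_root = found == root / ".editorconfig"

print(found_in_root)  # True
```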
Reading and writing a whole file
Need to read your whole file into a string?

You could open the file and use the read method:

with open(path) as file:
    contents = file.read()
But pathlib.Path objects also have a read_text method that makes this common
operation even easier:

contents = path.read_text()
For writing the entire contents of a file, there's also a write_text method:

path.write_text("The new contents of the file.\n")
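Here is a round trip through both methods using a temporary file. Passing encoding="utf-8" explicitly is my own addition rather than something the examples above do, but it keeps the result independent of the platform default:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "notes.txt"

    # write_text returns the number of characters written
    written = path.write_text("Line 1\nLine 2\n", encoding="utf-8")
    contents = path.read_text(encoding="utf-8")

print(written)                # 14
print(contents.splitlines())  # ['Line 1', 'Line 2']
```

Both methods open and close the file for you, so no with block is needed.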


Many common operations are even easier
The pathlib module makes so many common path-related operations both easier to
discover and easier to read.

Paths relative to other paths


Want to see your path relative to a specific directory?

Use the relative_to method:

>>> BASE_PATH = Path.cwd()
>>> home_path = Path("/home/trey/my_project/templates/home.html")
>>> home_path
PosixPath('/home/trey/my_project/templates/home.html')
>>> print(home_path.relative_to(BASE_PATH))
templates/home.html
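One thing worth knowing is that relative_to raises a ValueError when the path isn't inside the given directory. This sketch shows one way to handle that; relative_or_none is a hypothetical helper, and the paths are pure paths so nothing needs to exist on disk:

```python
from pathlib import PurePosixPath

base = PurePosixPath("/home/trey/my_project")
template = base / "templates" / "home.html"
outside = PurePosixPath("/etc/hosts")


def relative_or_none(path, ancestor):
    """Return path relative to ancestor, or None if it isn't inside it (hypothetical helper)."""
    try:
        return path.relative_to(ancestor)
    except ValueError:
        return None


print(relative_or_none(template, base))  # templates/home.html
print(relative_or_none(outside, base))   # None

# is_relative_to (Python 3.9+) answers the yes/no question directly
print(template.is_relative_to(base))  # True
print(outside.is_relative_to(base))   # False
```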
Checking for file/directory existence

Need to see if a file/directory exists? There's an exists method, but
the is_file and is_dir methods are more explicit, so they're usually preferable.

>>> templates = Path.cwd() / "templates"
>>> templates.exists()
False
>>> templates.is_dir()
False
>>> templates.is_file()
False
Ensuring directory existence

Need to make a new directory if it doesn't already exist? Use the mkdir method
with exist_ok set to True:

>>> templates.mkdir(exist_ok=True)
>>> templates.is_dir()
True
Need to automatically create parent directories of a newly created directory? Pass
the parents argument to mkdir:

>>> css_directory = Path.cwd() / "static/css"
>>> css_directory.mkdir(exist_ok=True, parents=True)
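To see what each argument buys you, here is a sketch that runs inside a temporary directory. It shows mkdir failing when the parent is missing and then succeeding (repeatedly) with both arguments set:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    css_directory = Path(tmp) / "static" / "css"

    # Without parents=True, a missing "static" directory is an error
    try:
        css_directory.mkdir()
    except FileNotFoundError:
        missing_parent = True
    else:
        missing_parent = False

    # With both arguments, the call is safe to repeat
    css_directory.mkdir(parents=True, exist_ok=True)
    css_directory.mkdir(parents=True, exist_ok=True)
    created = css_directory.is_dir()

print(missing_parent)  # True
print(created)         # True
```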
The home directory

Need to check a config file in your home directory? Use Path.home().

>>> user_gitconfig = Path.home() / ".gitconfig"
>>> user_gitconfig
PosixPath('/home/trey/.gitconfig')
Has the user passed in a path that might have a ~ in it? Call expanduser on it!

>>> path = Path(input("Enter path to new config file: "))
Enter path to new config file: ~/.my_config
>>> path
PosixPath('~/.my_config')
>>> path.expanduser()
PosixPath('/home/trey/.my_config')
Directory of the current Python file

Here's a trick that's often seen in Django settings modules:

BASE_DIR = Path(__file__).resolve().parent.parent
This will set BASE_DIR to the directory just above the one containing the settings module.

Before pathlib, that line used to look like this:

BASE_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
I find the pathlib version much more readable.
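To make the two .parent steps concrete, here is a sketch with a made-up settings location standing in for __file__. The path is purely hypothetical:

```python
from pathlib import PurePosixPath

# Hypothetical location of a Django settings module
settings_file = PurePosixPath("/srv/site/config/settings.py")

settings_dir = settings_file.parent     # directory holding settings.py
base_dir = settings_file.parent.parent  # one level above that

print(settings_dir)  # /srv/site/config
print(base_dir)      # /srv/site

# parents[1] is another spelling of .parent.parent
print(base_dir == settings_file.parents[1])  # True
```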

No need to worry about normalizing paths


All file paths in pathlib land are stored using forward slashes. Why forward slashes?
Well, they work pretty universally on Windows, Linux, and Mac and they're easier to
write in Python (backslashes need to be escaped to use them in strings).

>>> documents1 = Path("C:/Users/Trey/Documents")  # preferred
>>> documents2 = Path(r"C:\Users\Trey\Documents")  # this works also though
>>> documents1
WindowsPath('C:/Users/Trey/Documents')
>>> documents2
WindowsPath('C:/Users/Trey/Documents')
When a file path is actually used, if you're on Windows those forward slashes will all be
converted to backslashes by default:

>>> print(documents1)
C:\Users\Trey\Documents
You might ask "why even care about / vs \ on Windows if / pretty much always
works"? Well, if you're mixing and matching \ and /, things can get weird... especially
if you're comparing two paths to see whether they're equal!

With the automatic normalization done by pathlib.Path objects, you'll never need to
worry about issues with mixing and matching / and \ on Windows.
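You can see that normalization from any operating system by using PureWindowsPath, which applies Windows path rules without touching the file system. A short sketch:

```python
from pathlib import PureWindowsPath

forward = PureWindowsPath("C:/Users/Trey/Documents")
backward = PureWindowsPath(r"C:\Users\Trey\Documents")
mixed = PureWindowsPath(r"C:\Users/Trey\Documents")

# Path objects compare equal regardless of which slashes were typed
print(forward == backward == mixed)  # True

# Plain strings do not
print("C:/Users/Trey/Documents" == r"C:\Users\Trey\Documents")  # False

print(str(mixed))        # C:\Users\Trey\Documents
print(mixed.as_posix())  # C:/Users/Trey/Documents
```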

Built-in cross-platform compatibility


Pretty much any sort of splitting or segment-modification operation you can imagine
with a file path is relatively simple with pathlib.Path objects. Using Path objects for
these operations also results in code that is self-descriptive and cross-platform
compatible.

Cross-platform compatible?

Why not just use the split and join string methods to split and join
with / or \ characters?

Because manually handling / and \ characters in paths correctly is a huge pain.

Compare this code:

directory = input("Enter the project directory: ")

# Normalize slashes and remove possible trailing /
directory = directory.replace("\\", "/")
directory = directory.removesuffix("/")

readme_filename = directory + "/readme.md"


To this pathlib.Path-using code:

from pathlib import Path

directory = input("Enter the project directory: ")
readme_path = Path(directory, "readme.md")
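Here is a sketch showing that the trailing-slash cleanup really does happen automatically. The readme_path_for helper and its path_class parameter are hypothetical; the pure path classes are used so both flavors can be tried on any machine:

```python
from pathlib import PurePosixPath, PureWindowsPath


def readme_path_for(directory, path_class=PurePosixPath):
    """Join directory and readme.md using the given path class (hypothetical helper)."""
    return path_class(directory, "readme.md")


# A trailing slash makes no difference
print(readme_path_for("proj/") == readme_path_for("proj"))  # True
print(readme_path_for("proj/"))  # proj/readme.md

# Windows-style input with a trailing backslash is handled the same way
windows_readme = readme_path_for("C:\\proj\\", PureWindowsPath)
print(windows_readme)  # C:\proj\readme.md
```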
If you don't like pathlib for whatever reason, at least use the various utilities in
Python's much older os.path module.

A pathlib conversion cheat sheet


Have the various os.path and os path-handling approaches in muscle memory?

Here's a pathlib cheat sheet, showing the new pathlib way and the old os.path, os,
or glob equivalent for many common operations.

Path-related task          pathlib approach           Old approach
Read all file contents     path.read_text()           open(path).read()
Get absolute file path     path.resolve()             os.path.abspath(path)
Get the filename           path.name                  os.path.basename(path)
Get parent directory       path.parent                os.path.dirname(path)
Get file extension         path.suffix                os.path.splitext(path)[1]
Get extension-less name    path.stem                  os.path.splitext(path)[0]
Ancestor-relative path     path.relative_to(parent)   os.path.relpath(path, parent)*
Verify path is a file      path.is_file()             os.path.isfile(path)
Verify path is directory   path.is_dir()              os.path.isdir(path)
Make directory & parents   path.mkdir(parents=True)   os.makedirs(path)
Get current directory      pathlib.Path.cwd()         os.getcwd()
Get home directory         pathlib.Path.home()        os.path.expanduser("~")
Find files by pattern      path.glob(pattern)         glob.iglob(pattern)
Find files recursively     path.rglob(pattern)        glob.iglob(pattern, recursive=True)
Normalize slashes          pathlib.Path(name)         os.path.normpath(name)
Join path parts            parent / name              os.path.join(parent, name)
Get file size              path.stat().st_size        os.path.getsize(path)
Walk the file tree         path.walk()                os.walk()
Rename file to new path    path.rename(target)        os.rename(path, target)
Remove file                path.unlink()              os.remove(path)

[*]: The relative_to method isn't identical
to os.path.relpath without walk_up=True, though I rarely find that I need that
functionality.

Note that a somewhat similar comparison table exists in the pathlib documentation.

What about things pathlib can't do?


There's no method on pathlib.Path objects for listing all subdirectories under a given
directory.

That doesn't mean you can't do this with pathlib. It just means you'll need to write
your own function, just as you would have before pathlib:

def subdirectories_of(path):
    return (
        subpath
        for subpath in path.iterdir()
        if subpath.is_dir()
    )
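If you need subdirectories at every depth rather than just the immediate ones, a similar helper can be built on rglob. This is a sketch; all_subdirectories_of is a hypothetical name and the directory layout is invented for the demonstration:

```python
import tempfile
from pathlib import Path


def all_subdirectories_of(path):
    """Yield every directory below `path`, at any depth (hypothetical helper)."""
    return (subpath for subpath in Path(path).rglob("*") if subpath.is_dir())


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "docs" / "api").mkdir(parents=True)
    (root / "src").mkdir()
    (root / "readme.md").write_text("# Project\n")

    nested_dirs = sorted(
        sub.relative_to(root).as_posix() for sub in all_subdirectories_of(root)
    )

print(nested_dirs)  # ['docs', 'docs/api', 'src']
```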
Working with pathlib is often easier than working with alternative Python tools.
Finding help on pathlib features is as simple as passing a pathlib.Path object to the
built-in help function.

Should strings ever represent file paths?


So pathlib seems pretty great... but do you ever need to use a string to represent a file
path?

Nope. Pretty much never.

Nearly every utility built into Python that accepts a file path will also accept
a pathlib.Path object.
For example the shutil library's copy and move functions
accept pathlib.Path objects:

>>> import shutil
>>> path = Path("readme.txt")
>>> new_path = path.with_suffix(".md")
>>> shutil.move(path, new_path)
Note: in Python 3.14, pathlib.Path objects will also gain copy and move methods, so
you won't even need to import the equivalent shutil functions!

Even subprocess.run will accept Path objects:

import subprocess
import sys

subprocess.run([sys.executable, Path("my_script.py")])
You can use pathlib everywhere.

If you really need a string representing your file path, pathlib.Path objects can be
converted to strings either using the str function or an f-string:

>>> str(path)
'readme.txt'
>>> f"The full path is {path.resolve()}"
'The full path is /home/trey/proj/readme.txt'
But I can't remember the last time I needed to explicitly convert
a pathlib.Path object to a string.

Note: Technically we're supposed to use os.fspath to convert Path objects to
strings if we don't trust that the given object is actually a Path. See this
discussion from Brett Cannon. If you're writing a third-party library that
accepts Path objects, use os.fspath(path) instead of str(path) so that any
path-like object will be accepted per PEP 519.
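Here is a sketch of why os.fspath is the safer conversion. The Upload class and describe function are hypothetical; the point is that os.fspath accepts strings, Path objects, and anything with an __fspath__ method, while rejecting objects that aren't paths at all:

```python
import os
from pathlib import Path


class Upload:
    """Hypothetical path-like class: defining __fspath__ is all it takes."""

    def __init__(self, location):
        self.location = location

    def __fspath__(self):
        return self.location


def describe(path_like):
    """Convert any path-like object to a string via os.fspath (hypothetical helper)."""
    filename = os.fspath(path_like)
    return f"Saving to {filename}"


print(describe("report.txt"))          # Saving to report.txt
print(describe(Path("report.txt")))    # Saving to report.txt
print(describe(Upload("report.txt")))  # Saving to report.txt

# str() happily converts non-paths; os.fspath refuses them
print(str(None))  # None
try:
    os.fspath(None)
except TypeError:
    rejects_none = True
else:
    rejects_none = False
print(rejects_none)  # True
```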

Unless I'm simply passing a single string to the built-in open function, I pretty much
always use pathlib when working with file paths.

Use pathlib for readable cross-platform code


Python's pathlib library makes working with file paths far less cumbersome than the
various os.path, glob, and os equivalents.

Python's pathlib.Path objects make it easy to write cross-platform path-handling
code that's also very readable.
I recommend using pathlib.Path objects anytime you need to perform manipulations
on a file path or anytime you need to ask questions of a file path.
