Python Tooling Guide (Evergreen)

Overview

Last updated July ’25.

Python has come a long way these last few years, and so have the main bits of tooling that help you scale a project, keeping velocity quick and bugs infrequent. Here, I outline the predominant options in major tooling categories.

Who am I? I’m a Software Engineer at Akkio, a data platform for marketing agencies. I jump all over the stack, but spend much of my time in Python land, focusing on tooling and efficiency improvements that help our other developers work quicker and with fewer bugs.

🌲

This page is evergreen. I come back and update it once or twice a year, and bump the date near the top when I do. Curl back every so often for updated recommendations!

Philosophy

I follow a few main guidelines when it comes to “good” tooling.

Correct, Reliable, and Mature. You don’t want to have to fight your tooling. Good tooling should work consistently and ideally require little configuration in order to achieve a stable and fairly optimal setup.

Performant. Tooling isn’t like CI, where you can generally go jump to something else for a few minutes while you wait. Every bit of time spent waiting on tooling tends to be a developer sitting idle. Accordingly, performance and quick feedback loops are a priority.

Great editor support. Working well with prominent editors matters. Being explicitly written as a language server (LSP) and/or having well-written per-editor docs are both big pluses.

The Categories

Package Manager

uv

uv is awesome and has become the default recommendation within the last ~6 months. The Astral team moves quickly and this is now the clear default for anything other than the most complex build processes. It’s extremely fast and works well for just about everything we’ve thrown at it.

Make sure you pin your minor version. It’s pre-1.0 and Astral regularly ships breaking changes in minor releases, which semver allows for 0.x versions.
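To make "pin your minor version" concrete, here is a minimal Python sketch of what a minor-version pin lets through. The version numbers and the satisfies_minor_pin helper are illustrative assumptions, not something uv ships; in practice you would express the same constraint in your project configuration or CI install step.

```python
def parse_version(version: str) -> tuple[int, int, int]:
    """Parse a plain 'MAJOR.MINOR.PATCH' string into a tuple of ints."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def satisfies_minor_pin(installed: str, pinned: str) -> bool:
    """Return True if `installed` shares MAJOR.MINOR with `pinned` and is not older."""
    installed_v = parse_version(installed)
    pinned_v = parse_version(pinned)
    same_minor = installed_v[:2] == pinned_v[:2]
    return same_minor and installed_v >= pinned_v


# Hypothetical versions, chosen only for illustration.
print(satisfies_minor_pin("0.7.9", "0.7.3"))  # patch bump within the pinned minor
print(satisfies_minor_pin("0.8.0", "0.7.3"))  # next minor, which may carry breaking changes
```

The first call is accepted and the second is rejected, which is the behavior you want from a pin on a pre-1.0 tool: patch releases flow in, minor bumps wait for a deliberate upgrade.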

I recommend using the excellent migrate-to-uv tool to get your existing configuration to uv. It’s dead simple to use and totally trivializes the migration process.

Alternatives & Notes

Poetry is also quite good and was my recommendation in the last iteration of this list. The areas where we’ve found it worse than uv:

Worse Performance. It’s not crippling, but it’s noticeable.
Worse PEP conformance. Poetry works decently but has occasional gaps around things like the dependency-groups key, where it instead requires a Poetry-specific section.
No build-time dependency specification. This may have changed recently, but during the setuptools fiasco a few months ago, lack of support meant we couldn’t unblock ourselves by constraining its version at build time.
Less feature velocity. Sure, this might just be because it’s fairly feature-complete with no huge issues (which I think devs unfairly frown upon), but especially with tools like uv really taking over the space I would be surprised if it gets more than just patches and minor features.

Otherwise, it’s an excellent package manager that’s simply worked well and generally just stayed out of our way.

pip, which is what many start with, works fine when combined with pip freeze > requirements.txt, but simply isn’t very ergonomic or scalable when it comes to precise dependency pinning and handling transitive dependencies. uv also offers a pip-compatible interface (uv pip) that works as a largely drop-in, much faster replacement for common pip commands. pip is fine for basic projects, but it doesn’t make sense for actually scaling a long-term project.

pipenv resolves most of the issues that baseline pip has and likely works well at scale, but it simply was not as intuitive for us as Poetry was. Poetry feels a lot more similar to other package managers like npm that many teams will already have a conceptual understanding of.

Linter

Ruff

Unsurprisingly, my foremost recommendation is Ruff. Despite still being pre-v1.0 (pin your minor version!), it’s mature enough for almost any project able to move to it, and the advantages are potent enough to offset most other things in my eyes.

Upsides

Speed. It’s as fast as they promise, and gets through hundreds of files in milliseconds.
Comprehensiveness. Ruff supports most prominent rules from most prominent rulesets.
Editor Tooling. We’ve never had real issues with Ruff. The VSCode extension works well.

Downsides

Limited customization ability. To my knowledge, you can’t do “custom” rules or things like regex matching directly in Ruff, at least at time of writing. We haven’t had enough of a use case to need to fall back to Pylint or something for that, but more complex linter setups may need to.
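If you do hit a one-off, project-specific rule, a small standalone script can sometimes cover the gap before you reach for a second linter. The sketch below is an assumption about how such a stopgap might look, not a Ruff feature; the find_violations helper, the sample source, and the banned hostname are all hypothetical.

```python
import re


def find_violations(source: str, pattern: str) -> list[tuple[int, str]]:
    """Return (line_number, stripped_line) pairs for every line matching `pattern`."""
    regex = re.compile(pattern)
    return [
        (number, line.strip())
        for number, line in enumerate(source.splitlines(), start=1)
        if regex.search(line)
    ]


# Hypothetical rule: flag a hard-coded staging hostname.
SAMPLE = """\
API_URL = "https://staging.example.com/v1"
TIMEOUT_SECONDS = 30
"""

for number, line in find_violations(SAMPLE, r"staging\.example\.com"):
    print(f"line {number}: {line}")
```

Something like this could run in CI next to Ruff. Once you have more than a handful of such rules, a linter with real plugin support is probably the better home for them.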

Alternatives & Notes

Ruff is honestly the clear leader. Pylint and/or Flake8 are other more traditional linters, though Ruff supports most of their rules natively, so there’s not much of a point.

Formatter

Enforcing a consistent code style is worthwhile — it gets you out of a lot of bikeshedding and cuts down on a lot of useless diff noise in PRs.

Ruff

My foremost recommendation is, again, Ruff. It supports a (mostly) Black-compatible formatter that is quick and works great with editors. If using Ruff as a linter as well, you may need to tweak some parameters, as documented in the Ruff docs.

I would recommend using Ruff’s isort integration as well. Import sorting eliminates a lot of import churn in PR diffs, letting reviewers get right to the meat of the PR. Ruff supports this both via editor configuration as well as via command line by enabling the "I" rule, doing a check with autofix enabled, then running a format, as outlined in the Ruff docs:

ruff check --select I --fix
ruff format
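If you would rather trigger those two commands from a script (say, a pre-commit hook or a task runner), a thin Python wrapper is enough. This is only a sketch: the function names and the "src" path are my own assumptions, and the only Ruff flags used are the ones shown above.

```python
import subprocess
from typing import Sequence


def build_commands(paths: Sequence[str]) -> list[list[str]]:
    """Build the two Ruff invocations: sort imports with autofix, then format."""
    targets = list(paths)
    return [
        ["ruff", "check", "--select", "I", "--fix", *targets],
        ["ruff", "format", *targets],
    ]


def sort_and_format(paths: Sequence[str]) -> None:
    """Run both commands in order, raising if either exits non-zero."""
    for command in build_commands(paths):
        subprocess.run(command, check=True)


# Show what would run for a hypothetical "src" directory without executing it.
for command in build_commands(["src"]):
    print(" ".join(command))
```

Keeping command construction separate from execution makes the order (sort imports first, then format) easy to check without needing Ruff installed.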

Alternatives & Notes

Black is the more canonical and also excellent option, but Ruff gets you an intentionally similar formatter (with a few documented deviations) that is strictly faster, and only needing one tool instead of two is also nice.

Type Checker

A type checker is the most important way to scale a Python app without constantly reintroducing bugs. It doesn’t help with logic errors, but it catches a great deal of other “stupid” error classes like type mismatches.
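As a small illustration of the kind of bug I mean, here is a sketch with hypothetical function names and data. The commented-out call is the sort of thing a type checker flags before the code ever runs.

```python
from typing import Optional


def find_email(users: dict[str, str], name: str) -> Optional[str]:
    """Look up a user's email; returns None when the user is unknown."""
    return users.get(name)


def email_domain(email: str) -> str:
    """Return the part of an email address after the '@'."""
    return email.split("@", 1)[1]


users = {"ada": "ada@example.com"}

# A type checker rejects the next line because find_email may return None:
#     email_domain(find_email(users, "bob"))
# Without the checker, that call would only fail at runtime with an AttributeError.

email = find_email(users, "ada")
if email is not None:  # narrowing Optional[str] to str satisfies the checker
    print(email_domain(email))
```

The explicit None check is what the checker pushes you towards, and it is exactly the branch that tends to get forgotten in untyped code.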

Pyright

Pyright ended up being the best of the bunch for us, primary reasons being:

Performance. It wasn’t an order-of-magnitude difference like Ruff against all the other linters, but it was noticeably faster than mypy, maybe by half.
Editor Integration. Pyright was built as a language server from the beginning, and it’s actually what underpins VSCode’s Python support, so its editor integration is really good.

Consider trying basedpyright, a fork with various improvements and support for directly baking in many proprietary features of Pylance. Compared to Pylance + Pyright, we’ve found basedpyright provides a better CLI experience and only a marginally worse VSCode experience. I can’t speak to non-VSCode editors, but it almost certainly works better there by principle of baking more into the language server itself.

Alternatives & Notes

The predominant other option is mypy, which has long been the canonical type checker for Python. It’s still a pretty good option.

However, it left a bit to be desired when it comes to things like:

Performance; the daemon (dmypy) is their intended solution for this, but I was never able to get it working smoothly with our project structure. This may have partially been a skill issue that more time would have resolved, but I ran it for a while, and one of my main points here is that you want your tooling to just work, so I see this as the fault of the tool itself too.
Editor Integration; extensions for both VSCode and PyCharm did not work consistently.

Astral’s ty and Facebook’s pyrefly aren’t worth using yet, but particularly with Astral’s proven ability to ship absurdly quickly, I have confidence that one of these (probably ty) will become the recommendation, likely around Q4 2025.

Editor

A good editor and a smooth setup go a long way towards keeping your devs productive.

VSCode/PyCharm

If you jump across both Python and TypeScript, VSCode is the clear move, as it works by far the smoothest with TypeScript and has a solid Python experience. If you run a monorepo, make sure you set it up as a multi-root workspace so that you can set different interpreters for different directories, which most extensions play well with.

For a pure Python experience, I’ve found PyCharm to be largely equivalent and another reasonable option.

Consider committing certain bits of editor configuration itself to the repository, for example in a .vscode/settings.json file. You should not be doing this for anything that may be personal preference, but should be doing it for things like automatically running a formatter. A good rule is to do it for anything that is 100% signal i.e. anything that would cause a failing CI suite if somebody doesn’t do it.

Alternatives & Notes

You can of course use something like Vim, though if you’re using Vim or a derivative you probably skipped this section outright.

Other editors I’ve tried — Zed was very cool, but had a noticeably worse Git experience and other small gaps I ended up missing over VSCode, and Cursor runs very stale versions of VSCode extensions like Pylance, which ended up being really annoying.

“AI” Tooling / LLMs

Yes, much of the space is just a solution in search of a problem. However, I do think a few things in the space deserve to be part of your toolset at this point in time.

Copilot or Equivalent

Fancy autocomplete (something like GitHub Copilot, though most editors tend to have their own flavor at this point) is generally worthwhile and helps a lot with boilerplate.

This usually also gets you an LLM chat interface directly in your editor, which I don’t generally find too useful. However, it can help you grok confusing code or get some direction when you “don’t really know what you don’t know”.

Code Review Service

A code review service is worthwhile, though it benefits from a little bit of custom instruction tuning.

As far as which one — Copilot is decent and has the upside of being built directly into GitHub, but I would instead recommend Diamond (Graphite’s code review bot), which we’ve found to be much higher signal. There are still false positives, but with a bit of custom instruction tuning it’s been quite useful for us and regularly catches bugs.

Alternatives / Future

Agentic tools like Claude Code are interesting and I think will eventually be a larger part of the dev workflow, but I don’t think they make sense quite yet. They aren’t reliable enough and require too much tuning to generally be useful.