I love Python. Its versatility has allowed me to build all sorts of things, from machine learning models to application databases. I also enjoy writing Python just for fun. It's my language of choice for Advent of Code and Project Euler.
But Python isn't known for its package management. Data scientists and engineers often spend hours debugging environment issues. "It works on my machine" is too common. There are many tools to help with this, such as virtualenv, conda, poetry, or pipenv. But the fragmentation means more confusion. Each team ends up with its own solution, making collaboration and onboarding even harder.
To improve the Python developer experience, Astral (the team behind the popular Python linter ruff) released uv in early 2024. uv is a comprehensive Python development tool that replaces pip, pip-tools, pipx, poetry, pyenv, twine, virtualenv, and more. Written in Rust, it consolidates Python tooling into a single, cohesive experience.
In today's one-day tour, we will look at a couple of aspects of uv: package and virtual environment management. And we will test and measure the uv speedup!
Prerequisites
You should have Python installed, and you should also have uv. If you are using a Mac, you can install uv via brew:
> brew install uv
You can find more installation options in the uv documentation.
Virtual Environment
Having helped many team members set up their Python environments, I think there are a few reasons why Python virtual environments trip up a lot of people:
- Hidden folder: Typically you'd install a virtual environment in a hidden folder (.venv). This is the same folder where the installed Python packages are stored. Because it is hidden, people often aren't aware that there is an actual place where files and packages live (and can be inspected). In JavaScript / Node, you see the installed packages in the node_modules folder front and center. In Python, .venv is hidden.
- Activation state: Unlike containerized environments, virtual environments need to be "activated" and "deactivated". It's easy to forget which environment is active, or whether you're in one at all. This leads to confusion about which packages, and which versions, have been installed.
- Python version: Within a virtual environment, the version of Python is the same one used to create the virtual environment. And if you need to get a different version of Python, you might install it via pyenv, brew, or python.org, in addition to the system Python that might have been pre-installed on your machine. This can become especially confusing when different projects need different Python versions, or when different packages have different Python version requirements.
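The last two points can be checked from inside Python itself. Here is a minimal sketch using only the standard library: in an environment created by venv, sys.prefix points at the environment, while sys.base_prefix points at the Python installation it was created from. The helper names in_virtual_environment and describe_environment are made up for illustration.

```python
import sys


def in_virtual_environment() -> bool:
    # Inside a venv, sys.prefix is the environment folder and
    # sys.base_prefix is the interpreter installation that created it.
    # Outside a venv, the two are identical.
    return sys.prefix != sys.base_prefix


def describe_environment() -> str:
    # sys.version_info[:3] is (major, minor, micro), e.g. (3, 10, 14).
    version = ".".join(str(part) for part in sys.version_info[:3])
    state = "inside" if in_virtual_environment() else "not inside"
    return (
        f"Python {version} at {sys.executable} "
        f"({state} a virtual environment, prefix: {sys.prefix})"
    )


if __name__ == "__main__":
    print(describe_environment())
```

Running this with and without an activated environment is a quick way to see which interpreter, and which package folder, you are actually using.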
uv Workflow
With uv, handling virtual environments is a lot less complex. Here is a summary of the differences.
As you can see, there are many steps involved in properly using a virtual environment. With uv, because we no longer have to manage the virtual environment state, we can get to work more quickly.
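Part of why activation becomes optional is that a tool can simply look for the environment itself. As I understand the uv documentation, it prefers an already activated environment and otherwise looks for a .venv folder in the current directory or a parent directory. The sketch below is a simplified illustration of that idea, not uv's actual implementation; the function name and the environ parameter are hypothetical.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_virtual_environment(
    start: Path, environ: Optional[Mapping[str, str]] = None
) -> Optional[Path]:
    """Return the environment an installer could target, or None.

    Simplified lookup order (an assumption for illustration):
    1. an activated environment, via the VIRTUAL_ENV variable
    2. a .venv folder in `start` or its nearest parent directory
    """
    environ = os.environ if environ is None else environ
    activated = environ.get("VIRTUAL_ENV")
    if activated:
        return Path(activated)

    start = start.resolve()
    for directory in (start, *start.parents):
        candidate = directory / ".venv"
        # A virtual environment contains a pyvenv.cfg file at its root.
        if (candidate / "pyvenv.cfg").is_file():
            return candidate
    return None


if __name__ == "__main__":
    print(find_virtual_environment(Path.cwd()))
```

Because the lookup walks up from the working directory, the same command works from any subfolder of the project without activating anything first.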
Faster Package Installation
Not having to manually manage the virtual environment means fewer steps. In addition, uv installs dependencies in parallel by default. So let's give it a try and test out the speedup!
It's very common to see a requirements.txt file when you work on an existing Python project. To recreate the development environment, you first need to install all the dependencies it specifies.
Package Installation with pip
So let's start from a requirements.txt, which contains common data science libraries.
And we will also set up a new virtual environment. We will use Python 3.10.
$ time /usr/local/bin/python3.10 -m venv .venv
/usr/local/bin/python3.10 -m venv .venv 3.69s user 0.97s system 89% cpu 5.190 total
The time command measures how long it took to create the virtual environment. Here, creating a virtual environment took 5.190 seconds (the "total" value).
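The summary line printed by time packs several numbers together. As a rough sketch, assuming the zsh-style format shown above (user CPU seconds, system CPU seconds, CPU utilisation, then the wall-clock "total"), the following helper pulls the fields apart and converts a total written as minutes:seconds into plain seconds. The names parse_time_line and to_seconds are made up for illustration.

```python
import re

# Matches the tail of a zsh-style summary such as
# "3.69s user 0.97s system 89% cpu 5.190 total".
TIME_LINE = re.compile(
    r"(?P<user>[\d.]+)s user (?P<system>[\d.]+)s system "
    r"(?P<cpu>\d+)% cpu (?P<total>[\d:.]+) total"
)


def to_seconds(clock: str) -> float:
    """Convert 'SS.ss', 'M:SS.ss' or 'H:MM:SS.ss' into seconds."""
    seconds = 0.0
    for part in clock.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def parse_time_line(line: str) -> dict:
    match = TIME_LINE.search(line)
    if match is None:
        raise ValueError(f"not a zsh-style time summary: {line!r}")
    return {
        "user": float(match["user"]),        # CPU seconds in user code
        "system": float(match["system"]),    # CPU seconds in the kernel
        "cpu_percent": int(match["cpu"]),    # CPU utilisation
        "total": to_seconds(match["total"]), # wall-clock seconds
    }


if __name__ == "__main__":
    line = (
        "/usr/local/bin/python3.10 -m venv .venv "
        "3.69s user 0.97s system 89% cpu 5.190 total"
    )
    print(parse_time_line(line))
```

The "total" field is the one to compare between tools, since it is the time you actually wait. The CPU percentage appears to be user plus system time divided by total, so a low value suggests the command spent most of its time waiting, for example on the network.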
Now we will activate the virtual environment and install the packages.
$ source ./.venv/bin/activate
(.venv) $ time pip install -r requirements.txt
pip install -r requirements.txt 52.19s user 22.10s system 17% cpu 7:01.24 total
The installation took 7 minutes. Let's now deactivate the virtual environment and delete it, together with the packages installed in it. We will then run the install again and see how caching speeds up the installation the second time around.
(.venv) $ deactivate
$ rm -rf ./.venv/
$ /usr/local/bin/python3.10 -m venv .venv
$ source ./.venv/bin/activate
(.venv) $ time pip install -r requirements.txt
pip install -r requirements.txt 60.31s user 24.21s system 26% cpu 5:13.53 total
In this second round of installation, we are automatically using pip's cache. You should see messages like Using cached seaborn-0.13.2-py3-none-any.whl (294 kB). And indeed this time it's faster, clocking in at around 5 minutes.
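The file name in that cache message follows the standard wheel naming convention: distribution, version, an optional build tag, Python tag, ABI tag, and platform tag. A small sketch, with hypothetical helper names, shows how to read it. It does no validation beyond counting the parts.

```python
from typing import NamedTuple, Optional


class WheelName(NamedTuple):
    distribution: str
    version: str
    build: Optional[str]
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelName:
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build = None
    elif len(parts) == 6:
        name, version, build, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError(f"unexpected number of parts in {filename!r}")
    return WheelName(name, version, build, python_tag, abi_tag, platform_tag)


def is_pure_python(wheel: WheelName) -> bool:
    # "none" ABI and "any" platform mean there is no compiled code inside.
    return wheel.abi_tag == "none" and wheel.platform_tag == "any"


if __name__ == "__main__":
    wheel = parse_wheel_filename("seaborn-0.13.2-py3-none-any.whl")
    print(wheel)
    print("pure Python:", is_pure_python(wheel))
```

Here py3-none-any tells us the seaborn wheel is pure Python, so the same cached file can be reused for any Python 3 environment on any platform.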
Package Installation with uv
Now we will repeat this experiment with uv. First we clean up the current virtual environment.
(.venv) $ deactivate
$ rm -rf ./.venv/
With uv, we will similarly set up a virtual environment and perform the pip install. Notice we no longer have to manage the activation and deactivation of the virtual environment.
$ time uv venv --python=/usr/local/bin/python3.10 .venv
uv venv --python=/usr/local/bin/python3.10 .venv 0.01s user 0.02s system 33% cpu 0.090 total
$ time uv pip install -r requirements.txt
uv pip install -r requirements.txt 17.39s user 18.70s system 12% cpu 4:39.27 total
The installation time is now shorter, at less than 5 minutes. You might also notice the very intuitive progress reporting.
Again, to test the effect of caching, we will delete the virtual environment along with the installed libraries, and run the installation again.
$ rm -rf .venv
$ uv venv --python=/usr/local/bin/python3.10 .venv
$ time uv pip install -r requirements.txt
uv pip install -r requirements.txt 0.21s user 1.21s system 85% cpu 1.654 total
Woah, that took just a couple of seconds!
Benchmark Comparison
Here is a summary of our experimentation.
Benchmark | pip | uv | speedup |
---|---|---|---|
Create virtual environment | 5.19s | 0.09s | 58x |
pip install (fresh) | 7m 1.24s | 4m 39.27s | 1.5x |
pip install (cached) | 5m 13.53s | 1.65s | 190x (!?) |
Of course, these values will depend a lot on, say, network speed and hardware specification. So you should give it a try!
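If you do rerun the experiment, the speedup column is simply the pip time divided by the uv time. Here is a small sketch using the wall-clock "total" values that time reported in the runs above; swap in your own measurements. The TIMINGS dictionary and the speedup function are names made up for illustration.

```python
# Wall-clock "total" values from the time output in this post, in seconds.
# Each entry is (pip_seconds, uv_seconds).
TIMINGS = {
    "Create virtual environment": (5.190, 0.090),
    "pip install (fresh)": (7 * 60 + 1.24, 4 * 60 + 39.27),
    "pip install (cached)": (5 * 60 + 13.53, 1.654),
}


def speedup(pip_seconds: float, uv_seconds: float) -> float:
    """How many times faster uv was than pip for the same step."""
    return pip_seconds / uv_seconds


if __name__ == "__main__":
    for name, (pip_s, uv_s) in TIMINGS.items():
        print(f"{name}: {speedup(pip_s, uv_s):.1f}x")
    # Create virtual environment: 57.7x
    # pip install (fresh): 1.5x
    # pip install (cached): 189.6x
```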
If you don't need to pip install packages from scratch constantly, then the gain of a few minutes might not be a big deal. However, many Python projects in production are also hooked up to Continuous Integration / Continuous Deployment workflows. For example, you should test your Python codebase with every change. Each such test run requires installing all of the project's dependencies, and as a result, the minutes spent running pip install quickly add up. So I find the speed of uv quite appealing. In addition, in preparing for this one-day tour, I also came to appreciate how intuitive uv's output and command-line interface are.
So give uv a try; sunscreen not required!