jen@unicornds.org

uv One Day Tour: A fast Python Package Manager

Have you ever spent hours wrestling with Python virtual environments or resolving dependency conflicts? Enter uv: a fast, Rust-based tool built to improve nearly every aspect of the Python development experience.

I love Python. Its versatility has allowed me to build everything from machine learning models to application databases. I also enjoy writing Python just for fun. It's my language of choice for Advent of Code and Project Euler.

But Python isn't known for its package management. Data scientists and engineers often spend hours debugging environment issues. "It works on my machine" is too common. There are many tools to help with this, such as virtualenv, conda, poetry, or pipenv. But the fragmentation means more confusion. Each team ends up with its own solution, making collaboration and onboarding even harder.

To improve the Python developer experience, Astral (the team behind the popular Python linter ruff) released uv in early 2024. uv is a comprehensive Python development tool that replaces pip, pip-tools, pipx, poetry, pyenv, twine, virtualenv, and more. Written in Rust, it consolidates Python tooling into a single, cohesive experience.

In today's one day tour, we will look at a couple of aspects of uv: package management and virtual environment management. And we will test and measure the uv speedup!

Prerequisites

You should have Python installed, and you should also have uv. If you are using a Mac, you can install uv via Homebrew:

$ brew install uv

You can find more installation options in the uv documentation.

Virtual Environment

Having helped many team members set up their Python environments, I think there are a few reasons why Python virtual environments trip up so many people:

  • Hidden folder: Typically you'd create a virtual environment in a hidden folder (.venv). This is the same folder where the installed Python packages are stored. Because it is hidden, people often aren't aware that there is an actual place where files and packages live (and can be inspected). In JavaScript / Node, you see the installed packages in the node_modules folder front and center. In Python, .venv is hidden.
  • Activation state: Unlike containerized environments, virtual environments need to be "activated" and "deactivated". It's easy to forget which environment is active, or whether you're in one at all. This leads to confusion about which packages have been installed, and at which versions.
  • Python version: Within a virtual environment, the version of Python is the same one used to create the virtual environment. And if you need to get a different version of Python, you might install it via pyenv, brew, or python.org, in addition to the system Python that might have been pre-installed on your machine. This can become especially confusing when different projects need different Python versions, or when different packages have different Python version requirements.
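
The activation-state and Python-version questions above can be answered from inside Python itself. Here is a minimal sketch that uses only the standard library and assumes the environment was created with venv (or a tool that behaves like it); the function name describe_environment is my own choice for illustration.

```python
import sys


def describe_environment():
    """Report which interpreter is running and whether it lives in a virtual environment."""
    # Inside a virtual environment, sys.prefix points at the environment folder
    # (for example .venv), while sys.base_prefix still points at the Python
    # installation that was used to create it. Outside one, they are equal.
    in_venv = sys.prefix != sys.base_prefix
    return {
        "executable": sys.executable,
        "version": ".".join(str(part) for part in sys.version_info[:3]),
        "in_venv": in_venv,
        "prefix": sys.prefix,
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Running this with and without an activated environment shows which interpreter and which package folder you are actually using.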

uv Workflow

With uv, handling virtual environments is a lot less complex. Here is a summary of the differences.

As you can see, there are many steps involved in using a virtual environment properly. With uv, because we no longer have to manage the virtual environment's state, we can get to work more quickly.

Faster Package Installation

Not having to manually manage the virtual environment means fewer steps. In addition, uv installs dependencies in parallel by default. So let's give it a try and test out the speedup!

It's very common to see a requirements.txt file when you work on an existing Python project. To recreate the development environment, you'd need to first install all the dependencies specified.

Package Installation with pip

So let's start from a requirements.txt, which contains common data science libraries:

gensim==4.3.3
jax==0.4.30
matplotlib==3.9.4
nltk==3.9.1
numpy==1.26.4
pandas==2.2.3
scikit-learn==1.6.0
scipy==1.13.1
seaborn==0.13.2
spacy==3.8.3
torch==2.2.2

requirements.txt
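
Each line of this file pins one package to an exact version with ==. As a rough sketch, the pins can be parsed and compared with what is installed in the current environment. The helper names parse_pins and check_installed are hypothetical, and the parser is deliberately simplified: it only understands exact name==version pins, not the full requirements file syntax.

```python
from importlib import metadata


def parse_pins(text):
    """Parse 'name==version' lines into a dict, skipping blank lines and comments."""
    pins = {}
    for raw_line in text.splitlines():
        # Simplification: treat everything after '#' as a comment.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        name, separator, version = line.partition("==")
        if not separator:
            raise ValueError(f"not an exact pin: {line!r}")
        pins[name.strip()] = version.strip()
    return pins


def check_installed(pins):
    """Return {name: (pinned, installed_or_None)} for every pin that does not match."""
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # the package is not installed at all
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches


if __name__ == "__main__":
    # Two pins copied from the file above; replace with the file's full text.
    sample = "numpy==1.26.4\npandas==2.2.3\n"
    print(check_installed(parse_pins(sample)))
```

An empty result means every pinned package is present at the pinned version in the interpreter that ran the script.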

And we will also set up a new virtual environment. We will use Python 3.10.

$ time /usr/local/bin/python3.10 -m venv .venv

/usr/local/bin/python3.10 -m venv .venv  3.69s user 0.97s system 89% cpu 5.190 total

The time command measures how long it takes to create the virtual environment. Here, creating it took 5.190 seconds (the total value, which is the elapsed wall-clock time).
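
In that output, user and system are CPU time spent in user mode and in the kernel, while total is the elapsed wall-clock time. To collect these numbers across runs, a line in this format can be parsed with a small script. This is a sketch that assumes the exact layout shown above (it looks like the shell's built-in time format, which is my assumption); parse_time_line and to_seconds are names I made up.

```python
import re

# Matches the tail of a line such as:
#   "... 3.69s user 0.97s system 89% cpu 5.190 total"
TIME_LINE = re.compile(
    r"(?P<user>[\d.]+)s user (?P<system>[\d.]+)s system "
    r"(?P<cpu>\d+)% cpu (?P<total>[\d:.]+) total$"
)


def to_seconds(clock):
    """Convert '5.190', '7:01.24' or '1:02:03.4' into seconds."""
    seconds = 0.0
    for part in clock.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def parse_time_line(line):
    """Extract user, system, CPU percentage and total (wall-clock) seconds."""
    match = TIME_LINE.search(line.strip())
    if match is None:
        raise ValueError(f"unrecognized time output: {line!r}")
    return {
        "user": float(match["user"]),
        "system": float(match["system"]),
        "cpu_percent": int(match["cpu"]),
        "total": to_seconds(match["total"]),
    }


if __name__ == "__main__":
    line = "/usr/local/bin/python3.10 -m venv .venv  3.69s user 0.97s system 89% cpu 5.190 total"
    print(parse_time_line(line))
```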

Now we will activate the virtual environment and install the packages.

$ source ./.venv/bin/activate
(.venv) $ time pip install -r requirements.txt

pip install -r requirements.txt  52.19s user 22.10s system 17% cpu 7:01.24 total

The installation took 7 minutes. Let's now deactivate the virtual environment and delete it, along with the packages installed in it. We will then run the install again and see how pip's cache speeds up the installation the second time around.

(.venv) $ deactivate
$ rm -rf ./.venv/
$ /usr/local/bin/python3.10 -m venv .venv
$ source ./.venv/bin/activate
(.venv) $ time pip install -r requirements.txt

pip install -r requirements.txt  60.31s user 24.21s system 26% cpu 5:13.53 total

In this second round of installation, pip automatically uses its cache. You should see messages like Using cached seaborn-0.13.2-py3-none-any.whl (294 kB). And indeed this time it's faster, clocking in at a little over 5 minutes.

Package Installation with uv

Now we will repeat this experiment with uv. First we clean up the current virtual environment.

(.venv) $ deactivate
$ rm -rf ./.venv/

With uv, we will similarly set up a virtual environment and run pip install through uv. Notice that we no longer have to activate or deactivate the virtual environment: uv pip install finds the .venv folder in the current directory on its own.

$ time uv venv --python=/usr/local/bin/python3.10 .venv

uv venv --python=/usr/local/bin/python3.10 .venv  0.01s user 0.02s system 33% cpu 0.090 total

$ time uv pip install -r requirements.txt

uv pip install -r requirements.txt  17.39s user 18.70s system 12% cpu 4:39.27 total

The installation time is now shorter, at about 4 minutes 40 seconds. You might also notice the very intuitive progress reporting.

Again, to test the effect of caching, we will delete the virtual environment with the installed libraries, and run the installation again.

$ rm -rf .venv
$ uv venv --python=/usr/local/bin/python3.10 .venv
$ time uv pip install -r requirements.txt

uv pip install -r requirements.txt  0.21s user 1.21s system 85% cpu 1.654 total

Woah, that took just a couple of seconds!

Benchmark Comparison

Here is a summary of our experiments.

| Benchmark | pip | uv | Speedup |
|---|---|---|---|
| Create virtual environment | 5.19s | 0.09s | 58x |
| pip install (fresh) | 7m 1.24s | 4m 39.27s | 1.5x |
| pip install (cached) | 5m 13.53s | 1.65s | 190x (!?) |
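
The speedup is the pip time divided by the uv time. The sketch below recomputes it from the total (wall-clock) values in the time outputs earlier in this post, so the virtual environment row compares 5.190s with 0.090s. The MEASUREMENTS dictionary and the speedup helper are just names I picked for illustration.

```python
# Wall-clock ("total") seconds copied from the time outputs in this post,
# stored as (pip_seconds, uv_seconds).
MEASUREMENTS = {
    "create virtual environment": (5.190, 0.090),
    "pip install (fresh)": (7 * 60 + 1.24, 4 * 60 + 39.27),
    "pip install (cached)": (5 * 60 + 13.53, 1.654),
}


def speedup(baseline_seconds, new_seconds):
    """How many times faster the new run is compared with the baseline."""
    return baseline_seconds / new_seconds


if __name__ == "__main__":
    for name, (pip_seconds, uv_seconds) in MEASUREMENTS.items():
        print(f"{name}: {speedup(pip_seconds, uv_seconds):.1f}x")
```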

Of course, these values will depend a lot on, say, network speed and hardware specification. So you should give it a try!
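
If you would rather script the measurement than read the time output by hand, here is one possible sketch using only the standard library. The helper time_command is hypothetical, and the demo times a harmless placeholder command so that it runs on any machine; swap in the commands from this post to reproduce the benchmark.

```python
import subprocess
import sys
import time


def time_command(command):
    """Run a command and return its wall-clock duration in seconds."""
    start = time.perf_counter()
    # check=True raises an error if the command fails, so a broken install
    # is not mistaken for a fast one. The command's output is discarded.
    subprocess.run(
        command,
        check=True,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return time.perf_counter() - start


if __name__ == "__main__":
    # Placeholder command. To time the install from this post, use for example
    # ["uv", "pip", "install", "-r", "requirements.txt"] instead.
    elapsed = time_command([sys.executable, "--version"])
    print(f"{elapsed:.3f}s")
```

Remember to delete .venv between runs, as in the steps above, if you want to compare fresh and cached installs.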

If you don't need to pip install packages from scratch constantly, then the gain of a few minutes might not be a big deal. However, many Python projects in production are also hooked up to Continuous Integration / Continuous Deployment workflows. For example, you should test your Python codebase with every change. Each such test run requires installing all of the project's dependencies, and as a result, the minutes spent running pip install quickly add up. So I find the speed of uv quite appealing. In addition, in preparation for this one day tour, I also came to appreciate how intuitive uv's output and command-line interface are.

So give uv a try; sunscreen not required!