2 minute read

uv for managing python packages

My workflow is basically: python –> Stata –> latex/markdown

I use Stata for estimation and Python for everything before that. Mostly pandas… until recently when polars became mature enough to invest time in. But I’ll have to write about that separately. If you’re using Python you’ll need a package manager and a way to manage virtual environments.

You need this for two reasons:

  1. Package managers are how you download and keep track of python packages like pandas and polars. Mac OS and Linux ship with Python (often referred to as “system Python”), but this does not come with the packages you want to use for data analysis and you don’t want to dump a bunch of packages into this environment.

  2. Virtual environments keep track of what you are and are not using for a particular project. For example, if you wrote a paper in 2015 with pandas, the commands you used do not work on the pandas you would get if you downloaded it today. You’d like to keep that around, and potentially be able to share it with others if they want to replicate your work.

For a long time I’ve been using conda. conda is great at 1. But not so great at 2. Let’s say that I want to create a virtual environment with python 13 and pandas:

I would:

mkdir newproj
cd newproj 
brew install miniconda 
conda init 
conda create --name newproj python=3.13 pandas 
conda activate

and now I’m using this version of python and pandas. I can come back to it when ever I need to, and even if I’ve updated other versions of python, the version in this environment will not have changed. However, if I want to share this with someone else it’s a bit fiddly to make sure that yo’ve actually tracked the precise versions your code depends on. In particular, notice that the information about this virtual environment is not, by default, associated with the project I’m using it in. This is part of point two point 2 and uv solves this. All of the information about the packages you are using is saved in your project by default.

Lets say that you are going to start a project called newproj and you want to use the same tools as I used in the conda example. Here is what you would do with uv

mkdir newproj 
cd newproj 
brew install uv 
uv init 
uv add polars # this is how you 'add' packages
uv venv --python=3.13 
source .venv/bin/activate 
uv sync

Now all of the information about the packages you installed (including the version of pandas that you got but did not specify) are recorded in the newproj folder. So you can share this folder and someone using it would only need to:

cd newproj 
source .venv/bin/activate 
uv sync # TODO: confirm 

and they are using the exact same versions that you are. Also if they are already using UV and have the same versions in another project then they will not have to download a new version!

There are a few drawbacks, in these cases I fall back to conda:

  1. wrds requires scipy which has non-python dependencies and uv doesn’t handle that well… as in: fails to handle it. So I use conda to create an environment that has the wrds module.
  2. that’s it. There is no second one.