Use pipdeptree to Check Python Package Dependencies and Create a Clean requirements.txt

Learn a handy tool to check package dependencies in Python

Image by Mohamed_hassan on Pixabay

Writing a requirements.txt file from scratch for your application can be cumbersome because you need to work out which packages must be installed and what their dependencies are. Worse still, resolving conflicting dependencies is tedious, especially on legacy systems that use an older version of pip.

Luckily, pipdeptree can help. As the name suggests, pipdeptree is a tool that displays installed Python packages as a dependency tree. It can help resolve dependency issues and create a clean requirements.txt file for our application. In this post, we will introduce the basic usage of pipdeptree with a simple example.


Create a virtual environment

pipdeptree works for packages installed globally on your computer or in a virtual environment. In this post, we will only demonstrate it in a virtual environment, because application packages should usually be installed in one so that they don’t affect the libraries needed by the system.

We will use conda to create a virtual environment in this post. With conda, you can specify a Python version when an environment is created. Besides, you can activate a conda environment from anywhere and don’t have to remember the path to the environment folder. Therefore, it’s perfect for development work. However, in production, you may prefer to use venv as it’s more lightweight.

Let’s create a virtual environment called scrapy and specify the Python version to be 3.11:

conda create --name scrapy python=3.11
conda activate scrapy

conda will install a bunch of tools when creating the virtual environment, including pip.

If you run pip freeze in the environment now, nothing is printed because we haven’t installed any third-party packages ourselves yet. By default, pip freeze leaves out packaging tools such as pip, setuptools, and wheel, which conda installed automatically.
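The filtering described above can be approximated in a few lines of Python. This is only a sketch: the function name freeze_like and the SKIPPED set are assumptions made for this example, and it does not reproduce pip’s actual implementation (for instance, it ignores editable installs and direct URL references).

```python
from importlib.metadata import distributions

# Assumed set of packaging tools to hide, for illustration only.
SKIPPED = {"pip", "setuptools", "wheel", "distribute"}


def freeze_like(skip=SKIPPED):
    """Return 'name==version' lines for installed packages, minus the skipped tools."""
    lines = []
    for dist in distributions():
        name = dist.metadata.get("Name")
        if not name or name.lower() in skip:
            continue
        lines.append(f"{name}=={dist.version}")
    return sorted(lines, key=str.lower)


if __name__ == "__main__":
    for line in freeze_like():
        print(line)
```

In a fresh environment this should print little or nothing, and the list should grow once third-party packages are installed.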


Install a third-party package

Now let’s install a third-party Python package in our environment. We will install Scrapy, a popular Python framework for web scraping that has many dependencies.

pip install scrapy

You will see a very long list of packages installed besides Scrapy, all of which are dependencies of Scrapy.

If you run pip freeze now, you will see a long list of all these packages. Some inexperienced developers may put the whole list into a requirements.txt file with this command:

pip freeze > requirements.txt

However, this is bad practice. First, the list is long and difficult to read and review, and you will likely not know what some of the packages are for. More importantly, it makes the packages difficult to upgrade. For example, to upgrade Scrapy you would also have to update the pinned versions of all the packages Scrapy depends on, which is tedious and time-consuming.

Luckily, this problem can be solved by pipdeptree. Let’s first install it in our environment and then use it to solve our problem.

pip install pipdeptree

Use pipdeptree to check package dependencies

Let’s first use pipdeptree to print the dependency tree in our current virtual environment:

pipdeptree

pip==23.1.2
pipdeptree==2.10.0
Scrapy==2.9.0
├── cryptography [required: >=3.4.6, installed: 41.0.2]
│   └── cffi [required: >=1.12, installed: 1.15.1]
│       └── pycparser [required: Any, installed: 2.21]
......
├── Twisted [required: >=18.9.0, installed: 22.10.0]
│   ├── attrs [required: >=19.2.0, installed: 23.1.0]
│   ├── Automat [required: >=0.8.0, installed: 22.10.0]
│   │   ├── attrs [required: >=19.2.0, installed: 23.1.0]
│   │   └── six [required: Any, installed: 1.16.0]
│   ├── constantly [required: >=15.1, installed: 15.1.0]
│   ├── hyperlink [required: >=17.1.1, installed: 21.0.0]
│   │   └── idna [required: >=2.5, installed: 3.4]
│   ├── incremental [required: >=21.3.0, installed: 22.10.0]
│   ├── typing-extensions [required: >=3.6.5, installed: 4.7.1]
│   └── zope.interface [required: >=4.4.2, installed: 6.0]
│       └── setuptools [required: Any, installed: 67.8.0]
├── w3lib [required: >=1.17.0, installed: 2.1.1]
└── zope.interface [required: >=5.1.0, installed: 6.0]
    └── setuptools [required: Any, installed: 67.8.0]
wheel==0.38.4

This is a very big tree and only a small part of it is shown above.
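To make the tree format concrete, here is a minimal Python sketch that reads the direct dependencies of a top-level package from pipdeptree’s text output. The sample text is copied from the output above; the function name direct_dependencies and the regular expression are assumptions for this example and rely on the text layout shown in this post, which may differ in other pipdeptree versions.

```python
import re

# A shortened copy of the tree shown above.
SAMPLE = """\
Scrapy==2.9.0
├── cryptography [required: >=3.4.6, installed: 41.0.2]
│   └── cffi [required: >=1.12, installed: 1.15.1]
├── w3lib [required: >=1.17.0, installed: 2.1.1]
└── zope.interface [required: >=5.1.0, installed: 6.0]
    └── setuptools [required: Any, installed: 67.8.0]
"""

# Direct dependencies start at column 0 with a branch marker;
# deeper levels start with "│" or spaces and are therefore skipped.
DIRECT = re.compile(r"^[├└]── (\S+) \[required: (.+?), installed: (.+?)\]$")


def direct_dependencies(tree_text):
    """Return (name, required, installed) tuples for first-level dependencies."""
    found = []
    for line in tree_text.splitlines():
        match = DIRECT.match(line)
        if match:
            found.append(match.groups())
    return found


for name, required, installed in direct_dependencies(SAMPLE):
    print(f"{name}: requires {required}, installed {installed}")
```

Each entry pairs the version range the parent asks for with the version actually installed, which is the information to look at when tracking down a conflict.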

If you have multiple third-party libraries installed in your virtual environment, you can use the -p option to specify the package to check:

pipdeptree -p scrapy

You can also use the -r option to show the reverse dependency tree for a package, that is, which packages depend on the specified one. For example, let’s check which packages depend on the Twisted package:

pipdeptree -r -p Twisted

Twisted==22.10.0
└── Scrapy==2.9.0 [requires: Twisted>=18.9.0]

It clearly shows that Twisted is required by the Scrapy package. This function is especially useful when you want to prune a requirements.txt file that is not written properly and includes a long list of dependency packages.
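The idea behind a reverse lookup can be sketched with Python’s standard importlib.metadata module. This is not how pipdeptree is implemented; the helper names (normalize, requirement_name, reverse_dependencies) are assumptions for this example, and the requirement parsing is deliberately simplistic: it skips any requirement whose marker mentions an extra and ignores all other environment markers.

```python
import re
from importlib.metadata import distributions


def normalize(name):
    """Normalize a package name so 'zope.interface' and 'zope-interface' compare equal."""
    return re.sub(r"[-_.]+", "-", name).lower()


def requirement_name(requirement):
    """Extract the normalized package name from a requirement string."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requirement)
    return normalize(match.group(1)) if match else None


def reverse_dependencies(target):
    """Return the installed packages that declare a dependency on `target`."""
    target = normalize(target)
    dependents = set()
    for dist in distributions():
        name = dist.metadata.get("Name")
        if not name:
            continue
        for requirement in dist.requires or []:
            marker = requirement.partition(";")[2]
            if "extra" in marker:
                continue  # only needed when an optional extra is requested
            if requirement_name(requirement) == target:
                dependents.add(name)
    return sorted(dependents)


if __name__ == "__main__":
    print(reverse_dependencies("Twisted"))
```

In the environment built in this post, the result for Twisted would be expected to include Scrapy. A package with an empty result is a candidate top-level package.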


Use pipdeptree to create a clean requirements.txt

Finally, let’s use pipdeptree to create a clean requirements.txt that only includes the top-level packages and not their dependencies.

pipdeptree prints the full tree by default, so here we use grep to extract only the lines we need.

pipdeptree | grep -E '^\w+'

The regex pattern ^\w+ keeps every line that starts with a word character (a letter, digit, or underscore). This is a simple but effective filter because, as can be seen above, dependency lines in the output of pipdeptree always start with a tree-drawing character or whitespace, while top-level packages start with their name.
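For readers without grep (on Windows, for example), the same filter can be sketched in Python. The sample text is a shortened copy of the pipdeptree output shown earlier, and the function name top_level_lines is an assumption for this example.

```python
import re

# A shortened copy of the pipdeptree output shown earlier.
SAMPLE = """\
pip==23.1.2
pipdeptree==2.10.0
Scrapy==2.9.0
├── cryptography [required: >=3.4.6, installed: 41.0.2]
│   └── cffi [required: >=1.12, installed: 1.15.1]
└── zope.interface [required: >=5.1.0, installed: 6.0]
    └── setuptools [required: Any, installed: 67.8.0]
wheel==0.38.4
"""

# Same idea as grep -E '^\w+': keep lines whose first character is a word character.
TOP_LEVEL = re.compile(r"^\w")


def top_level_lines(tree_text):
    """Return only the lines that describe top-level packages."""
    return [line for line in tree_text.splitlines() if TOP_LEVEL.match(line)]


print(top_level_lines(SAMPLE))
# prints ['pip==23.1.2', 'pipdeptree==2.10.0', 'Scrapy==2.9.0', 'wheel==0.38.4']
```

The box-drawing characters and leading spaces are not word characters, so every nested line is dropped.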

We can store the output in a requirements.txt file:

pipdeptree | grep -E '^\w+' > requirements.txt

The requirements.txt file generated contains only this content:

pip==23.1.2
pipdeptree==2.10.0
Scrapy==2.9.0
wheel==0.38.4

This is a much cleaner requirements.txt file to read and maintain.


Caveats of pipdeptree

Note that pipdeptree is a great tool to have, but it’s not perfect. Sometimes the generated requirements.txt file needs some fine-tuning. For example, the wheel package is installed by conda automatically but is also included in this file. We can safely remove it from requirements.txt in this case as it’s not used by our application. requirements.txt should only contain the necessary top-level packages for our application.

Besides, when some of the third-party libraries you install depend on one another, the generated requirements.txt file may omit some of them, and you need to add them manually.

For example, when you install both SQLAlchemy and SQLAlchemy-Utils in your virtual environment, pipdeptree lists SQLAlchemy only as a dependency of SQLAlchemy-Utils, so the filter above omits it. However, you may want to include SQLAlchemy explicitly in your requirements.txt because it’s far more important to your application than SQLAlchemy-Utils, and it’s good to know exactly which version of SQLAlchemy your application uses.
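Both kinds of fine-tuning can be captured in a small post-processing step. The sketch below is one possible approach, not a pipdeptree feature; the function name finalize_requirements and its exclude and extra_pins parameters are hypothetical and exist only for this example.

```python
def package_name(line):
    """Return the lower-cased package name of a 'name==version' line."""
    return line.split("==")[0].strip().lower()


def finalize_requirements(top_level, exclude=(), extra_pins=()):
    """Drop tooling packages and add pins that the tree filter left out.

    top_level:  'name==version' lines produced by the grep filter
    exclude:    package names that the application does not need
    extra_pins: 'name==version' lines to list explicitly even though
                they are dependencies of another top-level package
    """
    excluded = {name.lower() for name in exclude}
    kept = [line for line in top_level if package_name(line) not in excluded]
    present = {package_name(line) for line in kept}
    for pin in extra_pins:
        if package_name(pin) not in present:
            kept.append(pin)
    return sorted(kept, key=str.lower)


top_level = ["pip==23.1.2", "pipdeptree==2.10.0", "Scrapy==2.9.0", "wheel==0.38.4"]
print(finalize_requirements(top_level, exclude=("pip", "pipdeptree", "wheel")))
# prints ['Scrapy==2.9.0']
```

For the SQLAlchemy case, you would pass the pin you want to keep through extra_pins, using the version actually installed in your environment.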


In this post, we have shown how to use pipdeptree to solve Python package dependency issues and create a clean requirements.txt file containing only top-level packages. Some common caveats are also pointed out so you can avoid them in your work. We have only covered the most common use cases of pipdeptree, which is sufficient in most situations. If your application needs more advanced settings, the official documentation can very likely help you out.

