Fantasy Python Dependency Management Wishes


Nothing in this article refers to real Python examples. This is a collection of thoughts on how I wish managing dependencies in Python worked, not how it actually does.


The problem of managing dependencies can be simplified by focusing on versions, rather than trying to containerize a snapshot of whichever versions were current when a project was built.


The overall idea is that scripts/packages request dependencies with versions, and pip provides them. Environments are protected by searching upwards from specific to more general environments.


Environments


There would be four levels of Python environments. From highest to lowest: System, Global, User, and Site.



Lower level environments can access higher level environments, but not the other way around.


              ┌─────────────────┐
              │      System     │
              └────────▲────────┘
                       │
              ┌────────┴────────┐
              │      Global     │
              └────▲─────────▲──┘
                   │         │
       ┌───────────┴───┐ ┌───┴───────────┐
       │     User      │ │     User      │
       │    (Alice)    │ │     (Bob)     │
       └─▲───────▲─────┘ └────▲────────▲─┘
         │       │            │        │
┌────────┴┐ ┌────┴─────┐ ┌────┴────┐ ┌─┴────────┐
│  Site   │ │   Site   │ │  Site   │ │   Site   │
│ (Apple) │ │ (Banana) │ │ (Apple) │ │ (Carrot) │
└─────────┘ └──────────┘ └─────────┘ └──────────┘

Some systems, such as Windows, may exclude the System environment. In this case, the other environment levels are handled the same way; Global becoming the top level does not change its behavior.
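
As a rough sketch of the lookup order, each environment could simply hold a reference to the next higher one. The `Environment` class and `search_chain` function are made up for illustration; nothing like them exists in Python or `pip`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Environment:
    """Hypothetical environment node; `parent` is the next higher level."""
    name: str
    parent: Optional["Environment"] = None


def search_chain(env: Environment) -> list[str]:
    """Return environment names from most specific to most general."""
    chain = []
    current: Optional[Environment] = env
    while current is not None:
        chain.append(current.name)
        current = current.parent
    return chain


# Mirror one branch of the diagram above.
system = Environment("System")
global_env = Environment("Global", parent=system)
alice = Environment("User (Alice)", parent=global_env)
apple = Environment("Site (Apple)", parent=alice)

print(search_chain(apple))
```

On a system without a System environment, Global would just have no parent and the walk would end there.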


Package Versioning


Packages would be installed using a `package_name/version` path inside the target environment.


/usr/lib/python/package_name/1.2.3/package_files_*

Version numbers would be required to follow semantic versioning (Semantic Versioning 2.0.0).


This allows multiple versions of the same dependency to co-exist without conflicts.
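
A minimal sketch of how versions and install paths could be validated, assuming plain `MAJOR.MINOR.PATCH` numbers only (pre-release and build metadata from the Semantic Versioning spec are left out). `parse_version` and `install_path` are hypothetical names.

```python
import re

# MAJOR.MINOR.PATCH with no leading zeros; pre-release tags are not handled here.
_SEMVER = re.compile(r"(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)")


def parse_version(text: str) -> tuple[int, int, int]:
    """Turn '1.2.3' into (1, 2, 3), or raise ValueError."""
    match = _SEMVER.fullmatch(text)
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {text!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return major, minor, patch


def install_path(env_root: str, package: str, version: str) -> str:
    """Build the package_name/version path inside an environment."""
    parse_version(version)  # reject anything that is not a valid version
    return f"{env_root}/{package}/{version}"


print(install_path("/usr/lib/python", "package_name", "1.2.3"))
```

Comparing the parsed tuples rather than the raw strings matters: as text, "1.10.0" sorts before "1.9.9".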


Installing with pip


Use `pip` to install things. By default, it would try to install the most recent version to the Global environment.


pip install package_x

Global would remain the default, rather than switching to User, so that behavior stays consistent with virtual environments. This is safe since Global does not conflict with System.


Specify the target environment with a flag.


pip install package_x --system
pip install package_x --global
pip install package_x --user
pip install package_x --site

Install a specific version.


pip install package_x==1.2.3

Install a less specific version.


pip install package_x>=1.2.3
pip install package_x==1.*

Whenever possible, the most recent version of a dependency is preferred.
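
Picking a version for the specifiers shown above could look something like this. It is only a sketch: `matches` and `pick_latest` are hypothetical, and only the `==` (with `*` wildcards) and `>=` forms are handled.

```python
from typing import Optional


def parse(version: str) -> tuple[int, ...]:
    """Turn '1.2.3' into (1, 2, 3) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def matches(version: str, spec: str) -> bool:
    """Check one version against a specifier; an empty specifier matches anything."""
    if not spec:
        return True
    if spec.startswith(">="):
        return parse(version) >= parse(spec[2:])
    if spec.startswith("=="):
        wanted = spec[2:].split(".")
        actual = version.split(".")
        # '*' accepts any value; only the parts the specifier states are compared.
        return all(w == "*" or w == a for w, a in zip(wanted, actual))
    raise ValueError(f"unsupported specifier: {spec!r}")


def pick_latest(available: list[str], spec: str = "") -> Optional[str]:
    """Return the most recent matching version, or None if nothing matches."""
    candidates = [v for v in available if matches(v, spec)]
    return max(candidates, key=parse, default=None)


available = ["1.2.3", "1.4.0", "1.10.2", "2.0.1"]
print(pick_latest(available))            # no specifier: newest overall
print(pick_latest(available, "==1.*"))   # newest 1.x release
```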


PEP 668 `externally-managed`


An `externally-managed` file lock from PEP 668 would prevent the `--system` flag from working. It would have no effect on `--global`, `--user`, and `--site`, since no protection would be necessary there.


To install to the System environment, the user would have to use their system package manager, such as `apt`.


Packages installed to the System environment would only use the System environment because it is at the top level. It would never search lower level environments for dependencies.
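
The check itself could be as small as looking for the marker file. PEP 668 names the file `EXTERNALLY-MANAGED`; the `check_install_allowed` function, the target strings, and passing the directory in as a parameter are all assumptions of this sketch.

```python
import tempfile
from pathlib import Path

MARKER = "EXTERNALLY-MANAGED"  # marker file name defined by PEP 668


def check_install_allowed(target: str, system_dir: Path) -> None:
    """Raise PermissionError if a System install is requested while the marker exists.

    `target` is a hypothetical string form of the flag: "system", "global",
    "user", or "site". Only "system" is ever blocked.
    """
    if target == "system" and (system_dir / MARKER).exists():
        raise PermissionError(
            "System environment is externally managed; use the system package manager"
        )


# Demo against a throwaway directory standing in for the System environment.
with tempfile.TemporaryDirectory() as tmp:
    fake_system = Path(tmp)
    (fake_system / MARKER).touch()
    check_install_allowed("global", fake_system)  # passes silently
    try:
        check_install_allowed("system", fake_system)
    except PermissionError as error:
        print(error)
```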


More on `--site`


The site environment flag would tell `pip` to treat the current directory as if there is a Python environment there to install to. If there isn't one, `pip` would create it.


For example, in your project dir, you could use `pip install package_x --site` to install it into the current directory. When importing, Python would search here first. But other projects without the same site environment won't be forced to use it.


Virtual Environments


Virtual environments would be used as they are now.


A virtual environment would have only a self-contained Global environment. There are no users, and it can't reach System. (Using `--user` and `--system` would fail.) You could add a site environment inside the virtual environment, but there's not much purpose to doing so since it is already contained.


From within a virtual environment, Python would only interact with its own environment. This is useful to determine a minimal list of dependencies for a project.


Importing packages


`import` would be updated to function similarly to `pkg_resources.require()`. It would allow specifying a target version or range of versions to search for importing. Preference would be for the most recent version found.


#!/usr/bin/env python3

import package_a                         # Any version
import package_b==1.2.3                  # Specific version
import package_c==1.2.*                  # Any subversion of 1.2
import package_d>=4.1.1                  # Minimum version 4.1.1
import package_e==1.2.*-1.7.4            # Any versions between 1.2 and 1.7.4
import package_f==1.2.3,==1.2.5,>=2.*.*  # One from 1.2.3, 1.2.5, or 2 or greater.

If a version is not specified, it would select the most recent version it can find. If the package cannot be found for the appropriate version, it would raise an error.


It does not matter what environment the dependency is installed to. As long as a dependency is versioned correctly, it should be identical across environments. `package_x/9.8.7` installed in the System environment would be identical to `package_x/9.8.7` installed in the User environment.


Python would not stop at the first valid match. It would check the entire path for all matches and take the most recent version that fits.
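
Here is a small sketch of that search, assuming each environment on the path reports the versions it holds for a package. `resolve` and its argument shapes are hypothetical; the `accepts` callable stands in for whatever version expression was written on the `import` line.

```python
from typing import Callable, Optional

Version = tuple[int, int, int]


def resolve(
    search_path: list[tuple[str, list[Version]]],
    accepts: Callable[[Version], bool],
) -> Optional[tuple[str, Version]]:
    """Scan every environment and return (environment, version) for the newest match.

    When two environments hold the same version, the one listed first
    (the most specific) is kept.
    """
    best: Optional[tuple[str, Version]] = None
    for env_name, versions in search_path:
        for version in versions:
            if accepts(version) and (best is None or version > best[1]):
                best = (env_name, version)
    return best


path = [
    ("Site", [(1, 2, 3)]),
    ("User", [(1, 2, 5)]),
    ("Global", [(1, 2, 5), (2, 0, 0)]),
]

# Roughly "==1.2.3,==1.2.5,>=2.*.*" from the example above.
print(resolve(path, lambda v: v in {(1, 2, 3), (1, 2, 5)} or v >= (2, 0, 0)))
```

Note that the Site copy of 1.2.3 is a valid match but loses to the newer 2.0.0 in Global, since the search does not stop at the first hit.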


Specifying versions for building a package


Version information for dependencies required by a package could be placed in `__init__.py`.


__requires__ = [
	"package_a==1.2.3",
	"package_b>=4.5.6"
]

At run time, Python would check for this list and use it to set the versions to search for.


This overrides the default version that `import` searches for. If `import` names a specific version, Python would still search for that version. It would be considered best practice to define package versions only in `__init__.py`.


Alternatively, use `requirements.txt` or `setup.cfg` or whatever else is cool this month.
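
For the `__init__.py` approach, the list could be read without running the package's code. This sketch uses the standard `ast` module; `read_requires` and `effective_spec` are hypothetical helpers, and the precedence rule is one reading of the description above.

```python
import ast
import re


def read_requires(source: str) -> list[str]:
    """Return the literal list assigned to `__requires__`, or [] if it is absent.

    The source is parsed, never executed.
    """
    for node in ast.parse(source).body:
        if isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name) and target.id == "__requires__":
                    return list(ast.literal_eval(node.value))
    return []


def effective_spec(package: str, import_spec: str, requires: list[str]) -> str:
    """An explicit version on `import` wins; otherwise fall back to `__requires__`."""
    if import_spec:
        return import_spec
    for entry in requires:
        match = re.fullmatch(r"(\w+)(.*)", entry)
        if match and match.group(1) == package:
            return match.group(2)
    return ""  # no constraint: any version


init_py = '__requires__ = [\n\t"package_a==1.2.3",\n\t"package_b>=4.5.6"\n]\n'
requires = read_requires(init_py)
print(effective_spec("package_a", "", requires))
print(effective_spec("package_a", "==2.0.0", requires))
```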


Installing package dependencies with `pip`


`pip` could determine which packages and versions its installed packages require by checking each package's `__init__.py`. (Note that `pip` would not check scripts' `import`s; that's a convenience feature for working in the User space.)


If downloaded from PyPI, the dependency information is included and `pip` could automatically fetch it, as it does now.


To install the requirements of a local project, use the shortcut `pip install . --user`, or `pip install -r --user`.


Register the directory with `pip register ./project_x` to let `pip` know it should check the location when handling dependencies in the future. (If the directory no longer exists, `pip` would remove it from its list to check.)


It is not necessary to keep a cache or ongoing record of which versions are installed, though one may be desirable for performance. Installed packages can be determined deterministically by listing the package directories in each install site, then the version directories inside each. Dependencies can also be determined deterministically by checking each installed package's `__init__.py`.


With these steps combined, `pip` can generate a list of what dependencies are required for the packages installed, and install them as necessary.
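
The directory scan described above needs nothing beyond the filesystem. A minimal sketch, assuming the `package_name/version` layout from earlier; `installed_versions` is a hypothetical helper, not a `pip` function.

```python
import tempfile
from pathlib import Path


def installed_versions(env_root: Path) -> dict[str, list[str]]:
    """Map each package directory under `env_root` to its version subdirectories.

    Version names are sorted as plain strings here; a real tool would
    sort them numerically.
    """
    found: dict[str, list[str]] = {}
    if not env_root.is_dir():
        return found
    for package_dir in sorted(env_root.iterdir()):
        if package_dir.is_dir():
            versions = sorted(p.name for p in package_dir.iterdir() if p.is_dir())
            if versions:
                found[package_dir.name] = versions
    return found


# Demo against a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for rel in ("package_a/1.0.0", "package_a/1.2.0", "package_b/0.1.0"):
        (root / rel).mkdir(parents=True)
    print(installed_versions(root))
```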


Environment sharing


A lower level environment can skip installing a package when it's already present in a higher level environment.


For example, if a package is present in the Global space, it does not need to install it to the User space, since Global can be accessed from User.


Installation can be forced by specifying `--user`, which ensures the package is installed to the User environment.
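
A sketch of that decision, assuming `pip` already knows what each environment holds. `needs_install`, the `chain` list, and the shape of `installed` are all hypothetical.

```python
def needs_install(
    package: str,
    version: str,
    chain: list[str],
    installed: dict[str, set[tuple[str, str]]],
    force: bool = False,
) -> bool:
    """Decide whether to install into the target environment, chain[0].

    `chain` lists environments from the target upwards, for example
    ["User", "Global", "System"]. With `force` (an explicit flag such as
    `--user`), only the target itself is checked.
    """
    visible = chain[:1] if force else chain
    return all((package, version) not in installed.get(env, set()) for env in visible)


installed = {"Global": {("package_x", "1.2.3")}}
chain = ["User", "Global", "System"]

print(needs_install("package_x", "1.2.3", chain, installed))              # already visible via Global
print(needs_install("package_x", "1.2.3", chain, installed, force=True))  # forced into User
```

The check only looks upwards. A package that exists in User would still need installing to Global, because Global cannot see User.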


Deciding what dependencies to keep


Compare the set of required dependencies against the list of installed dependencies. Anything that's in the required set but not the installed set needs to be installed. Anything in the installed set that's not in the required set can be safely removed.


To satisfy dependency ranges, only the most recent version would be kept.


Any packages manually installed would be marked as such and not automatically removed.
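
In code this is little more than set arithmetic. A sketch with a hypothetical `plan` function, where each entry is a `package/version` string:

```python
def plan(
    required: set[str],
    installed: set[str],
    manual: set[str],
) -> tuple[set[str], set[str]]:
    """Return (to_install, to_remove) for one environment."""
    to_install = required - installed
    # Manually installed packages are never removed automatically.
    to_remove = installed - required - manual
    return to_install, to_remove


required = {"package_a/1.0.0", "package_b/2.0.0"}
installed = {"package_b/2.0.0", "package_c/3.0.0", "package_d/0.1.0"}
manual = {"package_d/0.1.0"}

to_install, to_remove = plan(required, installed, manual)
print(sorted(to_install), sorted(to_remove))
```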


`pip` could include a few commands to perform sanity checks and ensure the above is up-to-date.


pip check package  # Ensure all dependencies are satisfied for package.
pip check          # Ensure all dependencies are satisfied.
pip clean          # Remove all unneeded dependencies.

By default, these would not manage the System environment. As before, the `--system` flag can be used to target it.


The most likely time a dependency may go missing without `pip` noticing is when the system package manager removes it from the System environment. These commands can repair that. Alternatively, consider always keeping an independent set of dependencies in Global so they are always available to `pip`. The solution depends on how independent `pip` should be from the system. Perhaps add hooks to `pip` for other package managers to interact with.


A package could be pinned to ensure it's not automatically managed.


pip pin package==1.2.3    # Pin a specific package version
pip unpin package==1.2.3  # Unpin a specific package version
pip list pins             # List all currently pinned packages

Pins would not prevent newer versions from installing. A pin would only ensure that a specific version stays available until it is unpinned. It would not be possible to pin a version range.


The difference between a pinned package and a manually installed package is important. A pinned package keeps a specific version. A manually installed package ensures there is always at least *a* version of it installed, but can automatically upgrade as needed.
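
To make the distinction concrete, here is a sketch of which versions of a single package would survive a cleanup. `versions_to_keep` is hypothetical, and it deliberately ignores versions that other packages still depend on.

```python
Version = tuple[int, int, int]


def versions_to_keep(
    installed: list[Version],
    pinned: set[Version],
    manually_installed: bool,
) -> set[Version]:
    """Return the versions protected by pins or by a manual install."""
    keep = pinned & set(installed)  # a pin protects exactly that version
    if manually_installed and installed:
        keep.add(max(installed))    # a manual install protects only the newest
    return keep


installed = [(1, 2, 3), (1, 4, 0), (2, 0, 0)]
print(sorted(versions_to_keep(installed, {(1, 2, 3)}, manually_installed=True)))
```

Here 1.2.3 stays because it is pinned, 2.0.0 stays because the package was manually installed, and 1.4.0 has nothing protecting it.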


Updating dependencies with `pip`


With `pip` tracking versions, it would be simple to decide if a dependency should be updated. If a newer version of a package is available, and it falls within the dependency range of at least one package or the package was manually installed, get it. Deleting the older version is a separate step that can be checked afterwards.
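
That rule could be sketched as a single predicate. `should_upgrade` is a hypothetical name, and each entry in `dependents` is a callable standing in for one package's dependency range.

```python
from typing import Callable

Version = tuple[int, int, int]


def should_upgrade(
    installed: Version,
    candidate: Version,
    dependents: list[Callable[[Version], bool]],
    manually_installed: bool,
) -> bool:
    """True if `candidate` is newer and wanted by a dependent range or a manual install."""
    if candidate <= installed:
        return False
    return manually_installed or any(accepts(candidate) for accepts in dependents)


def only_1x(version: Version) -> bool:
    """Stand-in for a dependent that requires ==1.*"""
    return version[0] == 1


print(should_upgrade((1, 2, 3), (1, 3, 0), [only_1x], manually_installed=False))
print(should_upgrade((1, 2, 3), (2, 0, 0), [only_1x], manually_installed=False))
```

The second call declines the upgrade: 2.0.0 is newer, but nothing that depends on the package accepts it and nobody installed it by hand.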


For convenience, `pip` could include a command to automate this. Environment flags could be used if it's desired to only affect a specific environment. By default, the System environment is excluded.


pip upgrade package       # Upgrade a specific package.
pip upgrade-all           # Upgrade all packages and dependencies across all environments except System.
pip upgrade-all --system  # Upgrade the System environment.
pip upgrade-all --user    # Upgrade just the User environment.

Caveats






