Software management #
Dependencies #
As a developer, you frequently need to download, install and/or update:
- libraries used in your projects,
- core programming utilities, such as a compiler or interpreter for a given programming language, a package manager, etc.
- tools for software development: editor, debugger, etc.
These programs have their own dependencies, which in turn have their own dependencies, and so on. Moreover, two programs may depend on different versions of the same third program.
Dependency management is a frequent source of complications during software development. Dependency patterns that may occur include:
- co-dependencies: for instance, consider the following configuration:
  - Project $P$ depends on a certain version of Library $L_1$,
  - $P$ also depends on Library $L_2$, which depends on an older version of $L_1$ (and the two versions are not compatible).
  The build of project $P$ may fail because it can contain only one version of $L_1$.
- cyclic dependencies:
  - Library $L_1$ depends on a specific version of Library $L_2$, and
  - $L_2$ depends on a specific version of $L_1$.
  Upgrading one of these two libraries independently may prevent the other from running. But it may be possible to upgrade both at the same time.
- etc.
Note that in these two examples, dependencies may be direct or transitive. As a result, it can be very difficult to diagnose such problems.
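The two patterns above can be detected mechanically once the dependency graph is known. The following is a minimal sketch, assuming a hypothetical in-memory graph that maps each `(name, version)` pair to the pairs it requires; the names `GRAPH`, `CYCLIC`, `find_conflicts` and `find_cycle` are illustrative and do not come from any real package manager.

```python
from collections import defaultdict

# Hypothetical graph: (name, version) -> list of required (name, version).
# P requires L1 2.0 and L2 1.0, while L2 1.0 requires the older L1 1.0.
GRAPH = {
    ("P", "1.0"): [("L1", "2.0"), ("L2", "1.0")],
    ("L2", "1.0"): [("L1", "1.0")],
    ("L1", "2.0"): [],
    ("L1", "1.0"): [],
}

# Hypothetical cyclic case: L1 and L2 each require a specific version of the other.
CYCLIC = {
    ("L1", "1.0"): [("L2", "1.0")],
    ("L2", "1.0"): [("L1", "1.0")],
}


def transitive_requirements(graph, root):
    """Return every (name, version) reachable from root, directly or transitively."""
    seen = set()
    stack = [root]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(graph.get(node, []))
    return seen


def find_conflicts(graph, root):
    """Return {name: versions} for each library required in more than one version."""
    versions = defaultdict(set)
    for name, version in transitive_requirements(graph, root):
        versions[name].add(version)
    return {name: vs for name, vs in versions.items() if len(vs) > 1}


def find_cycle(graph, root):
    """Return one dependency cycle reachable from root as a list of nodes, or None."""
    path = []        # nodes on the current depth-first path, in order
    on_path = set()  # same nodes, for fast membership tests
    done = set()     # nodes whose dependencies were fully explored

    def visit(node):
        if node in on_path:
            # The cycle is the part of the path from the first visit of node.
            return path[path.index(node):] + [node]
        if node in done:
            return None
        path.append(node)
        on_path.add(node)
        for dep in graph.get(node, []):
            cycle = visit(dep)
            if cycle:
                return cycle
        path.pop()
        on_path.remove(node)
        done.add(node)
        return None

    return visit(root)


if __name__ == "__main__":
    print(find_conflicts(GRAPH, ("P", "1.0")))
    print(find_cycle(CYCLIC, ("L1", "1.0")))
```

Because both functions follow transitive edges, they report a problem even when the conflicting requirement is several levels away from the project itself, which is the case that is hard to diagnose by hand.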
The term dependency hell is sometimes used to refer to such situations.
Two (non-exclusive) broad approaches are commonly adopted to avoid such issues:
- automated dependency management (using a package manager), and
- self-containment: avoid shared libraries, each program shipping with its own copy of (some of) its dependencies.
Automated dependency management #
Definitions #
A package is a program together with some metadata. These metadata include the program’s name, version, release date, authors, licence and the names of its dependencies (together with their versions).
A software repository is a collection of packages that comply with the same metadata format. A software repository (more precisely, multiple copies of it) is generally hosted in the cloud.
A package manager automates the installation (as well as configuration, update and removal) of packages from a software repository (or several) to a user’s machine.
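These three definitions can be made concrete with a toy model. The sketch below is only an illustration under simplifying assumptions: the `Package` class, the `REPOSITORY` dictionary, the package names and the `install_order` function are all hypothetical, the repository holds a single version of each package, and the dependency graph is assumed to contain no cycle.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Package:
    """A program together with (a subset of) its metadata."""
    name: str
    version: str
    licence: str = "unknown"
    dependencies: tuple = ()  # names of the packages this one depends on


# A toy software repository: package name -> package (one version each).
REPOSITORY = {
    p.name: p
    for p in [
        Package("app", "1.0", dependencies=("http-client", "logger")),
        Package("http-client", "2.3", dependencies=("logger",)),
        Package("logger", "0.9"),
    ]
}


def install_order(repository, name, installed=None):
    """Return the packages to install for `name`, dependencies first.

    Assumes an acyclic dependency graph; a cycle would recurse forever.
    """
    if installed is None:
        installed = []
    package = repository[name]  # raises KeyError if the package is unknown
    for dep in package.dependencies:
        # Skip dependencies that an earlier branch already scheduled.
        if all(p.name != dep for p in installed):
            install_order(repository, dep, installed)
    if package not in installed:
        installed.append(package)
    return installed


if __name__ == "__main__":
    for package in install_order(REPOSITORY, "app"):
        print(f"installing {package.name} {package.version}")
```

A real package manager additionally has to choose among several available versions, download and verify archives, and run configuration steps; the sketch only shows how dependency metadata lets the tool work out what to install and in which order.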
Examples #
An operating system (OS) may use a package manager and software repositories. Notably, this is the preferred way of installing software on most Linux distributions. Widely used OS package managers include:
- apt for Debian and derivatives,
- dnf for Fedora and derivatives,
- pacman for Arch and derivatives,
- Homebrew for macOS,
- Chocolatey (since 2011) and Winget (since May 2021) for Windows.
Many programming languages have dedicated package managers. For instance:
- Maven for Java,
- npm for Node.js (JavaScript),
- pip for Python,
- NuGet for .NET (C#, F#, and Visual Basic),
- RubyGems for Ruby,
- CPAN for Perl,
- CRAN for R,
- etc.
Some applications also have dedicated software repositories for plugins (and plugin managers to handle these plugins). For instance:
- the VS Code Marketplace,
- CTAN for LaTeX,
- MELPA for Emacs,
- etc.
Other applications (such as zsh, vim or neovim) only have plugin managers, without a centralized software repository. These managers install plugins directly from hosts (e.g. GitHub repositories).
Usage #
Installing, updating and removing software via a package manager is highly recommended in most scenarios. In particular:
- dependencies of a package are also installed (or in some cases updated or removed) transitively,
- some package managers can install and manage several versions of the same package (when needed),
- the installation process often uses a default configuration and directory layout (environment variables, etc.) that facilitates interaction with other programs.
The installation (or update or removal) procedure is also significantly simpler, thus leaving less room for manual errors. As an illustration, here is the full procedure to install Maven with the apt package manager (on Debian and derivatives):
apt install maven
and similarly with Homebrew (on macOS):
brew install maven
or with Chocolatey (on Windows):
choco install maven
In comparison, the procedure to install Maven manually on Windows is more involved, and thus more likely to introduce errors (through inattention, or by following outdated instructions).
However, in some (rare) scenarios, a manual installation may be preferred, in particular when the latest version of a program is needed but is not yet available in the software repository.
Self-containment #
A variety of strategies can be adopted to build a program so that it runs in partial isolation from the rest of the system that it is deployed on (i.e. in its own environment, and with its own dependencies, that cannot be used by other programs).
For instance, a Node.js project often includes a copy of all the JavaScript libraries that it depends on (transitively). Further self-containment strategies may involve different levels of virtualization (e.g. via Docker).
This is one way to avoid shared dependencies. Other benefits are increased portability and ease of installation for end users. Drawbacks include an increased workload on the developer's side (e.g. for maintenance) and limited opportunities for integration with other programs.
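As one further illustration of per-project isolation, besides the Node.js and Docker cases mentioned above, Python's standard `venv` module can create an environment directory inside a project, so that libraries installed there are not shared with other projects. The sketch below is an assumption-laden example rather than a recommended setup: the function name `create_isolated_environment` and the `.venv` folder name are illustrative choices, and `with_pip=False` is used only to keep the example fast and offline.

```python
import tempfile
import venv
from pathlib import Path


def create_isolated_environment(project_dir):
    """Create a per-project environment under <project_dir>/.venv and return its path."""
    env_dir = Path(project_dir) / ".venv"
    # with_pip=False skips bootstrapping pip into the new environment.
    venv.create(env_dir, with_pip=False)
    return env_dir


if __name__ == "__main__":
    # A throwaway directory stands in for a real project folder.
    with tempfile.TemporaryDirectory() as project:
        env = create_isolated_environment(project)
        # pyvenv.cfg marks the directory as a self-contained environment.
        print((env / "pyvenv.cfg").is_file())
```

The trade-off described above is visible here: each project carries its own environment directory, which costs disk space and maintenance effort but keeps its dependencies out of reach of other programs.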
Self-containment in Java #
In Java, self-containment is usually less pronounced during development. Java libraries are typically managed via Maven:
- on a per-user basis (which can be viewed as a compromise between per-project and system-wide): each user has a hidden folder
<homeDir>/.m2/
that contains all Java libraries used in their projects, and
- allowing multiple versions of the same library to coexist.
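To see why several versions of a library can coexist in this folder, it helps to look at where Maven stores each jar. The sketch below computes that location, assuming Maven's default local repository layout (a `repository` subfolder of `.m2`, then the group ID split on dots, the artifact ID and the version). The function name `local_repository_path` is hypothetical, and `<homeDir>` is kept as a placeholder rather than resolved to a real home directory.

```python
from pathlib import PurePosixPath


def local_repository_path(group_id, artifact_id, version, home="<homeDir>"):
    """Return the jar location in the default Maven local repository layout."""
    return (
        PurePosixPath(home) / ".m2" / "repository"
        / group_id.replace(".", "/")  # org.apache.commons -> org/apache/commons
        / artifact_id
        / version
        / f"{artifact_id}-{version}.jar"
    )


if __name__ == "__main__":
    # Two versions of the same library end up in two distinct folders.
    for version in ("3.12.0", "3.14.0"):
        print(local_repository_path("org.apache.commons", "commons-lang3", version))
```

Since the version is part of the path, installing a new version of a library never overwrites an older one that another project of the same user still relies on.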
However, a Java application that targets non-developers can be released together with its Java dependencies, as a so-called über jar (a.k.a. fat jar).