Brett Porter

A Maven-friendly Pattern for Storing Dependencies in Version Control


I feel that I need to start this post with a disclaimer – this is for exceptional use cases only! In particular, take caution:

In any of these cases, it’ll result in a transitive closure that won’t be resolvable if the project is deployed into a repository without also copying the repository module around.

Over time, the question of why not to store dependencies in version control when using Maven has faded, as users have recognised the benefits of having a repository (and particularly a repository manager) in place to maintain their artifacts in a specialised way for reuse and reproducibility.

But what happens when that is not practical? Greg pointed out one such case in this issue, where you can’t get a repository installed yet but still need those dependencies.

A commonly used solution is to have a series of install:install-file executions that manually install the files into the local repository for the build, using either a shell script or a very convoluted POM. However, this breaks the Maven model of being able to build a multi-module project in one command (or makes it unnecessarily complicated to do so). An example of what results can be seen in the defunct NMaven incubator podling’s bootstrap build.
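To illustrate why the POM route gets convoluted, here is a rough sketch of a single install:install-file execution bound to the build. The lib/ path, the execution id and the choice of the validate phase are assumptions for illustration only; the configuration parameters are those of the maven-install-plugin’s install-file goal.

```xml
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-install-plugin</artifactId>
      <executions>
        <!-- One execution like this is needed for EVERY JAR to install -->
        <execution>
          <!-- Hypothetical id, chosen for this sketch -->
          <id>install-ancient-artifact</id>
          <phase>validate</phase>
          <goals>
            <goal>install-file</goal>
          </goals>
          <configuration>
            <!-- Hypothetical location of the JAR in version control -->
            <file>${basedir}/lib/ancient-artifact-3.1.0.2.jar</file>
            <groupId>ancient-artifact</groupId>
            <artifactId>ancient-artifact</artifactId>
            <version>3.1.0.2</version>
            <packaging>jar</packaging>
          </configuration>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
```

Every additional JAR repeats the whole execution block, and the coordinates are stated once here and again wherever the dependency is declared.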

A better solution is to create a file-based repository stored in version control. Using the native repository mechanism ensures that dependencies are treated consistently, even if they become available from a remote repository in the future. However, there can be further problems if you are in a multi-module project. Since it is not recommended to use relative paths that point outside the current project, having a single repository for a multi-module build could become complicated again.

The solution I proposed is to create a repository module specifically for housing these special dependencies. This can be included in the build like any other module, and simply needs to be listed first to guarantee that it is built before any other modules. Let’s see how this might look.

Say we have a parent pom.xml:


<project>
  <groupId>com.example</groupId>
  <artifactId>parent</artifactId>
  <version>1.0-SNAPSHOT</version>
  ...
  <modules>
    <module>repository</module>
    <module>modules</module>
  </modules>
</project>

The repository/pom.xml file would then be similar to this:


<project>
  <parent>
    <groupId>com.example</groupId>
    <artifactId>parent</artifactId>
    <version>1.0-SNAPSHOT</version>
  </parent>
  <artifactId>repository</artifactId>
  ...
  <dependencies>
    <dependency>
      <groupId>ancient-artifact</groupId>
      <artifactId>ancient-artifact</artifactId>
      <version>3.1.0.2</version>
    </dependency>
  </dependencies>
  <repositories>
    <repository>
      <id>local</id>
      <url>file:${basedir}/src/repository</url>
    </repository>
  </repositories>
</project>

Here we can see that the dependency (stored in repository/src/repository/ancient-artifact/ancient-artifact/3.1.0.2/ancient-artifact-3.1.0.2.jar) is contained entirely within the repository module, and is guaranteed to be installed when that project is built. Since it is listed first in the parent build, the local repository will be correctly populated before any other modules are built. What’s more, if the dependencies are already present locally, this module will be skipped over extremely quickly.
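The post only mentions the JAR, but in the standard repository layout a POM normally sits beside it in the same directory; without one, Maven typically warns that the POM is missing and treats the artifact as having no dependencies. A minimal sketch of what repository/src/repository/ancient-artifact/ancient-artifact/3.1.0.2/ancient-artifact-3.1.0.2.pom might contain, assuming the artifact is a plain JAR with no dependencies of its own:

```xml
<!-- Assumed minimal POM for the checked-in artifact; not taken from the original example -->
<project>
  <modelVersion>4.0.0</modelVersion>
  <groupId>ancient-artifact</groupId>
  <artifactId>ancient-artifact</artifactId>
  <version>3.1.0.2</version>
  <packaging>jar</packaging>
</project>
```

If the artifact does have dependencies, they would need to be declared here for the transitive resolution to work as it would from a remote repository.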

Note: if you happen to use this technique and also have a repository manager configured using Maven’s mirrorOf settings directive, make sure that you use <mirrorOf>external:*</mirrorOf> to ensure that the file-based repository requests are not passed to the repository manager.
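As a sketch, the relevant part of settings.xml might look like the following. The mirror id, name and URL are placeholders for your own repository manager; only the mirrorOf value comes from the note above.

```xml
<settings>
  <mirrors>
    <mirror>
      <!-- Placeholder id, name and URL for your repository manager -->
      <id>repository-manager</id>
      <name>Internal repository manager</name>
      <url>http://repo.example.com/repository/public</url>
      <!-- Mirrors everything except localhost and file-based repositories -->
      <mirrorOf>external:*</mirrorOf>
    </mirror>
  </mirrors>
</settings>
```

With a plain * instead, requests for the file: repository declared in the repository module would also be redirected to the mirror, and the checked-in artifacts would not be found.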

A more complete example can be seen in the project attached to MNG-3989.

Now, while this is likely to be of very limited use, I did find it an interesting exercise to illustrate how you can still map alternate workflows neatly into the Maven paradigm in a way that doesn’t compromise its artifact flow or build lifecycle. It also gives you a path forward to a different configuration, with a remote repository storing the artifacts, simply by removing the repository module.

This technique may also have additional utility for those with a policy-driven requirement to use their version control to store dependencies for reproducibility purposes, if they can afford the verbosity of restating all those dependencies in the repository POM.

Of course, I would still recommend using a remote repository wherever possible to manage these artifacts, and to treat its contents on the same level as your version control in your infrastructure if you intend to have a long-lived Maven installation. As noted at the beginning, this becomes a requirement if you want to publish your project as something others may depend on. However, if you need to store your dependencies in version control, this technique may help you at least honour Maven practices in doing so and perhaps reduce an initial adoption curve.
