Transitive Data Dependencies
Synopsis
This document explains how to publish data artifacts with dependencies and how to resolve those dependencies transitively in a consumer project.
Purpose
Maven’s default life cycle runs several Maven plugins that implement conventions for how to process dependencies of type jar. For example, the maven-compiler-plugin adds all jar dependencies to the compile classpath, so that compilation fails when the source code references classes or methods that do not exist.
However, non-jar dependencies are not handled automatically. This may give rise to the misconception that Maven cannot handle non-Java artifacts. In reality, non-jar dependencies are resolved and downloaded into the local repository as usual. It's just that there are no conventions for what to do with those dependencies afterwards.
For example, consider dependencies of type csv. Should they be copied to the build output directory? Do they need to go into e.g. the META-INF folder? Or maybe we need to process the csv files first?
Depending on your use case, any option is reasonable. So we need to provide our own rules about what to do with those artifacts.
Publishing Data Artifacts
You need two modules for this purpose:
- Create a publisher pom.xml that attaches the data files.
- Create an aggregator pom.xml that declares the coordinates of the published files as dependencies.
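A publisher module could look roughly like the following. This is a minimal sketch, not the setup used by the original project: it assumes the build-helper-maven-plugin and its attach-artifact goal for the attachment, and the artifactId gtfsbench-csv-1-publisher, the file path data/stops.csv and the classifier stops are hypothetical names chosen for illustration. The parent declaration is omitted.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <groupId>org.aksw.data.gtfsbench</groupId>
  <!-- Hypothetical artifactId, chosen for illustration -->
  <artifactId>gtfsbench-csv-1-publisher</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  <packaging>pom</packaging>

  <build>
    <plugins>
      <plugin>
        <groupId>org.codehaus.mojo</groupId>
        <artifactId>build-helper-maven-plugin</artifactId>
        <version>3.6.0</version>
        <executions>
          <execution>
            <id>attach-data-files</id>
            <phase>package</phase>
            <goals>
              <goal>attach-artifact</goal>
            </goals>
            <configuration>
              <artifacts>
                <!-- Hypothetical data file; it is installed and deployed
                     alongside the pom under the given type and classifier -->
                <artifact>
                  <file>${project.basedir}/data/stops.csv</file>
                  <type>csv</type>
                  <classifier>stops</classifier>
                </artifact>
              </artifacts>
            </configuration>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</project>
```

With this configuration, running the install or deploy phase would publish the file under the coordinates org.aksw.data.gtfsbench:gtfsbench-csv-1-publisher:csv:stops:0.0.1-SNAPSHOT. The classifier distinguishes several data files attached by the same module.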
As publisher and aggregator are coupled, you may want to create a parent pom.xml that declares them both as modules.
Note: It does not seem possible to combine publisher and aggregator into the same pom.xml, because dependencies are resolved before the files are attached, which causes the dependency resolution to fail.
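An aggregator module might then be sketched as follows. The artifactId gtfsbench-csv-1-deps matches the dependency used by the consumer in the next section. The referenced publisher artifactId gtfsbench-csv-1-publisher and the classifier stops are assumptions for illustration and must match whatever the publisher module actually attaches.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <groupId>org.aksw.data.gtfsbench</groupId>
  <artifactId>gtfsbench-csv-1-deps</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  <packaging>pom</packaging>

  <dependencies>
    <!-- One dependency per published data file.
         artifactId and classifier are hypothetical. -->
    <dependency>
      <groupId>org.aksw.data.gtfsbench</groupId>
      <artifactId>gtfsbench-csv-1-publisher</artifactId>
      <version>0.0.1-SNAPSHOT</version>
      <type>csv</type>
      <classifier>stops</classifier>
    </dependency>
  </dependencies>
</project>
```

A consumer that depends on this pom with type pom receives the csv artifacts as transitive dependencies.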
Copying Data Dependencies to a Directory
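We can leverage the maven-dependency-plugin:copy-dependencies goal in conjunction with the <includeTypes> configuration option to copy any (transitive!) dependency of a certain type to a specific directory. The following consumer pom.xml copies all csv dependencies to ${project.build.directory}/resources during the generate-resources phase: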
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <parent>
    <groupId>org.aksw.data.config</groupId>
    <artifactId>aksw-data-deployment</artifactId>
    <version>0.0.8</version>
    <relativePath></relativePath>
  </parent>
  <groupId>org.aksw.data.gtfsbench</groupId>
  <artifactId>gtfsbench-rdf-1-consumer</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  <packaging>pom</packaging>
  <properties>
    <data.workdir>${project.build.directory}/resources</data.workdir>
  </properties>
  <dependencies>
    <dependency>
      <groupId>org.aksw.data.gtfsbench</groupId>
      <artifactId>gtfsbench-csv-1-deps</artifactId>
      <version>0.0.1-SNAPSHOT</version>
      <type>pom</type>
    </dependency>
  </dependencies>
  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-dependency-plugin</artifactId>
        <version>3.6.1</version>
        <executions>
          <execution>
            <id>copy-resource-dependencies</id>
            <phase>generate-resources</phase>
            <goals>
              <goal>copy-dependencies</goal>
            </goals>
            <configuration>
              <outputDirectory>${data.workdir}</outputDirectory>
              <includeTypes>csv</includeTypes>
              <stripVersion>true</stripVersion>
            </configuration>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</project>