Easier maintenance of <version> in large projects

Motivation

If you develop a large project with Maven, you will typically split your code into modules organized in a large tree (the POM hierarchy). Maven is a great tool for developing such projects. However, there is one major drawback, which this topic addresses:

Every node in your POM hierarchy has a pom.xml with a groupId, an artifactId and a version. In addition, each pom.xml contains references to other POMs (or artifacts, depending on how you look at it):

The <parent> section points to the parent project.

The <dependency> sections point to the artifacts the module depends on.

Other sections, such as <plugin>, which are not addressed by this topic.

Now the problem concerns those references that point to POMs of your own project. For <groupId>, which is typically the same throughout the project, the solution is simply to use <groupId>${project.groupId}</groupId>, and your groupId will normally never change anyway. The <version>, on the other hand, typically points to the current local version of the referenced POM. This version changes often, which becomes a maintenance problem because the versions of your artifacts are spread all over the referencing POMs. I know that this gap in Maven is partly covered by the maven-release-plugin, but discussions have shown that this is NOT always a suitable workaround.

So, following the Maven principle of "convention over configuration", Maven should offer a way to express a project-internal reference that points to the current version of the referenced module, as defined in that module's pom.xml, without explicitly stating this version. The suggestion is to allow the version to be omitted in such cases. The version would then be a single point of information in the corresponding POM, while references could still explicitly specify a different (e.g. older) version.

Please note that a version can already be omitted if the dependency is versioned via a <dependencyManagement> section (e.g. in the parent POM). However, this does NOT really help here, because it still requires a lot of redundancy and maintenance overhead.

Two views on a POM

At this point we have to distinguish two different views that Maven has of a POM:

For the development of a project, Maven reads pom.xml files from the local disk.

For using artifacts that have already been deployed (or installed), Maven retrieves *.pom files from a repository (ultimately from the local repository).

An important requirement for a new Maven feature is that it is compatible with other Maven versions and therefore does NOT break existing builds in any way. The suggested feature is therefore planned to be visible only in the development view. To achieve this, Maven has to be changed so that it automatically adds omitted <version> tags to POMs that are installed in or deployed to a repository. To avoid mistakes or misuse of this feature, Maven could also refuse to process POMs with a missing <version> tag that were not read as pom.xml but retrieved from a repository (including the local repository).
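The rejection rule described above could be enforced with a check along the following lines. This is only a minimal sketch: the class `PomVersionCheck`, the enum `PomSource` and the method `checkReference` are hypothetical names chosen for illustration and are not existing Maven API.

```java
/** Hypothetical sketch: only POMs read as pom.xml from disk may omit a referenced <version>. */
class PomVersionCheck {

    /** Where the POM was read from (the two views described above). */
    enum PomSource { PROJECT_FILE, REPOSITORY }

    /**
     * Fails if a reference in a repository POM has no version.
     * A null or blank version stands for an omitted <version> tag.
     */
    static void checkReference(PomSource source, String reference, String version) {
        boolean omitted = version == null || version.trim().isEmpty();
        if (omitted && source == PomSource.REPOSITORY) {
            throw new IllegalStateException(
                "Missing <version> for " + reference + " in a POM retrieved from a repository");
        }
        // In the development view nothing happens here: the omitted version
        // is expected to be completed later from the project itself.
    }
}
```

A call such as `PomVersionCheck.checkReference(PomVersionCheck.PomSource.PROJECT_FILE, "org.example:core", null)` passes, while the same call with `PomSource.REPOSITORY` throws.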

The solution in detail

When you call Maven from your top-level project (where the root pom.xml is), it scans the entire project tree and adds all nodes to the reactor. With this complete reactor, Maven can determine the version of each module from the corresponding pom.xml that has been parsed. So when an individual module is built (i.e. the actual goals are invoked), Maven can logically complete all omitted <version> tags for modules available in the reactor. (This means that parsing, processing and validating a pom.xml can no longer be done in one single step.) Once Maven has logically completed the POM, it can run the build as if the missing tags had been there from the start (for install/deploy see below).

But what if Maven is invoked on a sub-tree of the project? Here we have to distinguish two cases:

A <version> that was omitted in a <parent> section is easy to resolve. Maven follows the <relativePath> (../pom.xml by default) and looks for the parent POM. If it is there, Maven can read it and take its version (this lookup already exists today, except that the version found is currently matched against the one specified in <parent>). If it is NOT there, Maven will simply fail with a message like "Parent POM not found at <relativePath>! You have to specify the version of the parent if it is not locally available".

A <version> that was omitted in a <dependency> section can only be resolved if the referenced module can be found. If that module is NOT part of the sub-tree where the build was invoked, we have a problem to solve. This can be done by adding a list of projects named "closure", similar to the reactor but built from the top-level POM by recursively following the <modules>. This list would be lazily evaluated, so it is built only once, and only when it is required at all (this might fit together with http://jira.codehaus.org/browse/MNG-2675). In other words: if mvn is called on the top-level POM, the reactor and the "closure" are equal. So when an omitted <version> is hit, Maven first tries to find the project in the reactor. If it is NOT in the reactor, Maven gets the "closure", which is built on the first call, and tries to find the project there. If the project is found, the version is filled in; otherwise the build fails.
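The lookup order for an omitted dependency version (reactor first, then the lazily built "closure") might be sketched as follows. This is an illustration under stated assumptions, not Maven code: the class `OmittedVersionResolver` is a hypothetical name, projects are represented simply as a map from "groupId:artifactId" to version, and the `Supplier` stands in for the step that walks the <modules> from the top-level POM. The <parent> case is not covered here.

```java
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical sketch: resolve an omitted dependency <version> via reactor, then "closure". */
class OmittedVersionResolver {
    private final Map<String, String> reactor;                  // "groupId:artifactId" -> version
    private final Supplier<Map<String, String>> closureBuilder; // assumed to scan <modules> from the top-level POM
    private Map<String, String> closure;                        // built at most once, on first use
    private int closureBuilds = 0;

    OmittedVersionResolver(Map<String, String> reactor,
                           Supplier<Map<String, String>> closureBuilder) {
        this.reactor = reactor;
        this.closureBuilder = closureBuilder;
    }

    String resolve(String groupId, String artifactId) {
        String key = groupId + ":" + artifactId;

        // 1. Try the reactor of the current invocation.
        String version = reactor.get(key);
        if (version != null) {
            return version;
        }

        // 2. Fall back to the "closure"; build it lazily and only once.
        if (closure == null) {
            closure = closureBuilder.get();
            closureBuilds++;
        }
        version = closure.get(key);
        if (version == null) {
            // 3. Not part of the project at all: the build fails.
            throw new IllegalStateException(
                "Cannot resolve omitted <version> of " + key + ": module not found in project");
        }
        return version;
    }

    /** Number of times the closure was actually built (0 or 1). */
    int closureBuilds() {
        return closureBuilds;
    }
}
```

When mvn is invoked on the top-level POM, every lookup is answered by the reactor map and the supplier is never called, which matches the statement that reactor and "closure" are equal in that case.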

Another important point is that ArtifactInstaller and ArtifactDeployer need to guarantee that the pom.xml is no longer copied as is (1:1); instead, a new file is written in which the missing tags are added. In the end, other Maven users can NOT see whether this feature was used, and they can still use a Maven version that does NOT support this feature without problems. For ultimate flexibility, Maven should leave the original pom.xml untouched and create a new file ${project.build.directory}/pom-transformed.xml (a concept already introduced in Maven 2.1+) in which the omitted <version> tags are filled in. Additional plugins could then potentially do post-processing on that POM (but that is not the point of this topic). Finally, ArtifactInstaller and ArtifactDeployer use this new POM rather than the original one.
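The transformation step could look roughly like the sketch below, which uses only the standard JDK XML APIs. The class `PomVersionFiller` and its method are hypothetical names, and the map of project versions ("groupId:artifactId" to version) is an assumed input. As a simplification, the sketch treats every <parent> and <dependency> element in the document alike and leaves version-less references to modules outside the project untouched, since those may be versioned via dependencyManagement.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/** Hypothetical sketch: produce the content of pom-transformed.xml from a pom.xml. */
class PomVersionFiller {

    /** Returns a copy of the POM in which omitted versions of project modules are filled in. */
    static String fillOmittedVersions(String pomXml, Map<String, String> projectVersions) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(pomXml)));
            for (String tag : new String[] {"parent", "dependency"}) {
                NodeList refs = doc.getElementsByTagName(tag);
                for (int i = 0; i < refs.getLength(); i++) {
                    Element ref = (Element) refs.item(i);
                    if (childText(ref, "version") != null) {
                        continue; // explicit version: keep it as written
                    }
                    String key = childText(ref, "groupId") + ":" + childText(ref, "artifactId");
                    String version = projectVersions.get(key);
                    if (version != null) {
                        Element versionElement = doc.createElement("version");
                        versionElement.setTextContent(version);
                        ref.appendChild(versionElement);
                    }
                }
            }
            StringWriter out = new StringWriter();
            TransformerFactory.newInstance().newTransformer()
                    .transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (ParserConfigurationException | SAXException | IOException | TransformerException e) {
            throw new IllegalArgumentException("Cannot transform POM", e);
        }
    }

    /** Text of the first direct child element with the given name, or null if there is none. */
    private static String childText(Element parent, String name) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && name.equals(n.getNodeName())) {
                return n.getTextContent().trim();
            }
        }
        return null;
    }
}
```

The caller would write the returned string to ${project.build.directory}/pom-transformed.xml and hand that file to the install and deploy steps, so the original pom.xml is never modified.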

Examples

To clarify the whole idea, it might help to look at some examples:

We assume the following project structure (the names are both the folder names and the <artifactId> values):

4 Comments

Omitting the version in the parent sounds fine, but omitting the version in dependencies and somehow trying to figure it out from the reactor is very bad for ANY IDE integration. It will either completely prevent opening such a project, or it will require user intervention and be slow.

The idea of having a properly resolved POM in the repository is not new, and I wonder why we don't do it already. We should actually put the completely resolved pom.xml file in the repository, with all values interpolated and the information from all parents merged in (assuming the parent version is non-SNAPSHOT). From debugging and profiling the NetBeans IDE project loading loop, it seems a lot of time is spent downloading and processing the dependency POM files. In each build we construct the resolved POM again and again, while it should be sufficient to do so once (when uploading to the repository).

As I pointed out, no POM with a missing version will ever be installed or deployed. So if your point is that the NetBeans IDE integration is slow because of downloading POMs, I see no relation to this proposal. But yes, an omitted version in a dependency will cause additional POM parsing and might make processing a little slower. At least my proposal offers an option that can be used by those who like it, while others can ignore it without any harm. Anyway, I think that an IDE integration can solve this in a smart way, so that there is no noticeable performance difference. For me this is not really an argument against the suggested feature. But again, yes, it will add complexity to the POM processing logic, and this has to be supported by IDE integrations as well. However, I think they all tend to use maven-embedder, so there should be no problem.

Something very important: Maybe I caused confusion when I was talking about the "reactor". I do NOT really mean that all projects go into the reactor. I just mean that their POMs are parsed and made available. This could also be something new that has nothing to do with the "reactor".