Maven for Noobs (Beginners)

Over the next article (or two, if I go way overboard in length) I'm going to explain Maven as a build tool. By the end you won't be an expert, but you'll have a solid grasp of what it does to build projects, how to investigate what it does, and why it is so popular. So grab your favorite beverage (better be coffee), we're jumping straight in:

Introduction

Maven does three things really well, and they are somewhat interconnected:

1. Project Management

The name of the project, who works on it, which SCM (source control system) is configured for it, its licenses, etc.

2. Artifact Repository (Artifact = fancy name for binaries)

Downloading the various dependencies. Even Maven itself ships a lot of its functionality as plugins that get downloaded when needed.

3. Build System

A series of steps that turn a bunch of sources into a binary output.

For a single project Maven has only one configuration file, which it expects to be called pom.xml. POM stands for Project Object Model. It's an unfortunate name; it just means that all the data for a project lives there, touching all three points mentioned above (project management, the artifact repository and dependencies, and the build system).

In this article we will focus only on the third point, namely the build system.

I'm not going to touch the repository and dependencies, or project management, just yet. With that said, let's go to our...

First Project

The simplest POM file you can write looks similar to this (don't forget to save it as pom.xml in an empty folder):
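A minimal sketch of such a file could look like the one below. The groupId, artifactId and version values are made up for illustration, so pick your own.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <!-- Version of the POM format itself, not of your project -->
  <modelVersion>4.0.0</modelVersion>

  <!-- Hypothetical coordinates, chosen only for this example -->
  <groupId>com.example</groupId>
  <artifactId>maven-for-noobs</artifactId>
  <version>1.0-SNAPSHOT</version>
</project>
```

Those four nodes are all Maven needs to identify the project; everything else falls back to defaults.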

An extra node that might be important is packaging, which specifies whether the application is a regular .jar (Java archive, the default when nothing is specified), a .war file (web archive), an .ear file (enterprise archive) or a pom (a parent project container). For the time being we'll stick to jars; war and ear are similar, and we'll talk about pom when we discuss point 1, Project Management.
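As a small sketch, this is where the packaging node would sit if you wanted to spell out the default. The comment stands in for the usual identifying nodes of the project.

```xml
<project>
  <!-- modelVersion, groupId, artifactId and version go here as usual -->

  <!-- jar is what you get when this node is missing; war, ear and pom are other values -->
  <packaging>jar</packaging>
</project>
```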

Now if you run the command (I will explain it immediately, don't worry):

mvn install

You'll see Maven trying to do a lot of things, such as compiling sources, running tests, etc.

This is great, because it allows us to easily start up a new project, and as we will see later, will give us some guarantees on the project structure. Now let's get into what actually happened here, let's see the...

Maven Lifecycle

Maven is built on a fantastic idea about builds: people generally structure them sequentially, and tend to do the same few very common jobs. For example, I might want to compile my sources into bytecode, run some tests against them, and put them in a JAR. Probably 95% of Java projects are right there, but it doesn't have to be so. If it's a Groovy project, I still want to compile my sources to bytecode; it's just the compiler that is different. Maybe it's not even Java, and it's ES6 code that I want to transpile. And maybe we're still not done and want to do more things, such as running some integration tests after the jar is built, by packaging that jar into a war and then deploying the war file onto some machine. Maven has us (mostly) covered.

The phases of the build that you want to run, such as 'Compile' or 'Run Tests', are standardized in Maven as ordered lists of phases called lifecycles. Note that a phase is just a name for what should happen, so that we have an overall picture; tasks are what really do the job. Out of the box Maven offers three such lists: default, clean and site. The install parameter we passed to mvn is actually the name of a phase in the default lifecycle. Maven automatically finds out which lifecycle that phase belongs to, and executes all the previous phases up to and including the one we passed on the command line (here, install).

We can actually ask Maven what phases a lifecycle has, and Maven will also tell us what tasks are assigned to each phase in the lifecycle. Let's ask Maven what goes down in our project when we run mvn install, using mvn help:describe -Dcmd=install:

This actually makes a lot of sense. First the resources are processed, then the sources are compiled. The test resources are processed only if the processing and compilation of the sources passed. That makes sense: why would we try to run tests if our code doesn't even compile? Then the tests are compiled (test-compile) and executed (test). Only if all is fine and dandy by this stage is the whole thing packaged into a jar (package) and copied into the local repository (install). The last phase in the list, deploy, would publish the artifact to a remote repository, but since it comes after install, mvn install stops before it.

By default the compile goal compiles the sources from src/main/java into target/classes, the testCompile goal compiles the test classes from src/test/java with target/classes on the classpath, and test just runs the tests, using JUnit 3 by default.
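Spelled out explicitly, those default locations would look roughly like the fragment below, placed inside the project node. You normally never write this, since Maven already assumes these values; it is only a sketch to make the convention visible. The target/test-classes folder is not mentioned above, but it is where the compiled tests land.

```xml
<build>
  <!-- Where the compile goal looks for production sources -->
  <sourceDirectory>src/main/java</sourceDirectory>
  <!-- Where the testCompile goal looks for test sources -->
  <testSourceDirectory>src/test/java</testSourceDirectory>
  <!-- Where compiled production classes end up -->
  <outputDirectory>target/classes</outputDirectory>
  <!-- Where compiled test classes end up -->
  <testOutputDirectory>target/test-classes</testOutputDirectory>
</build>
```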

Ok, we now know how to look at what will get executed when we run a command, that is, a phase from a lifecycle. But how does the code arrive in our project? Where are these tasks coming from? The answer to that is...

Maven Plugin Goals

Maven plugins are artifacts, also stored in Maven repositories, that allow Maven to do its thing. A plugin can contain multiple tasks, and for some strange reason these tasks are named "goals". For example, the default Java compilation plugin is org.apache.maven.plugins:maven-compiler-plugin:3.1. You can see it in the compile phase above. Let's find out what goals are bundled into it, with mvn help:describe -Dplugin=org.apache.maven.plugins:maven-compiler-plugin:3.1:

Now, if you were paying attention, you've already seen that the testCompile goal of the same org.apache.maven.plugins:maven-compiler-plugin is used in the test-compile phase. So the takeaway is that in Maven any plugin can have one or more goals, and a goal is basically just some code that will be executed. These goals are plugged into the phases along the lifecycle. This is how we get to compile Groovy. Or compile TypeScript to JavaScript. Etc.

Wanna go full Inception? The help:describe command we've been calling so far is itself a goal of Maven's help plugin. Neat!

Ok, but how do we add some new code? For example, let's run a command that lists the current folder. We will run ls before anything else. To do that, we will plug in a plugin that can execute external commands; I picked org.codehaus.mojo:exec-maven-plugin:1.5.0. What we need to do is plug it in somewhere along the execution of the lifecycle. If you look back at the default lifecycle, validate is the first phase that executes, so we will add our plugin goal there. (I've also added a picture at the end with the full default Maven lifecycle that you can use as a reference until you get used to mvn help:describe -Dcmd=install.)
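A sketch of a POM that does this could look like the one below. The project coordinates and the execution id are hypothetical names picked for the example; the plugin coordinates, the exec goal and the validate phase are the ones discussed above.

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>

  <!-- Hypothetical coordinates, chosen only for this example -->
  <groupId>com.example</groupId>
  <artifactId>maven-for-noobs</artifactId>
  <version>1.0-SNAPSHOT</version>

  <build>
    <plugins>
      <plugin>
        <groupId>org.codehaus.mojo</groupId>
        <artifactId>exec-maven-plugin</artifactId>
        <version>1.5.0</version>
        <!-- Plugin configuration: which external program to run -->
        <configuration>
          <executable>ls</executable>
        </configuration>
        <executions>
          <execution>
            <!-- Hypothetical id; any name unique within this plugin works -->
            <id>list-current-folder</id>
            <!-- Plug the goal into the first phase of the default lifecycle -->
            <phase>validate</phase>
            <goals>
              <goal>exec</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</project>
```

With this in place, running mvn install should print the folder listing before any of the usual compile and test steps.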

The important lines to look for are the plugin configuration and the executions. It's now obvious what happens when: in the validate phase we run the exec goal of this plugin, with some configuration. The configuration is plugin dependent, and you will find it in the plugin's documentation. You can have multiple executions, which means you can have the same plugin goal executed multiple times in the lifecycle, even in the same phase.
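As a sketch of what multiple executions might look like, the plugin fragment below runs the exec goal twice in the validate phase, once as plain ls and once as ls -la. The execution ids are hypothetical; they only need to be unique within the plugin. The fragment belongs inside the plugins node of the build.

```xml
<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>exec-maven-plugin</artifactId>
  <version>1.5.0</version>
  <executions>
    <execution>
      <!-- First run: plain listing -->
      <id>list-short</id>
      <phase>validate</phase>
      <goals>
        <goal>exec</goal>
      </goals>
      <configuration>
        <executable>ls</executable>
      </configuration>
    </execution>
    <execution>
      <!-- Second run, same goal and same phase, different configuration -->
      <id>list-long</id>
      <phase>validate</phase>
      <goals>
        <goal>exec</goal>
      </goals>
      <configuration>
        <executable>ls</executable>
        <arguments>
          <argument>-la</argument>
        </arguments>
      </configuration>
    </execution>
  </executions>
</plugin>
```

Here the configuration sits inside each execution rather than at plugin level, so each run can have its own settings.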

Effective POM

But still, that doesn't explain where these other plugins are coming from. Where are they plugged in from? To answer that we would need to look at another facet of Maven, Project Management, which for the time being is outside the scope of the current article. If you want to look at the POM file Maven will actually execute, including our plugin and all the plugins and goals wired into the lifecycles, you can run mvn help:effective-pom:

In the <build> node you will see all the <plugins> that were configured, but I'll leave that for the Project Management part.
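To give an idea of what to look for, an abbreviated and purely illustrative excerpt of the compiler plugin's entry in an effective POM might resemble the fragment below. The exact content and versions depend on your Maven installation, so treat this as a sketch rather than real output from this project.

```xml
<plugin>
  <artifactId>maven-compiler-plugin</artifactId>
  <version>3.1</version>
  <executions>
    <execution>
      <!-- The compile goal, wired into the compile phase -->
      <id>default-compile</id>
      <phase>compile</phase>
      <goals>
        <goal>compile</goal>
      </goals>
    </execution>
    <execution>
      <!-- The testCompile goal, wired into the test-compile phase -->
      <id>default-testCompile</id>
      <phase>test-compile</phase>
      <goals>
        <goal>testCompile</goal>
      </goals>
    </execution>
  </executions>
</plugin>
```

It has the same shape as the executions we wrote by hand for the exec plugin, which is the whole point: the built-in behavior is wired with the same mechanism.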

Vocabulary

maven - awesome stuff.

lifecycle - a list of phases that will be executed in order.

phase - a logical name to plug tasks (plugin goals) into.

plugin - a bundle of one or more goals.

goal - task.

Conclusions

Maven can run pretty much any build flow you can think of by plugging different tasks into the various phases of the lifecycle, which makes even demanding flows manageable. In this article we had a look at the core concepts of the build system aspect of Maven. We saw how to inspect and customize the execution of a build lifecycle, and what happens when we run a command such as install to execute a part of the lifecycle.

The advantage of a system such as Maven is that this knowledge is reusable in all Maven projects that build artifacts. You don't need to reverse engineer anything; you can just ask Maven which goals are bound to which phases in the lifecycle, and immediately have an overview of what's going to be run.