Getting an artifact into the maven central repository

There’s a description on how to do this on the sonatype.org site, but it doesn’t go into enough detail for me, so I’m going to write up some notes on it here.

The project I’m opting to upload is a fairly trivial Bandcamp API client.

The Bandcamp API (which is fairly unrelated to the topic of this blog post) allows you to get song lyrics and graphics from the Bandcamp site, which is a bit like MusicBrainz, but shinier, not as complete, and monetised.

The maven artifact that holds the Java binding for the Bandcamp API will live in the groupId:artifactId co-ordinates of com.randomnoun.bandcamp:bandcamp-api-client.

Since I’m going to have slightly higher standards for testing/documentation for these projects, it probably helps that the project isn’t that big (about 10 Java classes in total, most of which are transfer objects or POJOs).

So, in no particular order:

Choose a license

The content of the maven license URL appears to be dumped directly into the generated site documentation, so for me this involved creating another page (text / html) out on the web somewhere with the contents of the license (since the standard OSI page has a fair amount of extraneous header/footer/sidebar guff that didn’t transfer cleanly).
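For reference, the license is declared in the pom.xml with a licenses element along the following lines. This is only a sketch: the license name and the URL are placeholders rather than the values used by this project, and the URL is assumed to point at a bare copy of the license text.

```xml
<!-- pom.xml fragment; name and url are placeholder values -->
<licenses>
  <license>
    <name>Example License</name>
    <!-- a page holding just the license text, with no header/footer/sidebar around it -->
    <url>http://www.example.com/licences/LICENCE.txt</url>
    <distribution>repo</distribution>
  </license>
</licenses>
```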

I notice bootstrap itself got bumped from 2.3.2 to 3.0.0 a couple of weeks ago so I’m looking forward to finding out how that’s going to break everything I’ve written up until this point, which is one of the perks of writing software that constantly changes whilst you’re attempting to use it.

Turns out that maven has this thing called Doxia which it uses to prevent you from writing HTML to document the thing, instead preferring you to use APT, or FML, or 10 other flavours of home-grown shit text markup which is apparently going to stop everyone from using HTML because (in the humble opinion of the Maven steering committee) these are easier to use than the language that everyone else has been using up until this time.

It creates its own document event model (called the Doxia SinkAPI) which is Apache’s attempt to reinvent the wheel again, causing me to write a doxia module that kind-of-almost allows me to use HTML instead. [2]

The standard javadoc stylesheet these days (lhs) versus the modified stylesheet (rhs)

The site is hosted at http://code.randomnoun.com/(module-name), which in this case will be http://code.randomnoun.com/bandcamp-api-client, and is updated automatically during release by the scp://code.randomnoun.com/var/www/code.randomnoun.com/(module-name) reference in the project/distributionManagement/site/url element in the pom.xml. This required a bit of stuffing around with ~/.m2/settings.xml server credentials and buggerising up the SSH known_hosts and authorized_keys files between the build server and the web server holding the site, then re-adding support for the scp protocol, which was dropped by default in mvn 3.
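A rough sketch of the pom.xml side of that setup is below. The scp URL is the one described above; the wagon-ssh version and the site id are assumptions made for illustration, so substitute whatever matches your own build.

```xml
<!-- pom.xml fragment -->
<build>
  <extensions>
    <!-- brings back the scp:// transport, which mvn 3 no longer bundles;
         the version number here is an assumption -->
    <extension>
      <groupId>org.apache.maven.wagon</groupId>
      <artifactId>wagon-ssh</artifactId>
      <version>2.4</version>
    </extension>
  </extensions>
</build>

<distributionManagement>
  <site>
    <!-- hypothetical id; it has to match a server id in ~/.m2/settings.xml -->
    <id>code.randomnoun.com</id>
    <url>scp://code.randomnoun.com/var/www/code.randomnoun.com/bandcamp-api-client</url>
  </site>
</distributionManagement>
```

The id in the site element is what ties the upload to a set of credentials: maven looks up the server entry with the same id in settings.xml when it opens the scp connection.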

Create a public CVS repository (accessible by both http and pserver protocols)

The way things used to be

So this is how I normally access CVS, using a tube map metaphor, because I think it’s more entertaining than a UML sequence diagram, and it’s the only way I’m going to get remotely near a high speed train now that Tony Abbott is in power.

Access to CVS from machines within the randomnoun corporate firewall

I, dear reader, am on the left hand side of this diagram, and wish to retrieve things from the CVS repository, on the right hand side. To do this I fire up my CVS client, point it at cvs.dev.randomnoun, which is an internal DNS record resolving to 192.168.0.13, which connects to the standard port 2401, which then serves up files from the /var/lib/cvsd folder of bnedev03, which is the VM that holds my sourcecode.

Notice the .randomnoun TLD, which is a measure I use to prevent internal URLs from leaking onto the internet (which usually end with, say, .com or .au).

The way things are going to look from here on

Because the SCM links in the pom.xml are now public, I opted for creating a separate read-only CVS repository for things I’m making publicly available.

This allows me to avoid the horrible latency of cloud-based version control systems, whilst hopefully minimising any data leakage I’d otherwise suffer by hosting it inside the same cvs repository as the rest of my crap other modules of varying code quality.

I’ve created a new internal DNS entry (cvs.randomnoun.com) which resolves to the same IP address as above (192.168.0.13), and an external DNS entry of the same name (cvs.randomnoun.com) which resolves to my externally accessible IP (123.243.191.198).

The public CVS server sits on the same internal VM, but listens on a non-standard port (2402). External connection requests to 123.243.191.198:2401 are routed to 192.168.0.13:2402. The read-only cvsd daemon has its repository refreshed periodically from the read/write cvsd by a cronjob on the cvs machine (from /var/lib/cvsd to /var/lib/cvsd-public):

Access to CVS from machines outside the randomnoun corporate firewall. The fluffy cloud image above represents the smoke and mirrors that constitute The Internet.

This has the advantage that:

internal updates go to the read/write cvs repository, whereas

external access uses the same SCM URL, but is directed to the read-only cvs repository subset, where hopefully things are less likely to go pear-shaped.
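As a sketch of what the public SCM link in the pom.xml might look like for a pserver repository: the hostname is the one described above, but REPOSITORY_ROOT and the module name are placeholders, since the actual repository path isn't given here.

```xml
<!-- pom.xml fragment; REPOSITORY_ROOT and the module name are placeholders -->
<scm>
  <!-- general form: scm:cvs:pserver:user@host:/path/to/repository:module -->
  <connection>scm:cvs:pserver:anonymous@cvs.randomnoun.com:/REPOSITORY_ROOT:bandcamp-api-client</connection>
</scm>
```

Because the internal and external DNS entries share the one name, the same connection string works from either side of the firewall.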

Feel free to complain that I use the same tube station icons for processes, machines and file systems above, but let me point out that these are mostly virtual machines and virtual file systems, so it’s more similar under the hood than you might at first think [1].

You should be able to grab the source using the following anonymous checkout:

The sonatype guys were pretty quick in creating access, which was nice. They also changed the groupId I applied for from ‘com.randomnoun.bandcamp’ to ‘com.randomnoun’, which allows me to create new subgroups automatically; so it appears that the groupId you apply for should be the top-level writable groupId for an organisation, rather than the groupId used for an actual maven artifact.

If you’re anything like me, then you’ll find that you’ll need to completely rebuild your staging server with new versions of Java and Maven, but that’s OK since it was running an old, unsupported version of Debian Lenny, so you probably needed to get round to doing that anyway.

You’ll probably also find yourself trying a few dozen ways of getting that plugin to work, before realising that you need to add some undocumented elements to your ~/.m2/settings.xml file.
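Whatever else ends up in there, the credentials part of settings.xml generally takes the shape below. This is a generic sketch rather than the exact configuration used here: the server id, username and password are all placeholders, and it makes no attempt to cover the undocumented elements.

```xml
<!-- ~/.m2/settings.xml fragment; every value here is a placeholder -->
<settings>
  <servers>
    <server>
      <!-- must match the id of the repository being deployed to in the pom.xml -->
      <id>ossrh-staging</id>
      <username>your-sonatype-username</username>
      <password>your-sonatype-password</password>
    </server>
  </servers>
</settings>
```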

Copy into the OSSRH staging repository

Since you’re using maven, you’ve probably already got some horribly complex build process surrounding it just to make it more manageable. I use vmaint. It’s tops.

The steps you want to add to your release process should be something similar to the following:

Close and release the OSSRH staging repository

check that the staging repository exists and has the files you uploaded (in this case, the pom, client jar, the sources jar and the javadoc jar)

select your staging repository and click the ‘Close’ button on the toolbar

type a message into the ‘Close Confirmation’ box

check that the Central Sync Requirement Rules have passed

click the ‘Refresh’ button on the toolbar, which should then allow you to

select your staging repository and click the ‘Release’ button on the toolbar

type a message into the ‘Release Confirmation’ box. If the ‘Automatically drop’ checkbox is selected (which it is by default), then your staging repository will be removed from the list after it has been released (it will still get synced to central though).

These steps are shown in the screenshots below (click each screenshot for a closer look):

If everything doesn’t go hunkydory (say you’ve forgotten to document a class, include all the required licenses, or you’ve inadvertently left an API key in the sourcecode), just click the ‘Drop’ button on the toolbar, fix the problem, re-release and deploy it to the staging repository, and repeat until everything’s looking better than average.
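The sources jar and javadoc jar that need to turn up in the staging repository are usually produced by attaching two extra plugins to the build. A minimal sketch follows; the plugin versions are assumptions, and the execution ids are arbitrary labels.

```xml
<!-- pom.xml fragment; plugin versions are assumptions -->
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-source-plugin</artifactId>
      <version>2.2.1</version>
      <executions>
        <execution>
          <id>attach-sources</id>
          <!-- builds the -sources.jar alongside the main jar -->
          <goals>
            <goal>jar-no-fork</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-javadoc-plugin</artifactId>
      <version>2.9.1</version>
      <executions>
        <execution>
          <id>attach-javadocs</id>
          <!-- builds the -javadoc.jar alongside the main jar -->
          <goals>
            <goal>jar</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
```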

Once you’ve gone through that, all that’s left is to wait two hours, and see if it’s appeared in the central repository. (If you like, you can use that two hours to construct a list of verbs that convey the concept of copying a file).

I believe this only needs to be done after the first artifact, not for subsequent artifacts.

That’s it

So there you go. Including the com.randomnoun.bandcamp:bandcamp-api-client dependency from any old pom.xml file should now cause maven to automatically download the artifact for you.
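In pom.xml terms that dependency looks like this, using the co-ordinates above and the 0.0.15 release:

```xml
<!-- goes inside the <dependencies> element of the consuming project's pom.xml -->
<dependency>
  <groupId>com.randomnoun.bandcamp</groupId>
  <artifactId>bandcamp-api-client</artifactId>
  <version>0.0.15</version>
</dependency>
```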

The release I’ve put up there (0.0.15) is reasonably complete, and should work, but will probably get a few more small changes as I come to grips with this whole central repository release process before I bump it to 1.0.0.

You’d be surprised how long that took to complete.

The bandcamp artifact mentioned above is now part of the com.randomnoun.bandcamp:bandcamp-api-client artifact, which can be directly referenced in your pom.xml from the maven central repository.

The Doxia HTML module mentioned above is now available via the com.randomnoun.maven.doxia:doxia-module-html artifact, which can also be directly referenced in your pom.xml from the maven central repository.

[1] If I’d thought about this a fraction of a second more, I would have made bnedev03 appear on the tube map as ‘zone 1’ and external access out in the wilderness of ‘zone 2’ somewhere.

[2] I still needed to learn Apache Velocity though, which is possibly the worst templating language that has ever been devised. The Doxia Sink doesn’t allow advanced HTML usage, like, say, a DIV element contained within another DIV element, so you get to do all sorts of creative things with the handful of HTML elements it does recognise, which is the sort of constructive back-bending effort that will be familiar to anyone who has ever tried to write a page that renders correctly in more than two types of browser.