Tutorial: writing a custom Dockerfile to build a capsule from a manifest file

With specific focus on pom.xml and build.sbt files.

Written by Shahar Zaks. Updated over a week ago.

In a capsule's environment area, you can switch between viewing the base environment (including the available package managers), the postInstall script (if there is one), and the underlying Dockerfile, which is the formal recipe for the environment. In general, the Dockerfile is accompanied by a warning about editing it by hand.

We advise you to use available package managers whenever possible. Doing so creates a transparent, user-friendly overview of installed packages, and will also automatically implement certain best practices in the Dockerfile.¹ When something needs more customized installation, the postInstall script should be your next resource (see this article for a tutorial on writing such scripts in general).

Sometimes, however, you will need a hand-edited Dockerfile. One such example is the published capsule TabbyXL: rule-based spreadsheet data extraction and transformation (version 1.0.4). In this capsule's environment, you'll see that the package managers have been disabled and that there is a pom.xml file beneath the Dockerfile. The Dockerfile has been edited to copy that file into an accessible directory (COPY pom.xml /tmp/) and to install the dependencies listed therein (RUN cd /tmp && mvn package && rm -rf /tmp/target).²
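As a rough sketch, the two hand-edited instructions described above would sit in the Dockerfile roughly as follows. The opening comment stands in for whatever the capsule's generated Dockerfile already contains (the FROM line and any package-manager sections), which is not reproduced here. The sketch also assumes that the base image already provides Maven.

```dockerfile
# (The generated FROM line and any package-manager sections stay unchanged above.)

# Copy the manifest from the environment area into the image.
COPY pom.xml /tmp/

# Download the dependencies while the image is built. Maven caches them in
# its local repository (~/.m2 by default), so the build output in
# /tmp/target can be removed afterwards to keep the image small.
RUN cd /tmp && mvn package && rm -rf /tmp/target
```

Because both instructions run at build time, the downloaded dependencies become part of the environment itself rather than being fetched on every run.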

If you have a use case like this, here's how you would address it.

Building a capsule from a pom.xml or build.sbt file:

First, if you have any packages that can be installed via an available package manager, add them while the package managers are still available. This will pin versions when possible and make your Dockerfile clear and easy to read by default.

Second, move your project manifest file to the environment area (it should appear below the Dockerfile).

Third, click the 'Unlock' button in the Dockerfile.

Fourth, add a line that copies the manifest file to an accessible location (either COPY pom.xml /tmp/ or COPY build.sbt /tmp/).

Fifth, add a RUN command that changes into the /tmp directory and installs the project's dependencies. For pom.xml, this looks like RUN cd /tmp && mvn package && rm -rf /tmp/target; for build.sbt, it looks like RUN cd /tmp && sbt clean && sbt compile.

Sixth, edit a run script in the /code folder so that the manifest file is accessible again at run time. For a pom.xml file:

This is hard/boring, can you lend a hand?

Absolutely! If you have any questions, please write to us via live chat or an email to support@codeocean.com, and we'll be happy to help.

What will a successful build look like?

A successful build will have clean, transparent code; will install all dependencies as part of the build phase rather than the run phase; and will output an executable that can be used to reproduce concrete results.
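For a build.sbt project, the Dockerfile additions from steps four and five might look like the sketch below. It is an illustration rather than a complete Dockerfile: the opening comment stands in for the generated FROM line and any package-manager sections, and the base image is assumed to already provide sbt.

```dockerfile
# (The generated FROM line and any package-manager sections stay unchanged above.)

# Copy the manifest from the environment area into the image.
COPY build.sbt /tmp/

# Let sbt resolve and download the dependencies while the image is built,
# so that they are already present when the capsule runs.
RUN cd /tmp && sbt clean && sbt compile
```

Keeping the dependency download in a RUN instruction is what places it in the build phase rather than the run phase.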

Footnotes:

¹ For instance, spacing for readability, pinned versions, and commands that clean up and optimize the environment, such as rm -rf /var/lib/apt/lists*.

² This is preferable to uploading the pom.xml file to the /code folder and building it from there, because installing the dependencies during the build phase rather than the run phase guarantees that all packages are part of the environment and do not need to be re-downloaded on each run (which could fail on account of link rot).