Since mid-2011 I’ve been thinking on and off about this question. There are some package management solutions available for Erlang/OTP already, but none of them really seem to meet my needs. I had been considering writing a new solution from the ground up, but decided to take a pause and engage with members of the open source community first. I reasoned that it’s better to build something that benefits the whole community and supports a wide range of user experiences, rather than just hack something together for my own use. Since the turn of the year, I’ve had some very constructive conversations with the Erlware developers, as well as some recent discussions on erlang-questions about this topic, with Joe Armstrong contributing to the pool of ideas. This post looks at the origin of these conversations, some of the driving forces, and concludes with a review of the direction that the Erlware developers and I think we ought to consider taking.

I’ve been wanting to spend some time checking out Octopress, and the series on rebar plugins provides me with a good opportunity to do this. Instead of spending my free time writing up a second post this week, I’ve opted to move the series to a custom location and use Octopress as a publishing engine.

My reasons for doing this are several. Firstly, as I mentioned, I wanted a chance to check out the Octopress wrap around Jekyll, and I must admit that so far I’m finding it nice and high level. Secondly, I wanted better handling of code highlighting than the free wordpress account gives me, and the pygments integration in Jekyll does the job very nicely. I also wanted to be able to provide sample code for each of the posts, and by publishing the series using github pages, I can use a single git repository to manage both the sample code and the gh-pages publication branch. All in all, it seems like a pretty neat solution.

This is the first post in a series on customising rebar using plugins. Initially, I want to focus on how plugins work, then move on to see what kind of extensibility they can provide to developers.

Caveats

There are so many caveats to this, I hardly know where to start. Here’s a short-list to begin with:

What’s written here is based on my experience and could be lacking important detail, or even plain wrong!

Rebar’s support for plugins may disappear at any time in the future, as it’s not a fully documented or official feature AFAIK – though this isn’t likely in practice, it is possible

Rebar has no official API for plugins, so their use is at best undocumented, and in all cases probably completely unsupported – you will find friends on the mailing list though, I’m sure

I’ve written quite a few plugins by now, and have even submitted a few patches relating to them. It’s fair to say that I’m quite opinionated about how plugins work and how they should work, but it’s also worth remembering that I’m only a very minor contributor and my opinions are just that – my private opinions.

First things first

Currently rebar supports two kinds of extensibility mechanism: Plugin Extensions and Hooks. Hooks are a lightly documented feature, with the only real explanation being the sample rebar config file. We’re not going to cover hooks in much detail, as they are simple enough to understand and are only really applicable to simple scripting tasks that don’t require, for example, cross platform support or complex logic. Plugin extensions, on the other hand, are documented (to some extent anyway), and provide a much greater degree of extensibility to developers.

Before we can talk sensibly about plugins, we need to take a look at some of the fundamentals behind rebar, especially its handling of build configuration files and processing of commands. For any given command, say foo, rebar understands the command if (and only if) one of the modules it knows about exports a function with a signature that matches:

the name of the command and arity 2

the name of the command prefixed with pre_, and arity 2

the name of the command prefixed with post_, and arity 2

We’ll be covering how rebar knows about modules later on, but for now we’ll just assume it’s magic. For the command foo to have any meaning then, we’d need at least one module with at least one of the following signatures exported:
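As a minimal sketch, a hypothetical module exporting all three might look like the following. The module name my_foo_plugin and the argument names Config and AppFile are my own labels rather than anything rebar mandates, and returning ok from each callback is an assumption about what rebar accepts.

```erlang
%% Hypothetical plugin module: the module name and argument names are
%% illustrative only.
-module(my_foo_plugin).
-export([pre_foo/2, foo/2, post_foo/2]).

%% Matches "the name of the command prefixed with pre_, and arity 2".
pre_foo(_Config, _AppFile) ->
    ok.

%% Matches "the name of the command and arity 2".
foo(_Config, _AppFile) ->
    io:format("foo called~n"),
    ok.

%% Matches "the name of the command prefixed with post_, and arity 2".
post_foo(_Config, _AppFile) ->
    ok.
```

Exporting any one of these is enough for rebar to treat foo as a command it understands.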

Another essential is how rebar handles build configuration. There are four ways that rebar handles configuration settings. Firstly, rebar loads the global config file from $HOME/.rebar/config if it exists; otherwise it creates an empty config set. Secondly, rebar loads the configuration for any directory by either (a) reading the terms in the local rebar.config file if it exists, or (b) creating an empty config set. Thirdly, when first executing in the current directory (known as base_dir), rebar checks for a special global variable (passed as config= or, alternatively, as -C <config-name>) which overrides the name of the config file it should look for. This third technique is only applied to the configuration in the base_dir.

The fourth approach to configuration handling is not just for initialising new configurations. As rebar executes user commands (e.g., clean, compile, eunit) in a given directory, it uses two special commands to obtain a list of directories that need to be processed before the current one and, provided the current directory is processed without error, a list to be processed afterwards as well. These commands, preprocess and postprocess, can be exported by any module.

When rebar executes, it builds up a list of modules that understand the current command. For each of these modules it tries to call preprocess and postprocess, then it traverses any pre- directories before handling the current command in the current directory. Once all the pre-processing is done, each module that exports one of the three function signatures compatible with the current command is called (for one or more of the pre_<command>/2, <command>/2 and post_<command>/2 exports) to handle the actual command. The directories returned by any postprocess calls are handled last of all.
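To make the two special commands a little more concrete, here is a rough sketch of a module that exports both. The module name my_dirs_plugin and the "tools" directory are hypothetical. The {ok, ListOfDirs} return shape and the use of absolute paths are my assumptions about what rebar expects, so check them against the rebar source you are actually running.

```erlang
%% Hypothetical module: asks rebar to visit ./tools before the current
%% directory, and nothing extra afterwards.
-module(my_dirs_plugin).
-export([preprocess/2, postprocess/2]).

%% Directories to process before the current one.
preprocess(_Config, _AppFile) ->
    {ok, Cwd} = file:get_cwd(),
    {ok, [filename:join(Cwd, "tools")]}.

%% Directories to process after the current one (none here).
postprocess(_Config, _AppFile) ->
    {ok, []}.
```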

What is vital to understand about all of this is that as rebar traverses the file system, recursively handling any pre- directories, in each new dir it executes with a brand new rebar config set. This config set inherits the parent configuration (i.e., the config set for the parent dir) but can override certain configuration variables by providing its own rebar.config file. This is how dependencies and applications stored in sub-directories are handled. The salient points about this mechanism are that:

the only configuration file rebar notices in sub-directories is the one named rebar.config

any configuration override (passed with -C for example) is ignored in sub-directories

just because a local rebar.config overrides a variable/setting, this might not be applied

Point #3 is a bit scary if you’re new to rebar, but essentially it is the result of rebar’s config handling module exporting multiple config handling functions, some of which get the local (i.e., the most locally scoped) value, some a list of possible values and others the combined list of all values. Depending on which of these functions a particular command/module uses when reading the configuration, you can potentially see a number of things happen:

you might see the local value (from rebar.config) get applied

you might see the value from the parent config get applied (e.g., if there is no local config)

you might see the local value get ignored

I strongly recommend spending some time looking at rebar’s config module if you’re planning on writing plugins (or using complex plugins written by others), as it’ll save you a lot of head scratching time if you understand this up front.
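As a rough illustration of why the accessor matters, the sketch below reads the same key three ways. It is written from my recollection of the rebar_config module (get/3, get_local/3 and get_all/2), so treat the function names and their exact behaviour as assumptions to verify against your copy of rebar. The module name my_config_demo and the show_opts command are made up for the example.

```erlang
%% Hypothetical plugin that prints erl_opts as seen through different
%% rebar_config accessors. It only runs inside rebar, since it depends on
%% the rebar_config module.
-module(my_config_demo).
-export([show_opts/2]).

show_opts(Config, _AppFile) ->
    %% Assumed: the most locally scoped value, including inherited settings.
    Inherited = rebar_config:get(Config, erl_opts, []),
    %% Assumed: only what this directory's own rebar.config sets.
    Local = rebar_config:get_local(Config, erl_opts, []),
    %% Assumed: every value for the key across the whole config set.
    All = rebar_config:get_all(Config, erl_opts),
    io:format("get: ~p~nget_local: ~p~nget_all: ~p~n",
              [Inherited, Local, All]),
    ok.
```

Running something like this in a sub-directory, with and without a local rebar.config, is a quick way to see which of the three outcomes above applies to a given key.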

What are plugins?

As far as rebar is concerned, plugins are simply Erlang modules that it knows something about. There are essentially two ways that rebar knows about modules:

From the rebar.app configuration file

Via the plugins section of the build configuration

Modules which are registered in the rebar.app configuration file are basically part of rebar itself. Plugins, on the other hand, are modules which the build configuration (somewhere in the tree) knows about via the plugins configuration element. This configuration is built up to include every branch, including the global config, so just because you’ve got no local plugins configuration doesn’t mean plugins won’t get run in your subdirectories. In practice, this means that plugins registered up top (e.g., globally or in the base_dir configuration) will get run in all your sub-directories, including of course dependencies. Bear this in mind when using plugins, and take advantage of skip_deps, apps= and skip_apps= where necessary to avoid unexpected things happening in your sub-dirs and deps folders.
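For example, registering a plugin in the base_dir configuration takes a single term. The plugin name my_foo_plugin is hypothetical.

```erlang
%% rebar.config in the base_dir. A plugin listed here is also run in
%% sub-directories and dependencies unless you restrict the run with
%% skip_deps, apps= or skip_apps=.
{plugins, [my_foo_plugin]}.
```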

To (hopefully) make the differences between plugin extensions and built-in modules a bit clearer, we’re going to classify plugin extensions into three groups, and will hereafter refer to them simply as plugins:

Internal/Built-in

External/Pre-packaged

Local

Let’s look at what these classifications mean in practice, and hopefully get an understanding of the terminology I’ve chosen. Internal (or built-in) modules come bundled as part of rebar itself, and as per the documentation, these are registered in the rebar application config file. The functionality exposed by these modules is available to every rebar user, so they work Out Of The Box. These plugins are the least likely to be used for extending rebar however, because in practice they require you to either (a) maintain a custom fork of rebar or (b) submit a pull request in order for your extension(s) to be accepted as part of the main source tree. It is the other two types of plugin we will be looking at in this post.

External Plugins

Pre-packaged plugins are bundled as separate Erlang/OTP libraries to be installed globally, or included in a project using rebar’s dependency handling mechanism. The latter technique is more useful, as it ensures that someone who fetches your source code to build/install it, will be able to obtain the right plugins without going outside of the project source tree.

The key thing to understand here is that the plugin must be installed somehow in order for rebar to pick it up. We’ve mentioned that rebar knows about plugins because they’re in the {plugins, ListOfPlugins} configuration element, but in practice things aren’t quite that simple. In order for a plugin to actually get executed (in response to a specific command, its pre/post hooks or indeed the special preprocess and postprocess commands), it needs to be on the code path! This is fine if the plugin is installed globally into the user’s erl environment (for example by placing its application root directory under a directory listed in the ERL_LIBS environment variable), but not so fine if you’re fetching it into the project’s dependencies. If the dependency is a direct one, then the preprocess handler in rebar_deps will nicely update the code path for all commands, so as long as you’re not trying to make the plugin run before rebar’s built-in modules (which is, in fact, impossible) then it’ll be on the path. Once again, this doesn’t always work in practice, because the function that builds up the code path makes no attempt to deal with transitive dependencies. I keep meaning to do a pull request for this, but I’m waiting for others to get through the queue first.
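A sketch of the dependency-based approach might look like this. The plugin name my_plugin and the repository URL are placeholders rather than a real project, and the dependency is declared directly in the project’s own rebar.config for the reason given above.

```erlang
%% rebar.config: fetch the (hypothetical) plugin as a direct dependency so
%% that rebar_deps puts it on the code path, then register it as a plugin.
{deps, [
    {my_plugin, ".*", {git, "git://example.com/my_plugin.git", "master"}}
]}.
{plugins, [my_plugin]}.
```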

Local Plugins

You probably recall that I mentioned plugins need to be on the code path in order to be executed by rebar? Well thanks to a nifty pull request from yours truly, there is in fact another way. If rebar cannot find a module on the code path matching the name given to the plugins configuration element, it will attempt to locate a source file for the module in either the base_dir or the directory indicated by the plugin_dir config element. If it finds a source file with a matching name, it attempts to compile it on the fly and load the beam code into the running emulator, thereby making the plugin available dynamically.
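As a small sketch, the configuration for a local plugin could look like the following. The module name my_local_plugin and the build/plugins directory are hypothetical; with this configuration I would expect rebar to look for a source file named my_local_plugin.erl.

```erlang
%% rebar.config: my_local_plugin is not on the code path, so rebar falls
%% back to looking for its source in base_dir or in plugin_dir.
{plugins, [my_local_plugin]}.
{plugin_dir, "build/plugins"}.
```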

The aim of local plugins is to provide a mechanism for scripting complex tasks/hooks that apply only to your specific project. This is in contrast with the idea of external/pre-packaged plugins, which provide add-on re-usable features to rebar that can be used across projects.

Next time…

Next time we’ll be looking at the structure of the plugin callback functions and how to use them in practise. We’ll also be taking a whirlwind tour of some of the commonly (re)used rebar modules such as rebar_config, rebar_utils and rebar_log, as well as discussing some of the pros and cons of using plugins and what the current workarounds look like. We’ll finish with a working example of an external plugin that adds new functionality to rebar with all the source code available on github.

The rebar-dist-plugin is a good example of how rebar plugins can add useful features to your build: it allows you to produce an archive for your project which can be distributed, rather than forcing people to use git/mercurial/etc to obtain and build your sources.

The plugin comes with some pre-defined assemblies (which are the plugin’s unit of configuration) for packaging up a rebar generated release, or project (i.e., the ebin, include and priv directories). Future releases will add other pre-packaged options such as sources, docs and so on.

Using the plugin is pretty simple, and there is some documentation on the project wiki which is mostly up to date.

I started playing with this today, and have come up with a sample application here. The basic concept behind rebar plugins is fairly simple: you refer to them in your rebar.config and they get hooked into the build at execution time. Naturally rebar (which is executed via escript) needs to be able to find the beam code for these (plugin) modules on the code path, so if you’re putting one together specifically for a project, you’ll need to take advantage of rebar’s sub_dirs support in order to pre-compile them before the rest of your code. The sample project does just that, by compiling the build project prior to the rest of the sources. Including it in your lib_dirs also ensures it is on the code path.

So what can you do with your plugins? Plugins do not participate in rebar’s preprocess stage, so they cannot run in isolation from the core (internal) rebar modules – edit: as of a while back, plugins do in fact participate in the pre and post processing via the same callbacks as built in modules. Check out some of my later posts, or better still head over to http://hyperthunk.github.com/rebar-plugin-tutorial/ for more details.

In practice, this means that your plugin can do one of two things:

Hook into an existing command (such as ‘compile’), or

Expose a new command (such as ‘frobble’ in the example code on github)

The second approach comes with (yet more) caveats though: new, custom commands cannot run in isolation. I suspect this is because plugins do not participate in the preprocess stage, or that they’re excluded from the code that identifies modules willing to handle a given command, or both. This means that the rebar_frobble plugin from the example project runs in two contexts:

During execution of the ‘compile’ command, after the other (internal) modules have handled it

After execution of the ‘compile’ command, during execution of the ‘frobble’ command

In practice, this means you can run [rebar compile frobble], but not [rebar frobble]. [UPDATE] If you referenced the plugin in the top level rebar.config, it would remedy this situation, but you don’t always want to do that. This isn’t very intuitive and I suspect the developers may decide to clarify (or change) this behaviour in future. Despite the slightly confusing execution profile, rebar plugins are a very neat way of customising your rebar build. With full access to all the rebar internal modules, as well as the current (local and global) configuration, the plugin author has a lot of flexibility and power at their fingertips. Naturally with great power comes great responsibility, and plugin authors should consider carefully the use of exports besides published command names and their command/2 function interfaces.
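To illustrate the two uses, here is a minimal sketch of a plugin that both hooks into compile and exposes a frobble command. This is not the code from the example project; the function bodies are invented for illustration, and the argument names are my own.

```erlang
%% Illustrative sketch only, not the rebar_frobble module from the sample
%% project on github.
-module(rebar_frobble).
-export([compile/2, frobble/2]).

%% Hooks into the existing 'compile' command.
compile(_Config, _AppFile) ->
    io:format("rebar_frobble: hooked into compile~n"),
    ok.

%% Exposes the new 'frobble' command.
frobble(_Config, _AppFile) ->
    io:format("rebar_frobble: frobbling~n"),
    ok.
```

Keeping the export list down to the command functions, as here, is one way of following the advice about exports above.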

A recent post on the erlang-questions mailing list got me thinking about the way that I manage multiple (concurrent) versions of Erlang/OTP at the moment. This only works on unix-like operating systems, but it has been useful until now.

Basically, I choose a common folder, which on OSX tends to be ~/Library/Erlang and somewhere similar on other *nixes. Under this directory I keep a subdirectory into which multiple ERTS versions can be installed and another site directory into which common/shared libraries and applications can be installed.

I then set my $ERL_LIBS environment variable to the site directory and symlink the current folder as I wish. I also configure tools like epm and/or sutro to use the site directory as their target install folder, giving me a consistent way to install things.

The main thing lacking from this approach is that I have no control over which libs/apps in the site directory are compatible with which installed versions of ERTS. A good solution to this, one that doesn’t force me to use an entire tool-chain in order to take advantage of it, sounds very promising.

This will generally still fail at runtime unless you rename (or symlink to) the .dylib you’ve created so that your shared library has the .so extension, for which the erts code is explicitly looking. Caveat: this last point may have been fixed in recent Erlang/OTP releases, but I’m a little out of touch! Using rebar to build your port driver sources circumvents this naming issue either way.