One problem that Makefile writers sometimes have is the need to write a single rule that produces multiple output files in order to accommodate tools that don't fit the standard one-command-one-output model generally assumed by Make. Eric Melski takes a look at a few alternatives, including the one and only way to truly capture the relationship in GNU Make syntax.

One problem that Makefile writers sometimes have is the need to write a single rule that produces multiple output files in order to accommodate tools that don't fit the standard one-command-one-output model generally assumed by Make. The classic example is bison, a parser generator used in crafting compilers and interpreters. Bison takes an input file like parser.i and generates both parser.c and parser.h. The Makefile hacker is left with a dilemma: how do you express this relationship in GNU Make syntax? In this article we'll look at the obvious answer and why it is wrong. Then we'll look at a few alternatives, including the one and only way to truly capture the relationship in GNU Make syntax.

The obvious but wrong solution

Faced with this problem, many Makefile hackers will write something like this:
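A minimal sketch of such a Makefile, assuming that echo and touch stand in for the real bison invocation (the recipe lines are illustrative, not the author's exact listing):

```makefile
# Hypothetical stand-in for a bison step: echo reports the work,
# touch creates or updates both output files.
all: parser.c parser.h

# Two targets on one line: make reads this as two separate rules
# that happen to share the same recipe.
parser.c parser.h: parser.i
	@echo Generating parser.h and parser.c from parser.i
	@touch parser.c parser.h
```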

Unfortunately this Makefile does not describe a single rule with two outputs, but rather two distinct rules that each have a single output, and that happen to use the same series of commands. In a serial build this distinction is often irrelevant and sometimes even undetectable: although GNU Make will schedule both rules to run, the second rule will never do any work because its output file will already have been updated (by the first rule). But try running this build in parallel with gmake -j 2:

Generating parser.h and parser.c from parser.i
Generating parser.h and parser.c from parser.i

Because there are two distinct rules which each update both output files, the files are actually updated twice. In the best case, this just results in a little wasted work. In the worst case, the rules both try to update the output files at the same time, resulting in corrupted output. So, this approach will work if you only ever run serial builds, but nobody does that these days. So what's a Makefile hacker to do?

A Crude Fix

Our first attempt at fixing the problem is to add a dependency between the two output files. This is a bit crude since there is no actual dependency between the files, but at least it will ensure that the rules run one at a time:
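A minimal sketch of one way to add such a dependency, assuming the shared recipe from the first example is kept. The direction of the extra dependency (parser.c on parser.h) and the echo/touch recipe are assumptions made for illustration:

```makefile
all: parser.c parser.h

# Still two rules sharing one recipe (echo/touch stand in for bison).
parser.c parser.h: parser.i
	@echo Generating parser.h and parser.c from parser.i
	@touch parser.c parser.h

# Artificial dependency (assumed direction): forces make to finish
# parser.h before it starts on parser.c, so the two never overlap.
parser.c: parser.h
```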

This modification has made the makefile parallel-safe, but it has introduced a surprising side-effect: even in a serialized build, the files are now generated twice. Go ahead and try it yourself. This is a result of the particular dependency graph algorithms that GNU Make employs. But here's what will really bake your noodle: if you reverse the order in which the output files are listed as prerequisites of all, suddenly the files are generated only once, because those algorithms are very sensitive to the order in which dependencies are declared. So this approach seems to work, but it's too brittle to be considered seriously.

Another attempt

It seems that we might be able to fix some of the problems with our first crude attempt by rewriting the makefile so that there are no commands for one of the two targets:
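A minimal sketch of this variant, assuming parser.h is the target left without commands and that echo and touch again stand in for bison:

```makefile
all: parser.c parser.h

# The only rule with a recipe; it produces both files.
parser.c: parser.i
	@echo Generating parser.h and parser.c from parser.i
	@touch parser.c parser.h

# No recipe here: parser.h is assumed to appear as a side effect
# of building parser.c.
parser.h: parser.c
```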

At first this seems pretty good. Now there is only one rule that can update the files, so there's no risk of duplicating work or corrupting outputs. In a from-scratch build, this construct will work fine. But it has some trouble in incremental builds. What happens if you delete just parser.h, but not parser.c? Since there are no commands specified for generating parser.h, make will not know how to produce it, and since parser.c is already up-to-date, it will not run the commands that would generate the files. Even if you explicitly ask make to build parser.h, by running "gmake parser.h", you're stuck:

gmake: Nothing to be done for `parser.h'.

If you only do from-scratch full builds, this solution may work for you, but if you do incrementals as well, use caution.

Dummy targets

Another approach is to use a dummy target to do the work, and have the actual output files depend on the dummy:
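A minimal sketch of the dummy-target arrangement, using the generate_parser name from the discussion that follows (the echo/touch recipe is an illustrative stand-in for bison):

```makefile
all: parser.c parser.h

# The real outputs depend only on the dummy target.
parser.c parser.h: generate_parser

# The dummy target does the work but never creates a file
# named generate_parser.
generate_parser: parser.i
	@echo Generating parser.h and parser.c from parser.i
	@touch parser.c parser.h
```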

This looks promising: we have a single rule that generates both files, so it will work correctly in a parallel build. But this approach has one significant drawback: it breaks the relationship between the input file and the output files derived from it. We no longer have a dependency that says, "if parser.i is newer than parser.c or parser.h, rebuild those files." Instead, we have a dependency that says, "if parser.i is newer than 'generate_parser' (whatever that is), rebuild it". This makefile will rebuild parser.c and parser.h every time it is run, because make is comparing the times on parser.c and parser.h with generate_parser. Since generate_parser doesn't exist, make will run that rule. It doesn't matter if parser.i is older than parser.c and parser.h, because there is no direct relationship between those files in this makefile.

We can work around this by changing the generate_parser rule so that it also creates a file on disk named "generate_parser"; then on an incremental build, make will see that the file "generate_parser" is newer than parser.i and will not rebuild. But this is messy: we'll have an extra file hanging around that serves no purpose other than to work around a deficiency in the build tool, and we need to remember to manage that file along with the other outputs of the build. It should be deleted by "make clean", for example. And if somebody does something like "touch generate_parser" in between builds, then make may not be able to correctly detect that parser.c and parser.h must be rebuilt. As with the previous solution, if you only do from-scratch full builds, this solution will work fine, but with incrementals you need to be careful.
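A minimal sketch of that workaround, assuming the dummy rule simply touches a file named generate_parser as its last step. The clean target and the echo/touch recipe are illustrative additions:

```makefile
.PHONY: all clean

all: parser.c parser.h

parser.c parser.h: generate_parser

# The recipe now leaves a marker file behind, so make can compare
# its timestamp against parser.i on the next run.
generate_parser: parser.i
	@echo Generating parser.h and parser.c from parser.i
	@touch parser.c parser.h
	@touch generate_parser

# The marker file has to be cleaned up along with the real outputs.
clean:
	rm -f parser.c parser.h generate_parser
```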

The GNU Make "right" way

In GNU Make syntax there is really only one correct way to get a single rule with multiple outputs, and that is to use a pattern rule:
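A minimal sketch of such a pattern rule, assuming echo and touch stand in for bison. The automatic variables `$*` (the stem, here parser) and `$<` (the first prerequisite) are used so that the recipe is not tied to one file name:

```makefile
all: parser.c parser.h

# A pattern rule with two target patterns: make treats this as ONE
# rule whose single recipe run produces both files.
%.c %.h: %.i
	@echo Generating $*.h and $*.c from $<
	@touch $*.c $*.h
```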

In direct contrast to the first example, this actually will create a single rule that creates two outputs. If you run with gmake -j 2, you'll see the files are updated only once:

Generating parser.h and parser.c from parser.i

This is the only construct that produces the correct dependency graph and behavior. Unfortunately, it has one significant shortcoming: It requires that the input and all the outputs share a common stem, such as parser in our simple examples, so it's not as flexible as we'd like it to be. Still, if your files do fit this restriction, then this is the best solution for you.

Conclusion

Now you know some of the ways to create a makefile that generates multiple outputs from a single command. If possible, you should use a pattern rule with multiple outputs. If that doesn't work for you, hopefully one of the alternatives I've shown you will.

About the Author

Eric Melski was part of the team that founded Electric Cloud and is now Architect for ElectricAccelerator. Before Electric Cloud, he was a Software Engineer at Scriptics, Inc. and Interwoven. He holds a BS in Computer Science from the University of Wisconsin.

Eric is Chief Architect for ElectricAccelerator, a high-performance implementation of make from Electric Cloud, Inc. He obtained a BS in Computer Science from the University of Wisconsin in Madison in 1999. In 2002 Eric co-founded Electric Cloud, where he has spent more than a decade developing distributed, parallel systems designed to accelerate build processes. He is named on seven patents related to his work on build acceleration at Electric Cloud.