Designing for Pipeline Transformations

Pipeline transformations can be a wonderful mechanism for creating self-describing code. Many languages offer a built-in pipe operator, including Elixir, F#, Clojure¹, Hack, and shell languages.

The purpose of the pipe operator is to pass the output of the expression on its left as input to the expression on its right. This can make for a more eloquent way of communicating an expression.

Left-to-right

Below is the inverse: the direction data flows in an equivalent expression written without the pipe operator.

Right-to-left

This slugify function converts a raw string into a series of sanitized keywords delimited by dashes. This version, written in Python, is an example of right-to-left data flow.
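The original post's snippet is not reproduced here, but a right-to-left Python slugify might look like this sketch, where the steps must be read from the inside out:

```python
import re

def slugify(raw):
    # Read inside-out, right to left: lowercase, strip the
    # punctuation, trim, then join the words with dashes.
    return re.sub(r"\s+", "-", re.sub(r"[^a-z0-9\s]", "", raw.lower()).strip())

slugify("  Hello, Pipeline World!  ")  # "hello-pipeline-world"
```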

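The piped, left-to-right version is not shown in the extracted text either; as a sketch, the same steps threaded in Clojure might look like this:

```clojure
(ns example
  (:require [clojure.string :as str]))

(defn slugify [raw]
  ;; Data flows top to bottom, left to right.
  (-> raw
      str/lower-case
      (str/replace #"[^a-z0-9\s]" "")
      str/trim
      (str/replace #"\s+" "-")))

(slugify "  Hello, Pipeline World!  ")
;; => "hello-pipeline-world"
```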
You will likely find the piped version much more expressive and easier to read. The example above is written in Clojure, but as mentioned earlier, other languages also provide a built-in pipe operator. Below are equivalent expressions in Elixir and Bash.
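The original Elixir and Bash snippets are not reproduced here. As one sketch, a Bash version built from standard Unix pipes might look like this (an Elixir version would thread the same steps with its `|>` operator):

```shell
# Each stage reads left to right, passing text along stdin/stdout.
slugify() {
  echo "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | tr -cd 'a-z0-9 \n' \
    | sed -E 's/ +/-/g; s/^-+|-+$//g'
}

slugify "  Hello, Pipeline World!  "   # hello-pipeline-world
```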

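Another alternative to piping is to name every intermediate result. The original post's snippet is not shown; a Python sketch of that style:

```python
import re

def slugify(raw):
    # Every step of the transformation needs its own name.
    lowered = raw.lower()
    cleaned = re.sub(r"[^a-z0-9\s]", "", lowered)
    stripped = cleaned.strip()
    slug = re.sub(r"\s+", "-", stripped)
    return slug
```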
Not only does this version require us to think up a meaningful variable name for each step of the transformation, but it is also just as difficult to read as the original version.

The pipeline operator enables us to write code that is self-descriptive. We can simply let the function names describe the transformations without the need for naming the intermediate representations.

Okay, great! I've sold you on the pipe operator. Now, how do we design our APIs so that we can use the pipe operator to build out these pipeline transformations?

Below are a handful of things to consider when designing for pipeline transformations.

Transformations are made of transformations

First, keep in mind that each function in a pipeline is itself a transformation, and any one of these functions may contain its own pipeline transformation. This point is pretty simple, but it seems important enough to state.

What kind of transformation is needed?

Essentially, there are two types of transformations: A → A' and A → B.

A → A'

This kind of transformation occurs when the incoming data closely resembles the outgoing data. The slugify example is an instance of an A → A' transformation. We take some text and make a series of alterations to it. On the other side, we still have a string, just a slightly different string.

A → B

The other kind of transformation changes the shape of the data entirely. Suppose, for example, that we parse user input into a map of values. A string is what went in, but a map was returned. Now that we have a map, any subsequent transformation needs to use a different class of function to operate on this new data structure.
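A sketch of this second kind of transformation, where the type of the data changes (a hypothetical Python example; `parse_query` is my own name, not from the original post):

```python
def parse_query(qs):
    # A -> B: a string goes in, a map comes out.
    # "a=1&b=2" becomes {"a": "1", "b": "2"}.
    pairs = (pair.split("=", 1) for pair in qs.split("&") if pair)
    return {key: value for key, value in pairs}
```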

Identify the data under transformation

We need to know what data we are operating on. If we are in a controller of a web app, this might be some sort of a request struct. If we are inside the business logic of that web app, then it might be a domain model. Perhaps we are processing user input, and we are transforming a string.

As simple as it might seem, it is important to make this identification. This will inform our decision about the signature of the functions that will go in our pipeline.

Stick to a single argument

A single argument makes it easier to pipe data through functions. If our functions require multiple arguments, it takes more work to wire them up. Furthermore, extra arguments can take away from the expressiveness of the pipeline, adding information for the reader to process. These extra arguments also have a tendency to creep into collaborating functions.
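Python has no pipe operator, but a minimal `pipe` helper (my own sketch, not from the original post) shows the friction: single-argument functions drop straight into the pipeline, while a two-argument function must be partially applied first.

```python
from functools import partial, reduce

def pipe(value, *fns):
    # Minimal pipe helper: thread value through each function.
    return reduce(lambda acc, fn: fn(acc), fns, value)

def shout(s):
    return s.upper()

def wrap(s, char):  # needs a second argument
    return f"{char}{s}{char}"

# shout drops straight in; wrap has to be wired up with partial.
pipe("hi", shout, partial(wrap, char="*"))  # "*HI*"
```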

Yet, sometimes we need more than one piece of data in our transformation. In this scenario, we should attempt to unify our arguments.

Unify common arguments

There are times when our pipelines will need more than one argument. In these cases, consider collecting them into a map or struct.

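The original post's code is not shown in this extract; here is a Python sketch of the situation being described (`app`, `present`, `foo`, and `bar` come from the text, while `load` is my own, and the pipeline is written as plain nested calls since Python has no pipe operator):

```python
def load(foo):
    # Hypothetical first step; only uses foo.
    return f"loaded:{foo}"

def present(data, bar):
    # Hypothetical last step; the only function that uses bar.
    return f"<{data}|{bar}>"

def app(foo, bar):
    # bar must be passed all the way into app just so it can
    # reach present, adding noise to the pipeline.
    return present(load(foo), bar)
```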
Take a look at the pipeline in the app function. The collaborating function present needs foo and bar. Consequently, we have to pass both of these into app, even though only one of the functions uses bar.

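With the arguments unified into a single map, the same sketch (again my own reconstruction, not the original code) might look like this:

```python
def load(args):
    # Each step takes and returns the same map of arguments.
    return {**args, "data": f"loaded:{args['foo']}"}

def present(args):
    return f"<{args['data']}|{args['bar']}>"

def app(foo, bar):
    # The pipeline itself now threads a single value.
    return present(load({"foo": foo, "bar": bar}))
```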
Take notice of the reduced noise inside the pipeline. This is a simple example, but imagine how complicated things could get if we needed to pass in many arguments.

This is not a silver bullet, though. We did not eliminate any complexity; we only moved it. By simplifying app, we pushed some complexity down into present. This is a trade-off worth weighing.

Facade Functions

Collecting all the arguments in a map is not always the right solution. Sometimes our transformation functions will need a one-off argument. For example, perhaps a string operation needs a regular expression.
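This is where a facade function can help: a single-argument wrapper that bakes in the one-off arguments so the pipeline only ever sees one-argument functions. A Python sketch (the name is hypothetical):

```python
import re

def squeeze_whitespace(text):
    # Facade: re.sub needs three arguments, but the pipeline
    # only ever sees this one-argument wrapper.
    return re.sub(r"\s+", " ", text)
```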

Clojurists will point out that the ->> and as-> macros exist exactly for this design shortcoming. These macros allow you to pipe the data through as the last argument (->>), or into whichever position is specified (as->).

Most languages do not offer this flexibility; they pipe data through to either the first argument or the last.

Conclusion

Once we have deemed that a pipeline transformation is the right tool for the scenario, we need to take care to design our functions so that they can work together. Identifying what data needs to be transformed, and the nature of the transformation, is an important first step. This will inform later decisions about the signatures of the collaborating functions. If possible, limit each function to a single argument, which might mean wrapping multiple arguments up in a data structure.

One might say that the pipe operator is really just a mechanism for presenting code. Yet changing the direction data flows can have a deep impact on readability and promote self-description in a piece of code.

However, the pipe operator also encourages a design that promotes a single data structure. This lines up well with the advice of Alan Perlis:

It is better to have 100 functions operate on one data structure than 10 functions on 10 data structures.

Keep these considerations in mind the next time you are designing a pipeline transformation with the pipe operator.

1. In Clojure, this operator is referred to as a "threading macro", because it "threads" the argument through the list of functions. This is some unfortunate naming: it has nothing to do with concurrency threads.