Task graph

Continuing from the build definition page,
this page explains the build.sbt definition in more detail.

Rather than thinking of settings as key-value pairs,
a better analogy is to think of them as a directed acyclic graph (DAG)
of tasks where the edges denote happens-before relationships. Let’s call this the task graph.

Terminology

Let’s review the key terms before we dive in.

Setting/Task expression: An entry inside .settings(...).

Key: The left-hand side of a setting expression. It could be a SettingKey[A], a TaskKey[A], or an InputKey[A].

Setting: Defined by a setting expression with SettingKey[A]. The value is calculated once during load.

Task: Defined by a task expression with TaskKey[A]. The value is calculated each time it is invoked.
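
As a rough illustration of these terms, the sketch below shows one setting expression and one task expression in a build.sbt. The hello key and the project name are hypothetical and exist only for this example; name and streams are standard sbt keys.

```scala
// build.sbt (sketch; `hello` is a hypothetical key defined only for illustration)
lazy val hello = taskKey[Unit]("Prints a greeting.")

lazy val root = (project in file("."))
  .settings(
    // Setting expression: `name` is a SettingKey[String], calculated once during load.
    name := "task-graph-demo",
    // Task expression: `hello` is a TaskKey[Unit], calculated each time it is invoked.
    hello := {
      val log = streams.value.log
      log.info("Hello from " + name.value)
    }
  )
```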

Declaring dependency to other tasks

In the build.sbt DSL, we use the .value method to express a dependency on
another task or setting. The value method is special and may only be
called in the argument to := (or += or ++=, which we’ll see later).

As a first example, consider defining scalacOptions so that it depends on
the update and clean tasks. All three keys are defined in Keys.

Note: The values calculated in this example are nonsensical for scalacOptions
and are for demonstration purposes only.
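
The rewiring described here could look roughly like the following sketch. It assumes sbt 1.x, where allConfigurations yields ConfigRef values rather than strings, which is why the result is mapped to names; the variable names ur and x are arbitrary.

```scala
// Key types, as declared in sbt's Keys (already in scope in build.sbt):
//   scalacOptions: TaskKey[Seq[String]]
//   update:        TaskKey[UpdateReport]
//   clean:         TaskKey[Unit]

scalacOptions := {
  val ur = update.value  // update task happens-before scalacOptions
  val x = clean.value    // clean task happens-before scalacOptions
  // ---- the body of scalacOptions begins here ----
  // Assumption: on sbt 1.x this is a Vector[ConfigRef], so convert to names.
  ur.allConfigurations.take(3).map(_.name)
}
```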

update.value and clean.value declare task dependencies,
whereas ur.allConfigurations.take(3) is the body of the task.

.value is not a normal Scala method call. The build.sbt DSL
uses a macro to lift these calls outside of the task body.
Both the update and clean tasks are completed
by the time the task engine evaluates the opening { of scalacOptions,
regardless of which line they appear on in the body.
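
To see the lifting in action, one could wrap a .value call in a branch that never executes, as in the sketch below. This is an illustration under assumptions: it targets sbt 1.x, whose task linter is expected to reject .value inside an if unless the call is annotated with @sbtUnchecked.

```scala
scalacOptions := {
  val ur = update.value   // update task happens-before scalacOptions
  if (false) {
    // Still a dependency: the macro lifts this call out of the `if`,
    // so clean runs even though this branch is never taken.
    // Assumption: the annotation is needed to silence sbt 1.x's task linter.
    val x = (clean.value: @sbtUnchecked)
  }
  ur.allConfigurations.take(3).map(_.name)
}
```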

Even if clean.value is placed inside an if (false) block in the body of scalacOptions,
the clean task still runs as a dependency. If you check for target/scala-2.12/classes/
after evaluating scalacOptions, it won’t exist.

Another important thing to note is that there’s no guarantee
about the ordering of the update and clean tasks.
The task engine might run update then clean, clean then update,
or both in parallel.

Inlining .value calls

As explained above, .value is a special method that is used to express
the dependency to other tasks and settings.
Until you’re familiar with build.sbt, we recommend you
put all .value calls at the top of the task body.

However, as you get more comfortable, you might wish to inline the .value calls
because it could make the task/setting more concise, and you don’t have to
come up with variable names.
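
For instance, the earlier scalacOptions definition could be written with the update call inlined, as in this sketch (again assuming sbt 1.x, hence the mapping from ConfigRef to names):

```scala
scalacOptions := {
  val x = clean.value
  // update.value is used inline, with no intermediate variable.
  update.value.allConfigurations.take(3).map(_.name)
}
```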

Note that whether .value calls are inlined or placed anywhere else in the task body,
they are still evaluated before entering the task body.

Inspecting the task

In the above example, scalacOptions has a dependency on
update and clean tasks.
If you place the above in build.sbt,
run the sbt interactive console, and type inspect scalacOptions, you should see
(in part) a Dependencies section that lists clean and update.

This is how sbt knows which tasks depend on which other tasks.
For example, if you run inspect tree compile you’ll see that it depends on another key,
incCompileSetup, which in turn depends on
other keys like dependencyClasspath. Keep following the dependency chains and magic happens.

val scalacOptions = taskKey[Seq[String]]("Options for the Scala compiler.")
val checksums = settingKey[Seq[String]]("The list of checksums to generate and to verify for dependencies.")

Note: scalacOptions and checksums have nothing to do with each other.
They are just two keys with the same value type, where one is a task and the other is a setting.

It is possible to compile a build.sbt that aliases scalacOptions to
checksums, but not the other way around. For example, this is allowed:

// The scalacOptions task may be defined in terms of the checksums setting
scalacOptions := checksums.value

There is no way to go the other direction. That is, a setting key
can’t depend on a task key. That’s because a setting key is only
computed once on project load, so the task would not be re-run every
time, and tasks expect to re-run every time.
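
The disallowed direction would look like the line below. It is left commented out because it is expected to be rejected when the build is loaded:

```scala
// Does not compile: the checksums setting cannot be defined
// in terms of the scalacOptions task.
// checksums := scalacOptions.value
```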

When you run make, it picks the target named all by default.
That target lists hello as its dependency, which hasn’t been built yet, so Make will build hello.

Next, Make checks if the hello target’s dependencies have been built yet.
hello lists two targets: main.o and hello.o.
Once those targets are created using the last pattern matching rule,
only then is the system command executed to link main.o and hello.o into hello.

If you’re just running make, you can focus on what you want as the target,
and the exact timing and commands necessary to build the intermediate products are figured out by Make.
We can think of this as dependency-oriented programming, or flow-based programming.
Make is actually considered a hybrid system because while the DSL describes the task dependencies, the actions are delegated to system commands.

Rake

This hybridity continues in Make’s successors such as Ant, Rake, and sbt.
In the basic Rakefile syntax, a task is declared with its name, a list of
prerequisite tasks, and a block of Ruby code for the action.

The breakthrough made with Rake was that it used a programming language to
describe the actions instead of the system commands.

Benefits of hybrid flow-based programming

There are several motivations for organizing the build this way.

First is de-duplication. With flow-based programming, a task is executed only once even when multiple tasks depend on it.
For example, even when multiple tasks along the task graph depend on compile in Compile,
the compilation will be executed exactly once.
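
A minimal sketch of de-duplication, using four hypothetical task keys (prepare, stepA, stepB, and both) that are not part of sbt:

```scala
// build.sbt (sketch; all four keys are hypothetical)
lazy val prepare = taskKey[Unit]("Shared step that other tasks depend on.")
lazy val stepA = taskKey[Unit]("Depends on prepare.")
lazy val stepB = taskKey[Unit]("Depends on prepare.")
lazy val both = taskKey[Unit]("Depends on stepA and stepB.")

prepare := {
  streams.value.log.info("prepare")
}

stepA := {
  val p = prepare.value  // prepare happens-before stepA
  streams.value.log.info("stepA")
}

stepB := {
  val p = prepare.value  // prepare happens-before stepB
  streams.value.log.info("stepB")
}

both := {
  val a = stepA.value    // stepA happens-before both
  val b = stepB.value    // stepB happens-before both
  streams.value.log.info("both")
}
```

Running both from the sbt shell should log prepare only once, even though two tasks depend on it. stepA and stepB do not depend on each other, so the task engine is free to run them in either order or in parallel.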

Second is parallel processing. Using the task graph, the task engine can
schedule mutually non-dependent tasks in parallel.

Third is the separation of concerns and flexibility.
The task graph lets the build user wire the tasks together in different ways,
while sbt and plugins can provide various features such as compilation and
library dependency management as functions that can be reused.

Summary

The core data structure of the build definition is a DAG of tasks,
where the edges denote happens-before relationships.
build.sbt is a DSL designed to express dependency-oriented programming,
or flow-based programming, similar to Makefile and Rakefile.

The key motivations for flow-based programming are de-duplication,
parallel processing, and customizability.