Progress in 2018

This section will review the highlights of what we accomplished over the last year. These highlights are not exhaustive and I focus on improvements that might encourage people to revisit the language if they were on the fence a year ago.

If you’re already familiar with recent progress in the language and you are more interested in where the language is going then you can jump to the Future direction section.

New language bindings

Several contributors stepped up to the plate to begin three new actively-maintained language bindings to Dhall.

Of these three the Clojure bindings are the ones closest to completion:

The Clojure bindings are sufficiently close to completion that they currently get an official vote on proposed changes to the language standard, giving them an equal voice in the language evolution process.

This is a complete reimplementation of the language entirely in Clojure that allows you to marshal Dhall expressions, including Dhall functions, directly into Clojure:

Haskell - Cabal

This part of the project’s README sold me on the motivation for doing so:

We can go beyond Cabal files. If Cabal is a domain specific language for building Haskell projects, what does a domain specific language for building Haskell web applications look like? Does the separate of library, executable, and test-suite make sense here? Maybe we’d rather:

When you think about this it makes perfect sense: Haskell programmers use Cabal/Hackage to package and distribute Haskell code, but then what do they use to package and distribute “Cabal code”? The answer is a language like Dhall that builds in its own code distribution mechanism instead of relying on a separate build tool. This closes the loop so that you don’t need to maintain a growing tower of build tools as your project expands.

I’m pretty sure Cabal was the first “heavy duty” configuration format tested with Dhall because this project prompted the first swell of feature requests related to interpreter performance and usability improvements for working with giant schemas.

Also, the dhall-to-cabal project includes the entire Cabal schema encoded as a Dhall type, which you can find here:

This comes in handy when you want a systematic listing of all Cabal configuration features. If you have the dhall interpreter installed you can also view the normal form of the schema in all its glory by running:

You can also migrate an existing project using cabal-to-dhall, a tool which converts a .cabal file to the equivalent .dhall file.

Eta

Javier Neira with the support of TypeLead added Dhall as a supported file format for configuring Eta packages by building on top of the dhall-to-cabal project.

That project has also produced work-in-progress Eta and Java bindings to Dhall bindings along the way by using Eta to compile the Haskell implementation of Dhall to the JVM. When those are complete you will have yet another option for using the Dhall configuration language on the JVM

If you are interested, you can follow the progress on those bindings via this GitHub issue:

… and users can easily import that and easily override packages locally without dealing with the headache of rebasing their local changes whenever the upstream package set changes.

Complete language standard

Last year I promised to upstream all features from the Haskell implementation into the language standard, since at the time a few import-related features were implementation-defined. This was a top priority because I didn’t want to treat other language bindings as second-class citizens.

This year we successfully standardized everything, meaning that there should no longer be any implementation-defined features. Additionally, all new functionality now begins with a change to the standard followed by a change to each implementation of the language, meaning that the Haskell implementation is no longer treated as a distinguished implementation.

Unsigned Natural literals

Earlier this year Greg Pfeil proposed to fix the obligatory + sign preceding Natural number literals, which bothered a lot of newcomers to the language. He proposed require a leading + for non-negative Integers instead of Natural numbers.

We knew this would be a highly breaking change, but we were all tired of the + signs which littered our code. The Natural type is much more natural to use (pun intended) than the Integer type, so why not optimize the syntax for Natural numbers?

Simpler Optional literals

In particular, a present Optional literal no longer requires a type since Some can infer the type from the provided argument. This simplifies the common idiom of overriding an absent Optional value within a record of defaults:

Union constructors

My first attempt to improve this introduced the constructors keyword which took a union type as an argument and returned a record of functions to create each constructor. This changed the above code example to:

This solved both the performance issue (by eliminating the need for an intermediate record of constructors) and eliminated the constructors keyword boilerplate. Also, this simplification plays nicely with the auto-complete support provided by the dhall repl since you can now auto-complete constructors using the . operator.

Import caching

In 2017 Dhall added support for semantic integrity checks, where you tag an import with a SHA256 hash of a standard binary encoding of an expression’s normal form. This integrity check protects against tampering by rejecting any expression with a different normal form, guaranteeing that the import would never change.

Several astute users pointed out that you could locally cache any import protected by such a check indefinitely. Even better, the SHA256 hash makes for a natural lookup key within that cache.

We standardized and implemented exactly that idea and now any import protected by an integrity check is permanently cached locally using the standard directory prescribed by the XDG Base Directory Specification.

For example, you can now import the entire Prelude as a package protected by an integrity check:

dhall lint will also automatically simplify any code using the old nested let style to use the new “multi-let” style.

Statically linked executables

The Haskell implementation of Dhall strives to be like the “Bash of typed functional programming”, but in order to do so the implementation needs to small, statically linked, and portable so that sysadmins don’t object to widely installing Dhall. In fact, if the executable satisfies those criteria then you don’t even need your sysadmin’s permission to try Dhall out within your own workspace.

Major performance improvements

… as well as Formation’s internal use of Dhall which has led to them upstreaming many performance improvements to handle large Dhall programs.

Thanks to the work of Fintan Halpenny, Greg Pfeil, @quasicomputational, and others the Haskell implementation is between 1 to 3 orders of magnitude faster than it was a year ago, depending on the configuration file that you benchmark.

We’re also not done improving performance! We continue to improve as new projects continue to stretch the boundaries of what the language can do.

Type diffs

Large projects like these also led to usability improvements when working with gigantic types. The Haskell implementation now displays concise “type diffs” whenever you get a type mismatch so that you can quickly narrow down the problem no matter how much your configuration schema grows. This works no matter how deeply nested the error is.

For example, the following contrived example introduces four deeply nested errors in a gigantic schema (where the type is over 6000 lines long) and the error message still zeroes in on every error:

dhall repl

The Haskell implementation also added a REPL contributed by Oliver Charles that you can use to interactively interpret Dhall code, including sophisticated auto-completion support contributed by Basile Henry:

The REPL comes in handy when exploring large values or types, as illustrated by the dhall-nethack tutorial which uses the REPL:

dhall freeze

Thanks to Tobias Pflug you can also automatically take advantage of Dhall’s semantic integrity checks using the dhall freeze subcommand. This command fetches all imports within a Dhall expression and then automatically tags all of them with semantic integrity checks.

Future direction

Phew! That was a lot to recap and I’m grateful to all the contributors who made that possible. Now we can review where the language is going.

First, I’m no longer benevolent dictator-for-life of the language. Each new reimplementation of the language gets a vote on the language standard and now that the Clojure implementation of Dhall is essentially complete they get an equal say on the evolution of the language. Similarly, once the PureScript bindings and Python bindings are close to complete they will also get a vote on the language standard, too.

However, I can still use this post to outline my opinion of where the language should go.

Crossing the Chasm

A colleague introduced me to the book Crossing the Chasm, which heavily influenced my approach to designing and marketing the language. The book was originally written for startups trying to gain mainstream adoption, but the book also strongly resonated with my experience doing open source evangelism (first for Haskell, and now Dhall).

The book explains that you need to first build a best-in-class solution for a narrowly-defined market. This in turn requires that you think carefully about what market you are trying to address and strategically allocate your limited resources to address that market.

So what “market” should Dhall try to address?

YAML

One of the clearest signals I’ve gotten from users is that Dhall is “the YAML killer”, for the following reasons:

Dhall solves many of the problems that pervade enterprise YAML configuration, including excessive repetition and templating errors

Dhall still provides many of the good parts of YAML, such as multi-line strings and comments, except with a sane standard

Dhall can be converted to YAML using a tiny statically linked executable, which provides a smooth migration path for “brownfield” deployments

Does that mean that Dhall is clearly the best-in-class solution for people currently using YAML?

Not quite. The key thing Dhall is missing for feature parity with YAML is a wide array of native language bindings for interpreting Dhall configuration files. Many people would prefer to use Dhall without having to invoke an external executable to convert their Dhall configuration file to YAML.

This is one of the reasons I’ve slowed down the rate of evolution of the standard so that many of the new language bindings have an opportunity to implement the full standard. Also, I expect that once more language bindings have votes on the standard evolution that will further stabilize the language since new features proposals will have a higher bar to clear.

That’s not to say that we will freeze the language, but instead we will focus on strategically spending our “complexity budget” on features that help displace YAML. If we spend our complexity budget on unrelated features then we will increase the difficulty of porting Dhall to new languages without addressing the initial use case that will help Dhall gain mainstream traction.

JSON integration

One of YAML’s features is that all JSON is also valid YAML, by definition. In fact, some people use YAML just for the fact that it supports both JSON and comments.

This suggests that Dhall, like YAML, should also natively support JSON in some way. Dhall’s issue tracker contains a few issues along these lines and the one I would most like to see completed this year is adding support for importing JSON files as Dhall expressions:

Editor support

Another thing Dhall is missing compared to YAML is widespread editor support. This is why another one of my goals for this year is to create a Dhall language server so that any editor that supports the language server protocol (basically all of them) would get Dhall support for free.

Ops

We can actually narrow down Dhall’s “market” further if we really want to be selective about what we work on. Dhall has also grown in popularity for simplifying ops-related configurations, providing several features that ops engineers care about:

Strong normalization

Ops commonly suffers from the dilemma that too much repetition is error prone, but too much abstraction is also error prone if readers of the code can’t effectively audit what is going on. One of Dhall’s unique features is that all code is strongly normalizing, meaning that every expression can be reduced to an abstraction-free normal form. This is made possible by the fact that Dhall is not Turing-complete (another feature favored by Ops).

Absolute type safety

Ops engineers care about reliability since they maintain the software that the rest of their company relies on and any outages can have devastating effects on both the product and the productivity of other engineers.

This is one of the reason for the Ops flight from Turing-complete languages to inert configuration files like JSON/YAML because Turing-complete languages give you the tools to shoot yourself in the foot. However, Dhall strikes a balance between being programmable while still not being Turing-complete and having a type system with no escape hatches, so you’re incapable of shooting yourself in the foot.

Built-in support for importing code

Another reason that Ops people hate programmable configuration files is that the programming language they pick typically comes with an external build tool for the language that adds one more layer to the tower of build tools that they have to maintain. Now they’ve just replaced one problem (a repetitive configuration file for their infrastructure) which a new problem (a repetitive configuration file for the build tool for the programming language they used to reduce the original repetition).

Dhall solves this problem well by providing built-in language support for importing other code (similar to Bash and Nix, both also heavily used for Ops use cases). This means that Dhall provides a solid foundaton for their tower of automation because they don’t need to introduce another tool to support a growing Dhall codebase.

Dhall displaces YAML well

YAML configuration files are incredibly common in Ops and “infrastructure as code”. Example tools that use a YAML configuration are:

Kubernetes

Docker Compose

Concourse

Ansible

Travis

YAML is so common that Ops engineers sometimes half-jokingly refer to themselves as “YAML engineers”.

We’ve already seen one Dhall integration for an Ops tool emerge last year with the dhall-kubernetes project and this year I hope we continue along those lines and add at least one more Ops-related integration.

I think the next promising integration is the dhall-terraform project which is still a work in progress that would benefit from contributions.

Funding

Finally, I would like to experiment with various ways to fund open source work on Dhall now that the language has a growing userbase. In particular, I’d like to fund:

additional language bindings

better editor support

adding CI support for statically linked Windows and OS X binaries

packaging Dhall for various software distributions (i.e. .rpm/.deb)

… and I’d like to provide some way to reward the work of people who contribute beyond just acknowledging their work in posts like this one.

That’s why one of the survey questions for this year asks for suggestions on what would be the most appropriate (non-proprietary) funding model for that sort of work.

Conclusion

Hopefully that gives people a sense of where I think the language is going. If you have any thoughts on the direction of the language this would be a good time to take the survey: