Designing Better Types

Today I want to talk about a programming paradigm that you may or may not have heard about before. Different people call it by different names, so let’s talk about what it is first and discuss names later. It is about using (static) types as first-class citizens that guide the development process: what this really comes down to is taking more advantage of the compiler. We want to leverage the compiler so that, as far as possible, it refuses to compile invalid programs.

For this article I’ll be using Java, because most people are fairly familiar with it, but this should be fairly easy to apply in any other statically typed language. I’m also using Lombok annotations to spare boilerplate, but I provide links to gists containing equivalent code without Lombok.

Getting started

Basic Example

Let’s go over a very basic example. If you declare a variable of one type and try to assign it a value of another, unrelated type, you would not expect the code to compile. For example:

String str = functionThatReturnsInt();

And you would be correct in this assumption. The compiler gives a type error, as the type required for the variable is not the one returned by the function. What if we explore leveraging this kind of error in more scenarios? Without much effort on our part, the compiler already tells us that some programs are not valid. If we actually put effort into designing the types, maybe we can get much more value out of compilation.

Let’s explore this with a simple builder for a very simple class. Here’s a Person class with two attributes:
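
The listing itself is not reproduced here, so the following is only a minimal sketch of what such a class and a conventional builder might look like. It is written without Lombok, and the field types (a String name and an int age) are assumptions made for illustration.

```java
// Sketch of a Person with a conventional fluent builder.
// The field types (String name, int age) are assumptions for illustration.
final class Person {
    private final String name;
    private final int age;

    Person(String name, int age) {
        this.name = name;
        this.age = age;
    }

    String getName() { return name; }
    int getAge() { return age; }
}

final class PersonBuilder {
    private String name; // stays null if name(...) is never called
    private int age;     // stays 0 if age(...) is never called

    PersonBuilder name(String name) {
        this.name = name;
        return this;
    }

    PersonBuilder age(int age) {
        this.age = age;
        return this;
    }

    // Nothing forces the caller to set either field before building.
    Person build() {
        return new Person(name, age);
    }
}
```

With this sketch, `new PersonBuilder().name("Ada").age(36).build()` and `new PersonBuilder().build()` both compile.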

However, the builder (if you think about it) does nothing to prevent its misuse. If you consider a person valid only when both the name and age functions have been called, not much stops the caller from getting it wrong. Barring sheer luck, the code will be used as intended only if the caller already knows the design pattern, or if the documentation is correct and the caller reads it. It is going to take a lot more code, but we can split each of the builder’s steps into a type that represents that step.
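
As a rough sketch of that idea (again without Lombok), each step can be a nested class whose constructor is private, so that a step can only be reached through the previous one. The step names `AgeStep` and `BuildStep` are hypothetical, as are the field types.

```java
// Sketch of a step builder: each step is a separate type exposing only the next call.
final class Person {
    private final String name;
    private final int age;

    Person(String name, int age) {
        this.name = name;
        this.age = age;
    }

    String getName() { return name; }
    int getAge() { return age; }
}

final class PersonBuilder {
    // First step: the only thing a fresh builder offers is name(...).
    AgeStep name(String name) {
        return new AgeStep(name);
    }

    // Second step (hypothetical name): reachable only through name(...).
    static final class AgeStep {
        private final String name;

        private AgeStep(String name) {
            this.name = name;
        }

        BuildStep age(int age) {
            return new BuildStep(name, age);
        }
    }

    // Last step (hypothetical name): the only type that offers build().
    static final class BuildStep {
        private final String name;
        private final int age;

        private BuildStep(String name, int age) {
            this.name = name;
            this.age = age;
        }

        Person build() {
            return new Person(name, age);
        }
    }
}
```

Here `new PersonBuilder().name("Ada").age(36).build()` compiles, while `new PersonBuilder().build()` and `new PersonBuilder().age(36)` do not, because those methods do not exist on the types involved.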

Now if you start from a new PersonBuilder() you can’t get the type that provides the build function without calling the name and age functions first. And since I placed the build steps in inner classes, it is not possible to start from any other step. The invocation chain has to follow the designed order. If you decide to split the steps into different files, you have to make sure that nobody can start from a new PersonBuilderStepAge(person) and pass in an incorrect person.

Outcomes and final steps

We did lose flexibility in the build order. The caller is forced to pass the name first and then the age, which was not the case before. This can be fixed. However, if we’re striving for type safety, supporting every build order while keeping every guarantee we’ve gained so far would mean that the number of classes grows exponentially with the number of fields of the class we’re building.

Continuing the development of this Builder, the last step would be to validate the given name and age and deal with invalid values. If you’re doing Object-Oriented Programming, this would probably mean throwing one or more Exceptions. If you’re doing Functional Programming, you’d return a data structure that encapsulates the error.
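
To make the second option concrete, here is a minimal sketch of a validation that returns a result instead of throwing. The `Validated` type, the `PersonValidator` helper, and the rules themselves (a non-blank name, a non-negative age) are all assumptions made for illustration; the final build step could delegate to something along these lines.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class Person {
    private final String name;
    private final int age;

    Person(String name, int age) {
        this.name = name;
        this.age = age;
    }

    String getName() { return name; }
    int getAge() { return age; }
}

// Hypothetical result type: holds either a value or a non-empty list of errors.
final class Validated<T> {
    private final T value;
    private final List<String> errors;

    private Validated(T value, List<String> errors) {
        this.value = value;
        this.errors = errors;
    }

    static <T> Validated<T> valid(T value) {
        return new Validated<T>(value, Collections.<String>emptyList());
    }

    static <T> Validated<T> invalid(List<String> errors) {
        if (errors.isEmpty()) {
            throw new IllegalArgumentException("at least one error is required");
        }
        // Defensive copy so the result cannot change after creation.
        return new Validated<T>(null, Collections.unmodifiableList(new ArrayList<String>(errors)));
    }

    boolean isValid() { return errors.isEmpty(); }

    T getValue() {
        if (!isValid()) {
            throw new IllegalStateException("no value, errors: " + errors);
        }
        return value;
    }

    List<String> getErrors() { return errors; }
}

final class PersonValidator {
    // The rules below are illustrative assumptions, not taken from the article.
    static Validated<Person> validate(String name, int age) {
        List<String> errors = new ArrayList<String>();
        if (name == null || name.trim().isEmpty()) {
            errors.add("name must not be blank");
        }
        if (age < 0) {
            errors.add("age must not be negative");
        }
        if (!errors.isEmpty()) {
            return Validated.invalid(errors);
        }
        return Validated.valid(new Person(name, age));
    }
}
```

The return type tells the caller that a Person may not come back, so the failure case has to be handled rather than discovered at runtime.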

Before further considerations, let’s note a couple of language limitations:

If you consider a Person with null values to be an incorrect instantiation, there’s not much we can do about that in Java besides explicitly making the aforementioned validation. Other languages that were designed with null safety in mind, such as Kotlin, automatically take care of this hassle.

The only way to have safety without losing method name expressivity is to create at least one class for each step. In other languages there might be ways to reduce the boilerplate. In Scala, for example, you can do so using Shapeless.

With that aside, where is the trade-off? Let’s note the differences of this approach. I’ll leave it to you to reflect on whether each point is a pro or a con.

Testing

We can only test correct paths, because incorrect ones do not compile. You could write a test with a hypothetical wrong usage and check that it doesn’t compile, but there would surely be other faulty usages that your test does not cover, so the value of such a test is limited.

Immutability

You might’ve noticed that this approach is only possible with immutability. In case it is not clear why, let me show you an example. Let’s create a Door that can be closed or open (assume it is somehow relevant for the rest of the program).
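
A minimal sketch of such a Door, with one immutable type per state, could look like the following. The `open` method on ClosedDoor is an assumption added for symmetry.

```java
// Each state is its own immutable type; a transition returns a new value
// of the other type instead of mutating the current one.
final class OpenDoor {
    ClosedDoor close() {
        return new ClosedDoor();
    }
}

final class ClosedDoor {
    // Assumed counterpart of close(), added for symmetry.
    OpenDoor open() {
        return new OpenDoor();
    }
}
```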

Now consider a sample usage where we have an OpenDoor and want to close it:

OpenDoor openDoor = new OpenDoor();
ClosedDoor closedDoor = openDoor.close();

Now if the close function changed state instead of returning a new ClosedDoor, the previously declared OpenDoor would now be a closed one. But we’ve been striving for correct types, and we cannot allow that! So the only way to achieve this is through immutability. If there’s no way to instantiate a class in an incorrect state and no way to change its state, there’s no way of going wrong! This statement doesn’t take runtime reflection into account, but there is pretty much nothing we can do about that.

Interactions with the real world

While this example was pure application logic, you can apply the same principles to types that interact with something outside of the program. For instance, let’s say you want to manage a resource, such as a file. We can create a type such that its usage would look like this:

new ManagedFile("path/to/file.txt").map(openFile -> {
    // read or write from it
});

Here the map function would try to open the file before executing the argument function and would make sure the file is closed afterwards. This means that the file, if opened, will always be closed later. As was the case with the Builder, the error handling would depend on your programming style.
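
One possible shape for such a type is sketched below. It assumes read-only access through a BufferedReader, an action that returns a value, and error handling by propagating IOException; the `FileAction` interface is hypothetical, and none of these choices come from the original design.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical functional interface: like Function, but allowed to throw IOException.
interface FileAction<R> {
    R apply(BufferedReader openFile) throws IOException;
}

final class ManagedFile {
    private final String path;

    ManagedFile(String path) {
        this.path = path;
    }

    // Opens the file, runs the action, and closes the file afterwards,
    // even if the action throws. Nothing is opened until map is called.
    <R> R map(FileAction<R> action) throws IOException {
        try (BufferedReader reader =
                 Files.newBufferedReader(Paths.get(path), StandardCharsets.UTF_8)) {
            return action.apply(reader);
        }
    }
}
```

With this sketch, `new ManagedFile("path/to/file.txt").map(openFile -> openFile.readLine())` returns the first line of the file, and the caller never holds an open handle that it could forget to close.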

Inherent state

What about inherently stateful structures, such as a Thread Pool? You have two possible paths here. One is searching for a stateless alternative, which does exist in a lot of cases. The other, so that your type signatures do not lie, is to expose in each signature that multiple possible states exist. In the Thread Pool example, the execute function, which receives the computation to be run on one of the threads, would have to describe in its signature that the computation may not be processed (in case the pool has been shut down). You can think of it as a validation that the pool is in the intended state. But it must be an explicit validation, so that the caller is able to deal with it.
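
As a sketch of the second path, the wrapper below puts the "may not be processed" case into the return type by returning an Optional. The `SafeThreadPool` name and the choice of Optional are assumptions; it delegates to a standard ExecutorService, which rejects new tasks with a RejectedExecutionException once it has been shut down.

```java
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.RejectedExecutionException;

// Hypothetical wrapper whose signature admits that a task may be refused.
final class SafeThreadPool {
    private final ExecutorService delegate;

    SafeThreadPool(int threads) {
        this.delegate = Executors.newFixedThreadPool(threads);
    }

    // Optional.empty() means the pool did not accept the task,
    // for instance because it has already been shut down.
    <T> Optional<Future<T>> execute(Callable<T> task) {
        try {
            return Optional.of(delegate.submit(task));
        } catch (RejectedExecutionException e) {
            return Optional.empty();
        }
    }

    void shutdown() {
        delegate.shutdown();
    }
}
```

The caller cannot reach the Future without first going through the Optional, so the shut-down case has to be handled explicitly.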

Is this Type Driven Development?

If you’re doing Functional Programming, you can often see this named as Typed Functional Programming.