A first post, maybe?

Introduction

The type system of C# is something we all take for granted. Back in the old days it required pretty verbose code, which is one of the reasons why some people migrated to dynamically typed languages such as Python and JavaScript. These days, with automatic type inference, verbosity is not as big an issue anymore. And yet, the type system is as powerful a tool as ever. I like thinking about ways in which the type system can make our lives as developers easier instead of harder. We all know IntelliSense works a lot better for statically typed languages, but the possibilities go far beyond. During my research, I noticed that the functional programming community has made a lot more progress on this front than the object-oriented community, so I took quite a bit of inspiration from them.

Today I would like to present a concept that has been in many statically typed functional languages since version 1, but never made it fully featured into C#. I am talking about optional values that are explicitly encoded in the type system. Optional values have always been easy to implement in most mainstream object oriented languages using null, but – as I will explain in this blog post – this is not always a good idea. Fair warning for dynamic typing aficionados: this will not be your cup of tea.

History

Let us start with a short history lesson. It was Sir Charles Antony Richard “Tony” Hoare who invented null in 1965 when he was designing a type system for references in object oriented languages. Back then, he was already advocating strong type safety, meaning the compiler would assist the developer by checking for impossible operations. For a large part, he succeeded. After all, when did you last encounter a MissingMethodException without using reflection in a statically typed programming language? Unfortunately, null somehow still made its way into his design “because it was so easy to implement”. As we now know, many more recent object oriented languages such as C++, C# and Java inherited this concept without a second thought. With it also came the NullReferenceException, thrown when attempting to dereference a null reference. Contrary to the MissingMethodException, you have probably encountered this exception very recently if you are a professional C# developer.

Jumping forward more than four decades to 2009, Prof. Hoare publicly apologized for what he now calls his “billion dollar mistake”. He came up with this name when he imagined a world without null and without NullReferenceExceptions and how much money could have been saved compared to the current state of affairs. After all, exceptions occur at runtime whereas type errors are signaled at compile time. The difference between the two is that a runtime exception might slip by the QA department, while a compile time error will never reach the client – even if your organisation lacks a QA department.

I call it my billion-dollar mistake.

Problems and solutions

The fundamental problem with null is that it circumvents the type system; it is a backdoor in an otherwise safe system. In academic jargon, null has the “bottom type”. In the ideal world, any variable of reference type T can only contain references to values of type T. The real world is similar, except for the fact that such a variable can also contain null at any point in time. This is a big deal: to write foolproof code we must check for null every time we dereference a reference. This is tedious work and – understandably – few developers strictly follow this rule.

Take the following code as example:

//attempt 1
var station = person.Vehicle.Stereo.CurrentRadioStation;

This code is not safe because not every person has a vehicle and not every vehicle has a stereo system. In darker times (i.e. before C#6) we would solve this by programming defensively like so:
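A defensive version along these lines might look like the sketch below. The Person, Vehicle and Stereo classes are assumed here purely for illustration, as is the GetStation wrapper method.

```csharp
using System;

// Hypothetical domain classes, assumed for illustration only.
class Stereo { public string CurrentRadioStation { get; set; } }
class Vehicle { public Stereo Stereo { get; set; } }
class Person { public Vehicle Vehicle { get; set; } }

static class Attempt2
{
    //attempt 2: guard every dereference with an explicit null check
    public static string GetStation(Person person)
    {
        string station = null;
        if (person.Vehicle != null)
        {
            if (person.Vehicle.Stereo != null)
            {
                station = person.Vehicle.Stereo.CurrentRadioStation;
            }
        }
        return station;
    }

    static void Main()
    {
        var pedestrian = new Person(); // no vehicle at all
        Console.WriteLine(GetStation(pedestrian) ?? "no station"); // prints "no station"
    }
}
```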

The code is now safe (assuming person != null), but it comes at a tremendous cost. The decrease in readability – and consequently also maintainability – by going from one line to ten lines cannot be overstated. Extrapolating this to the whole solution, the code becomes very verbose and littered with statements that have little to do with what you are actually trying to accomplish.

Recently, the “safe navigation operator” ?. (officially called the null-conditional operator) was added to the C# language to complement the existing “null-coalescing operator” ??. Together they can make the code more readable again.
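With both operators, the lookup fits on one line again. The sketch below reuses the same hypothetical Person, Vehicle and Stereo classes, and the fallback text "N/A" is just an example value.

```csharp
using System;

// Hypothetical domain classes, assumed for illustration only.
class Stereo { public string CurrentRadioStation { get; set; } }
class Vehicle { public Stereo Stereo { get; set; } }
class Person { public Vehicle Vehicle { get; set; } }

static class Attempt3
{
    //attempt 3: ?. stops at the first null, ?? supplies a fallback
    public static string GetStation(Person person)
    {
        return person.Vehicle?.Stereo?.CurrentRadioStation ?? "N/A";
    }

    static void Main()
    {
        var pedestrian = new Person(); // no vehicle at all
        Console.WriteLine(GetStation(pedestrian)); // prints "N/A"
    }
}
```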

An alternative solution

There we have it: safe and readable code. What more could we possibly want? To answer that question, let us look at an alternative way of accomplishing the same result that is already commonplace in C#. Consider value types, those defined by structs, and how they deal with optional values. A variable with a value type T cannot be assigned the null reference for the simple reason that it cannot contain references, only values. If left unassigned, such a variable will contain default(T) instead. However, there are valid use cases where optional structs are needed. To accommodate this, the C# language designers introduced a generic struct: Nullable of T where T : struct. At the same time, they introduced some syntactic sugar that shortens Nullable of T to T?.

When you pass a DateTime struct into a method, you can be sure that it has a year, month and day, among other properties. Even better, when you pass in a DateTime?, IntelliSense will tell you that those same properties are not directly available. This prevents the programmer from making a potential mistake. Instead, the developer should first check the HasValue property and only if it is true, access the Value property under which the original properties reside. Some perceive this as cumbersome, but I consider it to be an integral part of type safety. A good rule of thumb when designing any new system is to try and make illegal states unrepresentable. This is a great small-scale implementation of this rule.

Consider the following example:

DateTime? GetDueDate() { ... }
void Print(DateTime date) { ... }

DateTime? date = GetDueDate();
Print(date);

This code will not compile because DateTime? cannot implicitly be converted into DateTime. This is great: the compiler protects us against potential mistakes! Instead, we have to do something like this:
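A minimal sketch of such a check follows. The bodies of GetDueDate and Print are placeholders, and the TryPrint helper is a name I made up for this example.

```csharp
using System;
using System.Globalization;

static class DueDateExample
{
    // Placeholder body (assumption): pretend no due date has been set.
    static DateTime? GetDueDate() { return null; }

    static void Print(DateTime date)
    {
        Console.WriteLine(date.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture));
    }

    // Unwraps the optional date only after checking HasValue.
    public static bool TryPrint(DateTime? date)
    {
        if (date.HasValue)
        {
            Print(date.Value); // date.Value is a plain DateTime
            return true;
        }
        return false;
    }

    static void Main()
    {
        DateTime? date = GetDueDate();
        if (!TryPrint(date)) Console.WriteLine("No due date.");
    }
}
```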

Imagine that DateTime were a class instead of a struct. The compiler would not have complained at all, and it would be up to the developers to detect their mistake. This is something I still miss in C#6 even though we now have the safe navigation operator. That operator solves the readability problem, not the type safety problem.

In summary:

                           Safe   Readable   Type safe
Naive                             X
If guards                  X
Safe navigation operator   X      X
Nullable structs           X      +/-        X

The C# language designers cannot take full credit for this “invention”. Statically typed functional languages such as Haskell, OCaml and F# have included similar constructs from very early versions onward. These functional counterparts are even more powerful than Nullable because they apply to all types equally and define many useful functions for handling optional types in various circumstances. Note that this is not state-of-the-art technology. Haskell (1990) is almost 3 decades old while F# (2005) is entering puberty.

Our DBA colleagues are also well acquainted with the concept of optionals – if in a slightly different context. Many if not all classic database systems have optionality built-in. Every table follows a strict schema that indicates among other things whether a column can contain null. We often model classes in our solution after those tables. It borders on the absurd that C# never made it easy for us to also model the optionality constraints in code.

A mathematical reason for using Nullable

Functional languages traditionally have very advanced type systems, and developers who use those languages typically manage to leverage their power much more effectively than the average C# developer. For example, it is common practice in functional languages to focus on what we call “total functions”. The concept is borrowed from mathematics and has the same meaning there: a total function can generate suitable output for every possible value of its input type, whereas a partial function can only handle a certain subset of values of its input type. The obvious advantage is that it is always safe to call a total function as long as the types match; you do not have to perform any kind of upfront checks. (Note: in the next paragraphs I will assume functions only have one input parameter. If more are needed, you could theoretically wrap them all in a Tuple and pass that one tuple or curry the function instead.)

A classic example is the square root function $\sqrt{\cdot} : \mathbb{R} \to \mathbb{R}$. As we all know, the square root of negative numbers is undefined in $\mathbb{R}$, so this is a partial function. In pure math we can define it as a total function by limiting its domain: $\sqrt{\cdot} : \mathbb{R}_{\geq 0} \to \mathbb{R}$. In code the solution is not as straightforward because there is no such thing as an unsigned double to represent $\mathbb{R}_{\geq 0}$. Besides, not every function makes it so easy to identify processable values in advance. Consider a method User GetUser(int id). Some identifiers will exist and others will not. In general we cannot know this until we run the query.

As an alternative solution, we can opt to extend the output range of the partial function instead of constraining the input side. For the square root example, this would become a function $\sqrt{\cdot} : \mathbb{R} \to \mathbb{C}$. In code we do not have complex or imaginary users, so we return null or throw an exception. In a sense this also makes our function total again, but only if we understand output T to mean “T or null or Exception”. The more elegant solution would be to assume T just means T, which is roughly what the compiler does when it performs its type checks. From that point of view, we need another strategy to really make our function total. If we could change our method signature to Nullable of T GetUser(int id), both problems would be solved. The function would be total and the compiler could help us because we are very explicit about our expectations in the type signature.

As we know by now, this only works if User is a struct. But it does not have to be this way. Anyone can implement their own Nullable of T and simply drop the where T : struct constraint. This is exactly what many functional languages do: they offer Option or Maybe out of the box.
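To make this concrete, here is a rough sketch of such a home-grown type. The names Option, Some and None are my own choice and do not correspond to any particular package; the in-memory user store is likewise invented for the example.

```csharp
using System;
using System.Collections.Generic;

// A hypothetical Nullable-like struct without the "where T : struct" constraint.
struct Option<T>
{
    private readonly T value;
    public bool HasValue { get; }

    private Option(T value) { this.value = value; HasValue = true; }

    // default(Option<T>) has HasValue == false, so it doubles as "no value".
    public static Option<T> None { get { return default(Option<T>); } }

    public static Option<T> Some(T value)
    {
        if (value == null) throw new ArgumentNullException(nameof(value));
        return new Option<T>(value);
    }

    public T Value
    {
        get
        {
            if (!HasValue) throw new InvalidOperationException("No value present.");
            return value;
        }
    }
}

class User { public string Name { get; set; } }

static class Users
{
    // Invented in-memory store standing in for a real query.
    static readonly Dictionary<int, User> store =
        new Dictionary<int, User> { { 1, new User { Name = "Ada" } } };

    // Total function: every int yields a well-typed answer, never null.
    public static Option<User> GetUser(int id)
    {
        User user;
        return store.TryGetValue(id, out user) ? Option<User>.Some(user) : Option<User>.None;
    }

    static void Main()
    {
        var found = GetUser(1);
        Console.WriteLine(found.HasValue ? found.Value.Name : "unknown user");
    }
}
```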

Beyond Nullable to a comprehensive solution

So far we have established that Nullable is a fairly powerful concept, notwithstanding its very simple nature. Of course we want to have this concept available for both classes and structs. We also want to improve the readability of the resulting code even more. This is where the out-of-the-box story ends and we look at custom solutions.

Many interesting packages to tackle both problems at once are available in the official NuGet repository. This is good enough in the meantime, but many packages are competing and no clear winner has yet emerged. This division can only be solved by Microsoft if they make a similar solution part of the .NET core libraries. Java has – in an unexpected move – done this already in 2014 with version 8 of its standard library. While we wait for that to happen, let us assume that we picked Strilanc.Value.May as our package of choice.

This package contains the struct May of T which is very similar to Nullable of T and also solves both our problems. It works for both classes and structs because there is no artificial where T : struct constraint. As we will see further down, the readability is also improved because it includes helper methods in addition to the classic HasValue and ForceGetValue().

Before we dig into the specifics of May, here is my proposal for a NullReferenceException-free code base.

Step 1: check for null references at the system boundaries

Even if we follow certain guidelines within our code base, we cannot force the rest of the world to do the same. The unfortunate reality is that most C# developers do not use the techniques presented here. So whenever your code interfaces with an external system, perform null checks diligently. Either throw an exception or convert the value to a proper optional type where it makes sense to do so.

Some would argue that we are simply moving the problem but not actually fixing it. In a sense they are correct, but what is important is that the boundary surface is much smaller than the internal system volume.

Step 2: never use null references internally

Within your own code base, you make the rules. One of those rules should be to ban null from appearing anywhere. In practice this means four things:

1. Never explicitly assign null to any variable, field, etc.

2. Methods must not return null. Throw an exception or return an optional type instead. Focus on total functions.

3. Make sure fields and properties are properly initialised before using them.

4. If a variable is truly optional, annotate it as such with an optional type.

The third one is the toughest, but you should be doing that regardless of your distaste for null. After all, “temporary field” is a well known code smell. If a field is not assigned during a portion of the object life cycle, it probably should not be a field. It is helpful to think of this requirement as a class invariant, and to consequently make sure that all the constructors and setters enforce it. Using immutable objects (another functional programming staple) allows you to even skip the setters. Note that object initialisers undermine this effort, because they do not perform the necessary checks upon initialisation.
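As a small illustration of treating "never null" as a class invariant, consider the sketch below. The Stereo class is hypothetical; the point is only that the constructor is the single place where the check happens and that a get-only property leaves no setter to guard.

```csharp
using System;

// Hypothetical immutable class whose constructor enforces the invariant.
sealed class Stereo
{
    // Get-only auto-property: assigned once in the constructor, never again.
    public string CurrentRadioStation { get; }

    public Stereo(string currentRadioStation)
    {
        if (currentRadioStation == null)
            throw new ArgumentNullException(nameof(currentRadioStation));
        CurrentRadioStation = currentRadioStation;
    }
}

static class InvariantDemo
{
    static void Main()
    {
        var stereo = new Stereo("Radio 1");
        // No null check needed here: the invariant guarantees a value.
        Console.WriteLine(stereo.CurrentRadioStation.Length);
    }
}
```

Because the property has no setter, an object initialiser cannot bypass the constructor check for this class.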

By following these four rules, we can simply skip the null checks everywhere in the internal part of our code. Only the few explicit optional values will need special treatment, as we will soon see. It is unfortunate that we can only assume no null references will be passed anymore; the compiler cannot verify this for us. Sites such as uservoice.com show that many C# developers share this frustration with me. “Add non-nullable reference types in C#” has been the number one request in the C# language category for quite some time now. New languages such as Swift (2014) from Apple did this from day one, but it is a lot harder to do in C# because Microsoft cannot afford to introduce breaking changes. From what I understand the C# team is working on this nonetheless, but the solution does not seem to be part of the upcoming C# 7 yet.

Learning Strilanc.Value.May via familiar constructs

We promised that May would increase readability over Nullable, and we still have to make good on that. Meanwhile, we will also show that – because May shares an underlying mathematical structure with certain other types you already know – it is very easy to learn.

For example, in a pinch IEnumerable can be substituted for May. I do not recommend this, but it does provide us with some interesting insights. The idea is to return an enumerable with one element if the value is available, and zero elements if the value is unavailable. First, we need a method to “lift” any type T to IEnumerable of T:
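One possible way to write such a method is sketched here. The name Lift and the containing class are my own choices for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class EnumerableOption
{
    // Wraps any value in an enumerable that contains exactly that one element.
    public static IEnumerable<T> Lift<T>(this T value)
    {
        return new[] { value };
    }

    static void Main()
    {
        IEnumerable<string> maybeStation = "Radio 1".Lift();
        Console.WriteLine(maybeStation.Count()); // prints 1
    }
}
```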

In the case of May, this happens automatically via an implicit cast from T to May of T when needed. No code is the best code! Alternatively, you can explicitly call the Maybe() extension method on every T.

Next, we need a construct to signal a lack of value. The empty enumerable will do just fine:

IEnumerable<T> nothing = Enumerable.Empty<T>();

Our NuGet package chose another name:

May<T> nothing = May.NoValue;

Furthermore, we can translate the classic May members HasValue and ForceGetValue() to IEnumerable methods Any() and First() like so:
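On the IEnumerable side, this could be sketched as follows. The Vehicle class and its Brand property are assumptions made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical class, assumed for illustration only.
class Vehicle { public string Brand { get; set; } }

static class AnyFirstExample
{
    public static string DescribeBrand(IEnumerable<Vehicle> maybeVehicle)
    {
        if (maybeVehicle.Any())                    // plays the role of HasValue
        {
            var vehicle = maybeVehicle.First();    // plays the role of ForceGetValue()
            return vehicle.Brand;
        }
        return "no vehicle";
    }

    static void Main()
    {
        Console.WriteLine(DescribeBrand(Enumerable.Empty<Vehicle>())); // prints "no vehicle"
    }
}
```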

Just like with Nullable, the readability of this code is not great. Additionally, we have to enumerate the enumerable twice, which might be costly. On top of that, it is much less likely but still possible that the developer forgets to call Any() before executing the First() call – resulting in an exception. Fortunately, there is a better solution that fixes all those problems: use the Select method.
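A sketch of the Select approach, assuming a hypothetical Vehicle class that always has an Engine:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical classes, assumed for illustration only.
class Engine { public int Horsepower { get; set; } }
class Vehicle { public Engine Engine { get; set; } }

static class SelectExample
{
    // Zero vehicles in, zero engines out; one vehicle in, one engine out.
    public static IEnumerable<Engine> GetEngine(IEnumerable<Vehicle> maybeVehicle)
    {
        return maybeVehicle.Select(vehicle => vehicle.Engine);
    }

    static void Main()
    {
        var maybeVehicle = new[] { new Vehicle { Engine = new Engine { Horsepower = 90 } } };
        Console.WriteLine(GetEngine(maybeVehicle).Count());                  // prints 1
        Console.WriteLine(GetEngine(Enumerable.Empty<Vehicle>()).Count());   // prints 0
    }
}
```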

If there was no vehicle in the enumerable, there will be no engine in the result. However, if there was exactly one vehicle in the input, the output will contain exactly one engine.

In the Strilanc.Value.May package, the same Select method is provided to extract properties (or other derived values) from an optional value. The only difference in signature is that IEnumerable is swapped with May. This is not a coincidence. Traditionally, a method with such signature is called map, but since LINQ was first and foremost intended to simulate SQL queries, Microsoft borrowed names from that application area instead.

Something interesting happens when we have two layers of optional values. For example, remember that not all persons have a vehicle, and that not all vehicles have a stereo. Using the classic Select approach, we would get the following result:
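The sketch below models both optional properties as enumerables (the class layout is my assumption). Plain Select produces a nested enumerable, whereas SelectMany flattens the two layers into one.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical classes; optional properties are modelled as 0-or-1 element enumerables.
class Stereo { public string CurrentRadioStation { get; set; } }
class Vehicle { public IEnumerable<Stereo> Stereo = Enumerable.Empty<Stereo>(); }
class Person { public IEnumerable<Vehicle> Vehicle = Enumerable.Empty<Vehicle>(); }

static class TwoLayers
{
    // Select keeps both layers: an optional value inside an optional value.
    public static IEnumerable<IEnumerable<Stereo>> Nested(Person person)
    {
        return person.Vehicle.Select(vehicle => vehicle.Stereo);
    }

    // SelectMany collapses the two layers into a single optional stereo.
    public static IEnumerable<Stereo> Flat(Person person)
    {
        return person.Vehicle.SelectMany(vehicle => vehicle.Stereo);
    }

    static void Main()
    {
        var driverWithoutStereo = new Person { Vehicle = new[] { new Vehicle() } };
        Console.WriteLine(Nested(driverWithoutStereo).Count()); // prints 1 (one empty inner layer)
        Console.WriteLine(Flat(driverWithoutStereo).Count());   // prints 0
    }
}
```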

If the person did not have a vehicle, or the vehicle did not have a stereo, the result would be empty. Else the result is simply the stereo we were looking for. Note that this approach makes it very easy to compose multiple levels of property access together. Using classic if-null checks, this would have resulted in many lines of code, whereas Select and SelectMany calls can be chained without a problem.

The Strilanc.Value.May NuGet package also contains a method with the same signature, but in this case it does not follow the Microsoft naming conventions. Instead, the method is called Bind as it follows the classic functional naming conventions. Another synonym you might find in other packages (or in Java) is flatMap.

To go back from IEnumerable of T to T, use the Any/First combination again. There does not seem to be a more elegant solution out of the box. Our NuGet package on the other hand has the easy to use Else method, where you simply provide an alternative value (such as “N/A”) as input in case there is no value.
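For enumerables, a helper in the same spirit can be written by hand. The extension method below is my own sketch, modelled on the description above; it is not the package's actual code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class EnumerableElse
{
    // Returns the first element if there is one, otherwise the given alternative.
    // The source is enumerated only once.
    public static T Else<T>(this IEnumerable<T> maybe, T alternative)
    {
        foreach (var value in maybe)
        {
            return value;
        }
        return alternative;
    }

    static void Main()
    {
        IEnumerable<string> noStation = Enumerable.Empty<string>();
        Console.WriteLine(noStation.Else("N/A")); // prints "N/A"
    }
}
```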

Another interesting property of option types is that they delay the choice of the alternative value to a later phase in the specific use case, whereas in the traditional approach it is decided upfront. Whatever that upfront value, it may not be right for all use cases. Perhaps “N/A” is appropriate in one case, while an empty string, a dummy object or even null makes more sense in other cases.

The common mathematical structure is becoming more apparent already. Any transformation T -> Stuff of T that also defines methods with signatures like Lift and SelectMany (and follows some other rules) is called a monad. An explicit Select method is not required, because you can easily build it by combining Lift with SelectMany (try it!). I will not go into further detail, but the takeaway here is that monadic structures typically increase readability and enable composability. Also note that LINQ has been heavily influenced by this and allows us to define “monadic comprehensions”, but that is perhaps a topic best left for a future blog post.
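For readers who would rather check their answer than try it, here is one way to build Select from the other two operations for enumerables. The names Lift and SelectViaSelectMany are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class MonadSketch
{
    // Lift: wrap a single value in an enumerable.
    public static IEnumerable<T> Lift<T>(this T value)
    {
        return new[] { value };
    }

    // Select expressed through SelectMany: project each element, lift the result, flatten.
    public static IEnumerable<TResult> SelectViaSelectMany<TSource, TResult>(
        this IEnumerable<TSource> source, Func<TSource, TResult> projection)
    {
        return source.SelectMany(item => projection(item).Lift());
    }

    static void Main()
    {
        var doubled = new[] { 1, 2, 3 }.SelectViaSelectMany(x => x * 2);
        Console.WriteLine(string.Join(", ", doubled)); // prints "2, 4, 6"
    }
}
```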

May (and monads in general) are based on a very abstract yet intriguing branch of mathematics called Category Theory. The interested reader is encouraged to research this topic in more detail. Bartosz Milewski is writing a fascinating series of blog posts concerning this topic.

The takeaway of this section is that if you can work fluently with enumerables, you will not have any problem using optional types. The readability is also improved because most simple Select, Bind and Else expressions fit on one line. It can get messy if the Selects become longer or nested in each other. Simply extracting the lambda projection to a separate method and passing that method to Select almost always solves that problem.

Our final overview table looks like this:

                           Safe   Readable   Type safe
Naive                             X
If guards                  X
Safe navigation operator   X      X
Nullable structs           X      +/-        X
May                        X      X          X

This post already outlined many issues with the usage of null references, but there is one more I would like to discuss. It is an issue pertaining to semantics. Take the FirstOrDefault LINQ method for example. For reference types it returns null under two very different circumstances: when the list is empty or when the first item in the list is null. Similarly, a field or property can be null either because it has not been initialized yet or because it is optional and contains no value. There are very important semantic differences between the two cases and both should be handled differently in code. We cannot make this distinction unless we explicitly encode the semantic meaning – in this case optionality – in the types. Another way to say this is with an engineering rule of thumb: “absence of a signal should never be used as a signal”.
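The FirstOrDefault ambiguity is easy to reproduce. In the sketch below (the Describe helper is made up for the example), the caller cannot tell the two lists apart from the result alone.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class FirstOrDefaultAmbiguity
{
    public static string Describe(IEnumerable<string> items)
    {
        var first = items.FirstOrDefault();
        return first == null ? "null" : first;
    }

    static void Main()
    {
        var empty = new List<string>();
        var startsWithNull = new List<string> { null, "b" };
        Console.WriteLine(Describe(empty));          // prints "null"
        Console.WriteLine(Describe(startsWithNull)); // prints "null"
    }
}
```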

I like to compare this to how we wrote HTML in the early days vs. nowadays. We used to enclose text in i-tags to italicise it, but this tag did not contain any semantic information. Did we italicise the text because it was a header, because it was an important concept, or for some other reason? These days the W3C encourages the usage of semantic elements such as the em-tag to indicate emphasis. The HTML merely says which function the text has, afterwards CSS is used to define the layout. Perhaps using italics in text to convey emphasis is still a good idea and so not much changes at face value. Either way, future updates to the layout will be much easier to implement. The updated template might call for underlining, bold text or simply a different font colour to convey emphasis. With semantic HTML that only requires the CSS to be changed instead of having to scan through all i-tags in the HTML code one by one and to implement the update on an ad hoc basis.

In my experience, adding more semantic depth to your code – for example with optionals – gives you more insight in the code base. For example, you might come to realise that some null checks were superfluous because the input values could never be null to begin with. Removing those checks makes your code even more readable. On the other hand, there will be some values that you thought could never be null, but turned out to be optional anyway. This makes you actively think about the design of your code. “Does it really make sense that this variable is optional?” “Am I missing something?” In the end you will have more faith in the reliability of the code you refactored, because it does what it does more transparently which in turn makes it easier to understand.

Summary

May of T and similar types are a great way to deal with optional values without having to rely on null references. Code using these concepts is safe from NullReferenceExceptions, more readable, composable, maintainable and even type safe. The whole NuGet package contains very few items and is also very easy to learn. There is no reason whatsoever to keep making the billion dollar mistake.