Functional Programming in C#

Posted on February 19, 2019

In this article, we will look at some functional programming and data structure concepts, by demonstrating them using the C# programming language. Much of the code is for demonstration only, to help understand a concept. The code itself has some quirks that make it unusable in real code, but many of those quirks can be overcome (a topic for another time). The goal is to provide new tools and ideas to think about programs, which can be utilised in practice where appropriate.

Church-encoding data structures

We introduce church-encoding here because it gives us a convenient way to define some data structures that are used later on; for demonstration purposes, a church-encoding is the easiest way to express them in C#. It is also an interesting topic in itself.

booleans

Let’s start simply. Suppose you had to write your own boolean data type.
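As a minimal sketch of what such a type might look like (not necessarily the original listing), a church-encoded boolean is defined by how it chooses between two alternatives. The method name Fold, the nested TrueCase and FalseCase classes, and the helper names are assumptions made for illustration.

```csharp
// Note: this Boolean lives in the global namespace and is unrelated to System.Boolean.
public abstract class Boolean
{
    // A boolean is "whatever picks one of two values" of any result type X.
    public abstract X Fold<X>(X ifTrue, X ifFalse);

    private sealed class TrueCase : Boolean
    {
        public override X Fold<X>(X ifTrue, X ifFalse) => ifTrue;
    }

    private sealed class FalseCase : Boolean
    {
        public override X Fold<X>(X ifTrue, X ifFalse) => ifFalse;
    }

    public static readonly Boolean True = new TrueCase();
    public static readonly Boolean False = new FalseCase();

    // The usual operations, written only in terms of Fold.
    public Boolean Not() => Fold(False, True);
    public Boolean And(Boolean other) => Fold(other, False);
    public Boolean Or(Boolean other) => Fold(True, other);

    // Conversions in both directions between Boolean and bool.
    public bool ToBool() => Fold(true, false);
    public static Boolean FromBool(bool b) => b ? True : False;
}

public static class BooleanDemo
{
    public static void Main()
    {
        System.Console.WriteLine(Boolean.True.And(Boolean.False).ToBool()); // False
        System.Console.WriteLine(Boolean.False.Or(Boolean.True).ToBool());  // True
    }
}
```

The two conversion functions are what make the "without information loss in either direction" claim concrete: converting a bool to Boolean and back gives the value we started with.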

Essentially, we can replicate all the functionality of a boolean by this method, without information loss in either direction. We say that the Boolean data type is isomorphic to bool, though it is implemented differently.

It might be pointed out that there is a slight difference in the evaluation of the function arguments. For example, the && operation on bool does not evaluate its second argument if the first argument is false. This does not hold for our Boolean data type. We can resolve this discrepancy using Func:
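One way this could look, as a sketch under the same assumed names as before: each alternative is wrapped in a Func, so only the chosen branch is ever run. The And method here takes its second operand as a Func<Boolean> for the same reason.

```csharp
using System;

// Note: this Boolean lives in the global namespace and is unrelated to System.Boolean.
public abstract class Boolean
{
    // Each branch is a deferred computation; only the selected one is invoked.
    public abstract X Fold<X>(Func<X> ifTrue, Func<X> ifFalse);

    private sealed class TrueCase : Boolean
    {
        public override X Fold<X>(Func<X> ifTrue, Func<X> ifFalse) => ifTrue();
    }

    private sealed class FalseCase : Boolean
    {
        public override X Fold<X>(Func<X> ifTrue, Func<X> ifFalse) => ifFalse();
    }

    public static readonly Boolean True = new TrueCase();
    public static readonly Boolean False = new FalseCase();

    // The second operand is only evaluated when this value is True.
    public Boolean And(Func<Boolean> other) => Fold(other, () => False);

    public bool ToBool() => Fold(() => true, () => false);
}

public static class LazyBooleanDemo
{
    public static void Main()
    {
        var evaluated = false;
        var result = Boolean.False.And(() => { evaluated = true; return Boolean.True; });
        Console.WriteLine(result.ToBool()); // False
        Console.WriteLine(evaluated);       // False: the second operand never ran
    }
}
```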

In summary, we can use church-encoding to implement data types. Let’s look at some more data types.

optional values

The Optional data type, sometimes called Maybe, has seen a lot of discussion in recent years. It can be thought of intuitively as a list with a maximum length of 1. In other words, it has either 0 elements or it has 1 element.

In practical application, it can be thought of as a replacement for null. For example, instead of your methods returning Bobble which might be null, we instead return a list of 0 or 1 Bobbles.

How might we implement an optional data type? One way is with a church-encoding.
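A possible shape for that encoding, sketched with an assumed method name Fold:

```csharp
using System;

// 0 or 1 values of type A, defined by how the value is consumed.
public abstract class Optional<A>
{
    // zero: the result to use when there is no A value.
    // one:  a function to call on the A value when there is one.
    public abstract X Fold<X>(X zero, Func<A, X> one);
}
```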

Here we have a data type with 0 or 1 A values, implemented as an abstract function that returns a generic X. That X is arrived at either from the first argument, which denotes the case of 0 A values, or from the second argument, a function that is called on the 1 A value.

Let’s write the construction functionality for the two cases Zero and One.
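A sketch of those two cases follows. The names Zero and One come from the text; the nested ZeroCase and OneCase classes and the method name Fold are assumptions.

```csharp
using System;

public abstract class Optional<A>
{
    public abstract X Fold<X>(X zero, Func<A, X> one);

    // The case of 0 values: always answer with the first argument.
    private sealed class ZeroCase : Optional<A>
    {
        public override X Fold<X>(X zero, Func<A, X> one) => zero;
    }

    // The case of 1 value: hand that value to the second argument.
    private sealed class OneCase : Optional<A>
    {
        private readonly A value;
        public OneCase(A value) { this.value = value; }
        public override X Fold<X>(X zero, Func<A, X> one) => one(value);
    }

    public static Optional<A> Zero() => new ZeroCase();
    public static Optional<A> One(A a) => new OneCase(a);
}

public static class OptionalDemo
{
    public static void Main()
    {
        Console.WriteLine(Optional<int>.One(3).Fold("none", n => "got " + n)); // got 3
        Console.WriteLine(Optional<int>.Zero().Fold("none", n => "got " + n)); // none
    }
}
```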

Consider a null-based expression such as string n = person?.name. This expression first checks the person value and determines if there are 0 or 1 available, with 0 denoted by null. If there are 0 (it is null), then 0 string values are returned by assigning null to n. However, if there is 1 value available, then the name member is read from that value and assigned to n.

This is similar in functionality to what our Select function does. If we considered implementing our data types this way, the expression would become:
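The following sketch puts the two styles side by side. The Person class, the Fold method name and this particular Select implementation are assumptions for illustration; the null-based line uses the null-conditional operator as one plausible reading of the expression described above.

```csharp
using System;

public abstract class Optional<A>
{
    public abstract X Fold<X>(X zero, Func<A, X> one);

    private sealed class ZeroCase : Optional<A>
    {
        public override X Fold<X>(X zero, Func<A, X> one) => zero;
    }

    private sealed class OneCase : Optional<A>
    {
        private readonly A value;
        public OneCase(A value) { this.value = value; }
        public override X Fold<X>(X zero, Func<A, X> one) => one(value);
    }

    public static Optional<A> Zero() => new ZeroCase();
    public static Optional<A> One(A a) => new OneCase(a);
}

public static class OptionalExtensions
{
    // Apply f to the value if there is one; otherwise stay at 0 values.
    public static Optional<B> Select<A, B>(this Optional<A> o, Func<A, B> f) =>
        o.Fold(Optional<B>.Zero(), a => Optional<B>.One(f(a)));
}

// Hypothetical type with a name member, as in the text.
public class Person
{
    public string name;
}

public static class SelectDemo
{
    public static void Main()
    {
        // Null-based version: n1 is null when the person is null.
        Person nobody = null;
        string n1 = nobody?.name;
        Console.WriteLine(n1 == null); // True

        // Optional-based version: 0 or 1 Person becomes 0 or 1 string.
        Optional<Person> person = Optional<Person>.One(new Person { name = "Ada" });
        Optional<string> n2 = person.Select(p => p.name);
        Console.WriteLine(n2.Fold("<none>", s => s)); // Ada
    }
}
```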

We can think of this function as taking 0 or 1 A values and exactly 1 A value, then returning exactly 1 A value. If there is an A value contained in the first argument, it is returned; otherwise, the second argument is returned.
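A function with that behaviour might be sketched as below. The name GetOrElse is an assumption (the text does not name it here), as is the Fold method it is built on.

```csharp
using System;

public abstract class Optional<A>
{
    public abstract X Fold<X>(X zero, Func<A, X> one);

    private sealed class ZeroCase : Optional<A>
    {
        public override X Fold<X>(X zero, Func<A, X> one) => zero;
    }

    private sealed class OneCase : Optional<A>
    {
        private readonly A value;
        public OneCase(A value) { this.value = value; }
        public override X Fold<X>(X zero, Func<A, X> one) => one(value);
    }

    public static Optional<A> Zero() => new ZeroCase();
    public static Optional<A> One(A a) => new OneCase(a);
}

public static class OptionalExtensions
{
    // Return the contained value if there is one; otherwise the fallback.
    public static A GetOrElse<A>(this Optional<A> o, A otherwise) =>
        o.Fold(otherwise, a => a);
}

public static class GetOrElseDemo
{
    public static void Main()
    {
        Console.WriteLine(Optional<int>.One(3).GetOrElse(7));  // 3
        Console.WriteLine(Optional<int>.Zero().GetOrElse(7));  // 7
    }
}
```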

We could continue writing useful functions for the Optional data type, as it gives potential for a rich API. Let’s move on.

So far we have encoded two data structures:

one of two values as Boolean

a list of either 0 or 1 values as Optional

What about a list of 0 or many values?

lists

What is the church-encoding for lists? There is a mechanical way to calculate it, but we can also intuit it.

A list may be thought of as one of two cases:

empty list, no elements

one element and another list

Using this definition, we can construct any list of values. For example, the list [1,2,3] can be described as:

“one element 1 and another list that is one element 2 and another list that is one element 3 and another list which is empty list.”

To parenthesise this statement to illustrate the grouping:

“one element 1 and another list that is (one element 2 and another list that is (one element 3 and another list which is empty list)).”

We can see that this is a recursively defined data type. That is to say, in the definition of a list, it references itself. This generally implies that the implementation of the recursive case (the second one) will be recursive.
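Following the two cases above, one possible church-encoding of a list is a right fold: a result for the empty case, and a function that combines one element with the result for the rest of the list. The names Fold, Nil, Cons, NilCase and ConsCase are assumptions for this sketch.

```csharp
using System;

// Note: this List is our own type, unrelated to System.Collections.Generic.List.
public abstract class List<A>
{
    // nil:  the result for the empty list.
    // cons: combines one element with the already-folded rest of the list.
    public abstract X Fold<X>(X nil, Func<A, X, X> cons);

    private sealed class NilCase : List<A>
    {
        public override X Fold<X>(X nil, Func<A, X, X> cons) => nil;
    }

    private sealed class ConsCase : List<A>
    {
        private readonly A head;
        private readonly List<A> tail;
        public ConsCase(A head, List<A> tail) { this.head = head; this.tail = tail; }

        // The recursive case: fold the tail, then combine it with the head.
        public override X Fold<X>(X nil, Func<A, X, X> cons) =>
            cons(head, tail.Fold(nil, cons));
    }

    public static List<A> Nil() => new NilCase();
    public static List<A> Cons(A head, List<A> tail) => new ConsCase(head, tail);
}

public static class ListDemo
{
    public static void Main()
    {
        // The list [1,2,3], built exactly as the parenthesised sentence describes.
        var list = List<int>.Cons(1, List<int>.Cons(2, List<int>.Cons(3, List<int>.Nil())));
        Console.WriteLine(list.Fold(0, (head, rest) => head + rest)); // 6
    }
}
```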

Similar to the Optional data type, we can also write some interesting functions on the List data type. For example, a function that takes every A in a List<A> and converts it to a B using a function, producing a List<B>.
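A sketch of such a function, here called Select to match the Optional version, built on an assumed right-fold encoding of List:

```csharp
using System;

// Note: this List is our own type, unrelated to System.Collections.Generic.List.
public abstract class List<A>
{
    public abstract X Fold<X>(X nil, Func<A, X, X> cons);

    private sealed class NilCase : List<A>
    {
        public override X Fold<X>(X nil, Func<A, X, X> cons) => nil;
    }

    private sealed class ConsCase : List<A>
    {
        private readonly A head;
        private readonly List<A> tail;
        public ConsCase(A head, List<A> tail) { this.head = head; this.tail = tail; }
        public override X Fold<X>(X nil, Func<A, X, X> cons) =>
            cons(head, tail.Fold(nil, cons));
    }

    public static List<A> Nil() => new NilCase();
    public static List<A> Cons(A head, List<A> tail) => new ConsCase(head, tail);
}

public static class ListExtensions
{
    // Rebuild the list, applying f to each element along the way.
    public static List<B> Select<A, B>(this List<A> list, Func<A, B> f) =>
        list.Fold(List<B>.Nil(), (head, rest) => List<B>.Cons(f(head), rest));
}

public static class ListSelectDemo
{
    public static void Main()
    {
        var list = List<int>.Cons(1, List<int>.Cons(2, List<int>.Cons(3, List<int>.Nil())));
        List<string> labels = list.Select(n => "#" + n);
        Console.WriteLine(labels.Fold("", (head, rest) => head + rest)); // #1#2#3
    }
}
```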

Another way to think of SelectMany is: take every element in a list (of type A), apply a function that produces a new list (of type List<B>), then append all those lists together.
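That description translates fairly directly into code. The sketch below assumes a right-fold encoding of List and a helper named Append; both are illustrative names rather than anything fixed by the text.

```csharp
using System;

// Note: this List is our own type, unrelated to System.Collections.Generic.List.
public abstract class List<A>
{
    public abstract X Fold<X>(X nil, Func<A, X, X> cons);

    private sealed class NilCase : List<A>
    {
        public override X Fold<X>(X nil, Func<A, X, X> cons) => nil;
    }

    private sealed class ConsCase : List<A>
    {
        private readonly A head;
        private readonly List<A> tail;
        public ConsCase(A head, List<A> tail) { this.head = head; this.tail = tail; }
        public override X Fold<X>(X nil, Func<A, X, X> cons) =>
            cons(head, tail.Fold(nil, cons));
    }

    public static List<A> Nil() => new NilCase();
    public static List<A> Cons(A head, List<A> tail) => new ConsCase(head, tail);
}

public static class ListExtensions
{
    // All elements of xs, followed by all elements of ys.
    public static List<A> Append<A>(this List<A> xs, List<A> ys) =>
        xs.Fold(ys, (head, rest) => List<A>.Cons(head, rest));

    // Turn each element into a list, then append those lists in order.
    public static List<B> SelectMany<A, B>(this List<A> list, Func<A, List<B>> f) =>
        list.Fold(List<B>.Nil(), (head, rest) => f(head).Append(rest));
}

public static class SelectManyDemo
{
    public static void Main()
    {
        var list = List<int>.Cons(1, List<int>.Cons(2, List<int>.Cons(3, List<int>.Nil())));
        // Each n becomes the two-element list [n, n * 10].
        var expanded = list.SelectMany(n => List<int>.Cons(n, List<int>.Cons(n * 10, List<int>.Nil())));
        Console.WriteLine(expanded.Fold("", (head, rest) => "[" + head + "]" + rest));
        // [1][10][2][20][3][30]
    }
}
```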

We can use this function to implement a URL encoder. That is, given a list of characters, if any of those characters need special encoding (e.g. space), we produce a new list of characters for that encoding.
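Here is a deliberately simplified sketch of such an encoder. It assumes ASCII input, leaves letters, digits and the characters - _ . ~ untouched, and percent-encodes everything else; a production encoder would also need to handle multi-byte characters. The helper names FromString, AsString, IsUnreserved and EncodeChar are made up for this example.

```csharp
using System;

// Note: this List is our own type, unrelated to System.Collections.Generic.List.
public abstract class List<A>
{
    public abstract X Fold<X>(X nil, Func<A, X, X> cons);

    private sealed class NilCase : List<A>
    {
        public override X Fold<X>(X nil, Func<A, X, X> cons) => nil;
    }

    private sealed class ConsCase : List<A>
    {
        private readonly A head;
        private readonly List<A> tail;
        public ConsCase(A head, List<A> tail) { this.head = head; this.tail = tail; }
        public override X Fold<X>(X nil, Func<A, X, X> cons) =>
            cons(head, tail.Fold(nil, cons));
    }

    public static List<A> Nil() => new NilCase();
    public static List<A> Cons(A head, List<A> tail) => new ConsCase(head, tail);
}

public static class ListExtensions
{
    public static List<A> Append<A>(this List<A> xs, List<A> ys) =>
        xs.Fold(ys, (head, rest) => List<A>.Cons(head, rest));

    public static List<B> SelectMany<A, B>(this List<A> list, Func<A, List<B>> f) =>
        list.Fold(List<B>.Nil(), (head, rest) => f(head).Append(rest));
}

public static class UrlEncoder
{
    // Build a list of characters from a string, back to front.
    public static List<char> FromString(string s)
    {
        var result = List<char>.Nil();
        for (var i = s.Length - 1; i >= 0; i--)
            result = List<char>.Cons(s[i], result);
        return result;
    }

    public static string AsString(this List<char> cs) =>
        cs.Fold("", (head, rest) => head + rest);

    private static bool IsUnreserved(char c) =>
        (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
        (c >= '0' && c <= '9') || "-_.~".IndexOf(c) >= 0;

    // One character in, one or three characters out.
    private static List<char> EncodeChar(char c) =>
        IsUnreserved(c)
            ? List<char>.Cons(c, List<char>.Nil())
            : FromString("%" + ((int)c).ToString("X2"));

    public static List<char> Encode(List<char> input) =>
        input.SelectMany(c => EncodeChar(c));
}

public static class UrlEncoderDemo
{
    public static void Main()
    {
        var encoded = UrlEncoder.Encode(UrlEncoder.FromString("a b&c"));
        Console.WriteLine(encoded.AsString()); // a%20b%26c
    }
}
```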

Summary

Let’s summarise.

We have looked at church-encoding for data structures. We first encoded a Boolean data type. Next, we created the Optional data type, which may be thought of as a list with a maximum length of 1. We then encoded a list.

We implemented some interesting functions for these data types, then visited a use-case for List#SelectMany by implementing a URL encoder.

There are some emerging patterns here. In particular, the Optional#Select and List#Select functions are the same in type signature, but for the data type name itself. This can also be said about Optional#SelectMany and List#SelectMany. If we have a look in the base libraries, we will see others, such as IEnumerable and IQueryable. What other things fit this pattern?

Select and SelectMany

Let’s look at another one. But first, let’s more concretely define that pattern.

The Select function is defined on some data type name T such that it implements a function with a type of the form T<B> Select<A, B>(T<A> value, Func<A, B> f).

We take particular note of the fact that “things that can be T” are things that have exactly one generic argument. For example, Optional and List. We could not use say int which has no generic arguments. It would not make sense to use int because there is no sensible thing that is int<A>.

Things that have two generic arguments do not fit either. However, if we fix one of those generic arguments, what remains is a thing that takes exactly one more generic argument, and that does fit.

For example, we have used Func which takes two generic arguments. If we give it one generic argument, we might be able to implement Select and SelectMany. Let us make this argument itself generic. We will call it Q.
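A sketch of what those two functions could look like for Func<Q, _>, written as extension methods. The class name FuncExtensions is an assumption.

```csharp
using System;

public static class FuncExtensions
{
    // Run f on q, then transform its result with g.
    public static Func<Q, B> Select<Q, A, B>(this Func<Q, A> f, Func<A, B> g) =>
        q => g(f(q));

    // Run f on q, use its result to choose the next function,
    // then run that function on the same q.
    public static Func<Q, B> SelectMany<Q, A, B>(this Func<Q, A> f, Func<A, Func<Q, B>> g) =>
        q => g(f(q))(q);
}

public static class FuncDemo
{
    public static void Main()
    {
        Func<int, int> addOne = q => q + 1;

        Func<int, string> shown = addOne.Select(a => "value: " + a);
        Console.WriteLine(shown(41)); // value: 42

        // Type arguments are given explicitly because the compiler cannot
        // infer them from a lambda that returns another lambda.
        Func<int, int> combined = addOne.SelectMany<int, int, int>(a => q => a * q);
        Console.WriteLine(combined(5)); // 30, that is (5 + 1) * 5
    }
}
```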

Yes, we were able to achieve an implementation! Even more interesting, it is the only possible implementation for this type. That is to say, given this type, the implementation must do the thing that we have just written. This has some caveats: we have ignored null, recursing forever, the default keyword and a few other nuances. The type signature led us to the implementation in this case.

There are many other candidates that fit this pattern of implementing Select and SelectMany. Millions actually. This pattern has a canonical name.

Things that can implement Select are called covariant functors. The implementation must satisfy a couple of additional constraints to be called this, and our implementations do satisfy them.

Things that can implement both Select and SelectMany are called monads, again with a couple of additional constraints which have been satisfied.

What can we do with them?

LINQ

There are many components to LINQ. We will focus on query operations; specifically those that utilise Select and SelectMany.

Suppose we have two values. Each value is “0 or 1 integers” or in other words, each value has the type Optional<int>. We got them from some other function calls, such as querying a database for the primary key of a record (if that record exists) or from a JSON object given a key (if that key exists). Given these two values, we want to multiply the two integers if they are there. If not, return no int values.

Typically, using null, the code pattern would look something like this:
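As a sketch of that null-based pattern, using int? to stand in for "0 or 1 integers" (the class and method names are assumptions):

```csharp
using System;

public static class NullMath
{
    // Every step has to check for null by hand before it can continue.
    public static int? Multiply(int? x, int? y)
    {
        if (x == null) return null;
        if (y == null) return null;
        return x.Value * y.Value;
    }
}

public static class NullMathDemo
{
    public static void Main()
    {
        Console.WriteLine(NullMath.Multiply(6, 7));            // 42
        Console.WriteLine(NullMath.Multiply(6, null) == null); // True
    }
}
```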

A LINQ query expression (from ... in ... select ...) corresponds to calls to SelectMany and Select. To use this syntax with our own types, we need to implement one more function: an overload of SelectMany that takes an additional combining function. It is redundant, in that it can be implemented in terms of SelectMany and Select.
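To make this concrete, here is a sketch of Optional with Select, SelectMany, and the extra three-argument SelectMany overload that the C# compiler looks for when a query has more than one from clause. The Fold-based encoding and the class names are assumptions; the extra overload is written purely in terms of the other two functions.

```csharp
using System;

public abstract class Optional<A>
{
    public abstract X Fold<X>(X zero, Func<A, X> one);

    private sealed class ZeroCase : Optional<A>
    {
        public override X Fold<X>(X zero, Func<A, X> one) => zero;
    }

    private sealed class OneCase : Optional<A>
    {
        private readonly A value;
        public OneCase(A value) { this.value = value; }
        public override X Fold<X>(X zero, Func<A, X> one) => one(value);
    }

    public static Optional<A> Zero() => new ZeroCase();
    public static Optional<A> One(A a) => new OneCase(a);
}

public static class OptionalExtensions
{
    public static Optional<B> Select<A, B>(this Optional<A> o, Func<A, B> f) =>
        o.Fold(Optional<B>.Zero(), a => Optional<B>.One(f(a)));

    public static Optional<B> SelectMany<A, B>(this Optional<A> o, Func<A, Optional<B>> f) =>
        o.Fold(Optional<B>.Zero(), f);

    // The overload used by query syntax, built from SelectMany and Select.
    public static Optional<C> SelectMany<A, B, C>(
        this Optional<A> o, Func<A, Optional<B>> f, Func<A, B, C> combine) =>
        o.SelectMany(a => f(a).Select(b => combine(a, b)));
}

public static class OptionalMath
{
    // No null checks: the query reads as "take a from x, b from y, give a * b".
    public static Optional<int> Multiply(Optional<int> x, Optional<int> y) =>
        from a in x
        from b in y
        select a * b;
}

public static class OptionalMathDemo
{
    public static void Main()
    {
        var both = OptionalMath.Multiply(Optional<int>.One(6), Optional<int>.One(7));
        Console.WriteLine(both.Fold("none", n => n.ToString())); // 42

        var missing = OptionalMath.Multiply(Optional<int>.One(6), Optional<int>.Zero());
        Console.WriteLine(missing.Fold("none", n => n.ToString())); // none
    }
}
```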

Did you notice that we repeated the earlier code in multiply for Optional? The only change was that Optional turned into List. We’ll look at that in a moment.
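To illustrate the repetition while keeping the example short, the sketch below uses the built-in IEnumerable<int> as a stand-in for our own List (an assumption made for brevity; the base library already supplies Select and SelectMany for it). The body of the query is character-for-character the same as it would be for Optional.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ListMath
{
    // Same query as for Optional; only the type in the signature differs.
    public static IEnumerable<int> Multiply(IEnumerable<int> x, IEnumerable<int> y) =>
        from a in x
        from b in y
        select a * b;
}

public static class ListMathDemo
{
    public static void Main()
    {
        var products = ListMath.Multiply(new[] { 1, 2 }, new[] { 10, 20 });
        Console.WriteLine(string.Join(",", products)); // 10,20,20,40
    }
}
```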

Have you ever passed in the same argument to two different functions, both of which return int, then multiplied the result? The code would look similar to this, perhaps with some variation. However, the general pattern is, “passing in the same value to two different functions, then combining their results using another function (such as multiplication).”

Again, the q value earlier is implicitly threaded through our four functions. We needn’t ever explicitly declare it and pass it through ourselves. This is quite a handy pattern if you find yourself explicitly passing a value through layers of function calls so that it can eventually be used. For example, the “program configuration object” may be passed around, and various functions use it where necessary, or pass it on through to more functions. We can instead use SelectMany and Select, or a LINQ query expression.
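The sketch below contrasts the two styles for Func<Q, _>. MultiplyExplicit passes q by hand, while Multiply uses a query expression in which q never appears. The class names and the three-argument SelectMany overload (needed by the compiler for two from clauses) are assumptions for illustration.

```csharp
using System;

public static class FuncExtensions
{
    public static Func<Q, B> Select<Q, A, B>(this Func<Q, A> f, Func<A, B> g) =>
        q => g(f(q));

    // The overload used by query syntax: both functions receive the same q.
    public static Func<Q, C> SelectMany<Q, A, B, C>(
        this Func<Q, A> f, Func<A, Func<Q, B>> g, Func<A, B, C> combine) =>
        q =>
        {
            var a = f(q);
            return combine(a, g(a)(q));
        };
}

public static class ReaderMath
{
    // Explicit version: q is passed to both functions by hand.
    public static int MultiplyExplicit<Q>(Func<Q, int> f, Func<Q, int> g, Q q) =>
        f(q) * g(q);

    // Query version: q is threaded through by SelectMany.
    public static Func<Q, int> Multiply<Q>(Func<Q, int> x, Func<Q, int> y) =>
        from a in x
        from b in y
        select a * b;
}

public static class ReaderMathDemo
{
    public static void Main()
    {
        Func<int, int> plusOne = q => q + 1;
        Func<int, int> doubled = q => q * 2;

        Console.WriteLine(ReaderMath.MultiplyExplicit(plusOne, doubled, 3)); // 24
        Console.WriteLine(ReaderMath.Multiply(plusOne, doubled)(3));         // 24
    }
}
```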

However, what about all that code repetition?

So far, we have demonstrated multiplying through three seemingly unrelated contexts:

Optional: a container of 0 or 1 elements

List: a container of 0 or many elements

Func<Q, _>: a function that reads a value of type Q to compute its result

This is possible because they each implement their own SelectMany and Select functions. What about other contexts? There are many more that we have not talked about. For example:

continuations

state threading

I/O operations

millions more

Furthermore, any combination of two or more of these can also implement SelectMany and Select. What if we need to multiply for all of these? What about addition instead? Or perhaps, just any combining operation?

Unfortunately, the C# type system does not give us this ability. We’d need to write an interface to represent “all things that have SelectMany and Select”, but we’d also need to make “a generic that takes one more generic.” We cannot do this in C#. The consequence is that, yes, we must repeat this code for each specific case. Alternatively, we can turn the type system off by using dynamic. Neither of these options is particularly desirable. We have hit the limits of expression of C# in this area.

Nevertheless, we have new tools with which to think about and perhaps design and implement our C# programs. We may use a church-encoding where appropriate, but this has some gotchas as well. Fortunately, these can be worked around. There are also other interesting tools, data structures and concepts that we can learn about in designing our programs.