Sunday, May 13, 2018

Thoughts On Working With Nested Monad Within The Future Monad In Scala

This post is about nested contexts, specifically with Future, or more accurately, monads nested within the Future monad, e.g. Future[Either[E, A]], Future[Option[A]], etc., and how such nested contexts can easily lead to a hard-to-read and hence hard-to-maintain codebase.

This sort of context nesting and its impact are not limited to Future, but I will use Future as the reference in this post, simply because it is the form that appears most often in a particular Scala codebase I work on at my day job.

Also in this post, I will use Monad and Context interchangeably to mean the same thing. I will also mention terms like Monads, Monad Stack, Monad Transformers, or For Comprehension without explaining them. This is because the thrust of this post is not to expound on these concepts but to share some recent thoughts I have been having regarding nested contexts within Future.

A reader not familiar with these concepts should still be able to follow along though.

Why We End Up With Nested Context Within Future.

Why do we end up with a Future of an Either, a Future of an Option or, in a Play application, a Future of a JsResult?

If you are developing in Scala and you are only using the standard library (i.e. no Cats or Scalaz), then sooner or later you will end up having to deal with Future.

This is almost inevitable, as you will have to perform asynchronous operations like making a remote HTTP call or executing a database query. The library or framework you use for such operations will most likely return the results of these operations in a Future.

And since you also want to make use of the Scala type system to model your program properly, you end up with APIs that encode the result of their asynchronous actions not just as a Future of a plain value, but as a Future of values of type Either, Option, etc.

For example, an HTTP request to get a value of type A can successfully return that value or fail with an error E. Using Scala's type system to model our application as descriptively as possible, we use the Either type to capture this possibility of an error. So instead of having an API that returns a Future[A], we end up with a return type of Future[Either[E, A]].

And there goes our first embedded context within a Future.

My submission in this post is that, more often than not, such nested contexts in Future are a code smell and should be avoided at all costs, since they lead to hard-to-read code. This conclusion is born out of my experience with a Scala codebase at work that is riddled with such nested Futures.

The rest of this post makes this case.

Why nesting Contexts within Future is bad

To make this case, let us imagine we have two functions: a getIdByEmail function that retrieves a user ID given an email, and a getPostsById function that retrieves a list of posts given a user ID. Both functions return a Future of an Either.

Let us imagine the implementations of these two functions are as follows:
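A minimal sketch of what two such functions, and the code that combines them, could look like. The in-memory maps, the String error type and the postsByEmail name are assumptions made for illustration, standing in for real database or HTTP lookups:

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object NestedFutureEither {
  // Hypothetical in-memory data standing in for a real database or HTTP call.
  private val idsByEmail = Map("mee@gmail.com" -> 1)
  private val postsById  = Map(1 -> List("first post", "second post"))

  def getIdByEmail(email: String): Future[Either[String, Int]] =
    Future(idsByEmail.get(email).toRight(s"no user with email $email"))

  def getPostsById(id: Int): Future[Either[String, List[String]]] =
    Future(postsById.get(id).toRight(s"no posts for user $id"))

  // Combining the two calls by hand: flatMap on the outer Future,
  // then pattern match on the Either at every step.
  def postsByEmail(email: String): Future[Either[String, List[String]]] =
    getIdByEmail(email).flatMap {
      case Right(id) =>
        getPostsById(id).map {
          case Right(posts) => Right(posts)
          case Left(error)  => Left(error)
        }
      case Left(error) =>
        Future.successful(Left(error))
    }

  def main(args: Array[String]): Unit =
    println(Await.result(postsByEmail("mee@gmail.com"), 2.seconds))
}
```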

I do not know about you, but code that looks like the above always leaves a bad taste in my mouth.

The combination of nested map/flatMap calls coupled with pattern matching does not make for code that is easy to read.

The snippet above uses just two functions that return a nested context in a Future, and yet we already have code that does not read easily at first glance. What happens if we throw a third or fourth function into the mix? It gets unwieldy really quickly.

So is there anything we can do to improve the readability? What if we try using for comprehension? Since we know it provides syntax sugar for maps and flatMaps, maybe that can help in improving the readability? Let us see.

Our first attempt at using a for comprehension does not work out as intended. The following code:
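A sketch of how a for comprehension fares here, again using hypothetical in-memory lookups in place of real ones. A single flat for does not compile, because the value bound from the Future is an Either rather than an Int. Nesting one for inside another does compile, but the result type ends up doubly nested:

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object NestedForComprehension {
  // Hypothetical in-memory data standing in for a real database or HTTP call.
  private val idsByEmail = Map("mee@gmail.com" -> 1)
  private val postsById  = Map(1 -> List("first post", "second post"))

  def getIdByEmail(email: String): Future[Either[String, Int]] =
    Future(idsByEmail.get(email).toRight(s"no user with email $email"))

  def getPostsById(id: Int): Future[Either[String, List[String]]] =
    Future(postsById.get(id).toRight(s"no posts for user $id"))

  // Does not compile: `id` is an Either[String, Int], not an Int.
  //   for {
  //     id    <- getIdByEmail(email)
  //     posts <- getPostsById(id)
  //   } yield posts

  // Compiles, but a second Future[Either[...]] is now nested inside the first.
  def postsByEmail(email: String): Future[Either[String, Future[Either[String, List[String]]]]] =
    for (idOrError <- getIdByEmail(email)) yield {
      for (id <- idOrError) yield getPostsById(id)
    }

  def main(args: Array[String]): Unit =
    Await.result(postsByEmail("mee@gmail.com"), 2.seconds) match {
      case Right(inner) => println(Await.result(inner, 2.seconds))
      case Left(error)  => println(error)
    }
}
```

Getting at the posts now takes two rounds of unwrapping, which is exactly the kind of ceremony the for comprehension was supposed to remove.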

Again, I do not know about you, but I do not want to write, nor do I want to read, code like this just to perform an operation that technically involves two steps: look up an Id by email, then use the found Id to look up the associated posts.

If this is the kind of code that functional programming in Scala leads to, then I do not want to have anything to do with it!

The two code snippets above are why I have come to the conclusion that a function with a return type of Future[Either[E, A]], Future[Option[A]] or something similar is usually a code smell. This is because, as seen above, such types lead to unnecessarily convoluted and unwieldy code.

How to deal with this? Is there a better way?

I currently can make two propositions for preventing or avoiding this unwieldy code that nested context leads to: Encode the failure in Future or Use Monad Transformers.

Let’s explore these two propositions.

Encode the Failure in Future

The Future type in Scala comes with a mechanism for encoding an error within it. Although it might not be the most robust way, it is definitely an option.

This can be done by calling the failed method with an instance of an exception representing the error. For example:

Future.failed(new UnsupportedOperationException("not implemented"))

This mechanism can then be used to encode exceptional scenarios instead of using Either (or Option, or JsResult, etc.), leading to a simpler return type of the form Future[A], where A represents the type of the returned value of interest.

This then means that the functions can be used easily in a for comprehension. The only caveat is that a recover block needs to be introduced to take care of the situation where an exception is encountered.

Applying this approach would see the getIdByEmail and getPostsById methods take the following form:
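A minimal sketch of this approach, under the same assumption of in-memory lookups. The LookupFailed exception and the postsOrEmpty helper are hypothetical names introduced only for this example:

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object FailureInFuture {
  // Hypothetical exception type used to carry the error inside the Future.
  final case class LookupFailed(message: String) extends Exception(message)

  // Hypothetical in-memory data standing in for a real database or HTTP call.
  private val idsByEmail = Map("mee@gmail.com" -> 1)
  private val postsById  = Map(1 -> List("first post", "second post"))

  def getIdByEmail(email: String): Future[Int] =
    idsByEmail.get(email) match {
      case Some(id) => Future.successful(id)
      case None     => Future.failed(LookupFailed(s"no user with email $email"))
    }

  def getPostsById(id: Int): Future[List[String]] =
    postsById.get(id) match {
      case Some(posts) => Future.successful(posts)
      case None        => Future.failed(LookupFailed(s"no posts for user $id"))
    }

  // With plain Future[A] return types the two steps compose directly.
  def postsByEmail(email: String): Future[List[String]] =
    for {
      id    <- getIdByEmail(email)
      posts <- getPostsById(id)
    } yield posts

  // The caller has to remember the recover step; nothing in the types enforces it.
  def postsOrEmpty(email: String): Future[List[String]] =
    postsByEmail(email).recover { case LookupFailed(_) => Nil }

  def main(args: Array[String]): Unit =
    println(Await.result(postsOrEmpty("mee@gmail.com"), 2.seconds))
}
```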

I personally think this approach of encoding the error in the Future results in more readable code.

The only drawback, though, is that this approach undermines type safety, since it effectively removes the error from the type system. It then becomes easier for a user to forget to handle, in a "recover" step, the possible exceptions that can emanate from the Future, leading to things blowing up at runtime.

I am still more inclined to adopt this approach than to deal with the cumbersomeness of nested contexts within Future.

The second approach, which uses Monad Transformers, is explained next.

Use Monad Transformers

The other approach that helps in dealing with a nested context within Future is the use of Monad Transformers.

Types like Future[Either[A, B]] can be more accurately described as a monad inside another monad, or a stack of monads. And due to the nature of monads, having a stack of them this way does not lend itself to convenient usage within a for comprehension.

The way to make them less cumbersome is to use Monad Transformers, which effectively create a monad from the combination of two or more monads. The resulting monad can then, for all intents and purposes, be treated as a single standalone monad.

The Scala standard library gives us monads (types like Future, List, Option, Try, etc. are all monadic, if not monads in the true, law-abiding sense of what a monad is) but does not give us the tools to effectively deal with a stack of monads.

To make use of Monad Transformers, we need to bring in libraries like Scalaz or Cats.

In this post, I will make use of Cats. The code snippet below shows how Monad Transformers, as provided by Cats, make programming with types like Future[Either[E, A]] less cumbersome.
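A sketch of the EitherT approach, assuming the Cats library (cats-core) is on the classpath. As before, the in-memory maps and the postsByEmail name are assumptions made for illustration:

```scala
// Assumes cats-core is available as a library dependency.
import cats.data.EitherT
import cats.implicits._
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object WithEitherT {
  // Hypothetical in-memory data standing in for a real database or HTTP call.
  private val idsByEmail = Map("mee@gmail.com" -> 1)
  private val postsById  = Map(1 -> List("first post", "second post"))

  def getIdByEmail(email: String): Future[Either[String, Int]] =
    Future(idsByEmail.get(email).toRight(s"no user with email $email"))

  def getPostsById(id: Int): Future[Either[String, List[String]]] =
    Future(postsById.get(id).toRight(s"no posts for user $id"))

  def postsByEmail(email: String): Future[Either[String, List[String]]] = {
    // EitherT wraps each Future[Either[...]] so the for comprehension
    // binds the Right value directly and short-circuits on the first Left.
    val program: EitherT[Future, String, List[String]] =
      for {
        id    <- EitherT(getIdByEmail(email))
        posts <- EitherT(getPostsById(id))
      } yield posts

    // .value unwraps back to the plain Future[Either[String, List[String]]].
    program.value
  }

  def main(args: Array[String]): Unit =
    println(Await.result(postsByEmail("mee@gmail.com"), 2.seconds))
}
```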

As you can see, this removes all the boilerplate and clunkiness of working with nested Either in a Future. Problem solved!

Conclusion

The Monad Transformer approach is definitely the more robust way to deal with a stack of monads, which is what a type like Future[Either[E, A]] is. It makes working with them easier without sacrificing type safety.

I definitely recommend using Monad Transformers over the other approach of encoding the failure scenarios in the Future itself.

In the case where you cannot use libraries like Scalaz or Cats, I think encoding the exceptional scenarios in the Future is the way to go, and it leads to more readable code compared to all the nested map/flatMap and pattern matching that would be required otherwise.

So far, these are the two approaches I can think of when it comes to dealing with another monad nested in the Future monad in Scala. If by chance you know of any other best practice for dealing with a stack of monads in Scala, then please do share in the comment section.👇

7 comments:

First, thanks for working out a great example for code that mixes several monads, and presenting it so clearly! The advantage of concise examples is that they let us evaluate the alternatives quickly, and that can lead to new insights.

I am usually quite skeptical about monad transformers, so I thought that maybe this example would let me see the light. But in the end I come to a different conclusion than you do. Here's how I would write the task "look up an Id by email, use the found Id to lookup associated posts":

This is exactly as long as your monad transformer code, without any of the boilerplate to get the monad transformers set up.

How did I get there? I took your second solution (the one with the three chained fors, about which you said "If this is the kind code that Functional programming in Scala leads to, then I do not want to have anything to do with it!") and noted simply that your last for clause read

for (books <- ...) yield books

which is just a fancy way to write the identity. So it can simply be dropped.
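The identity claim can be checked with a small sketch. Option is used here instead of Future only to keep the example synchronous, and the books value is a made-up placeholder:

```scala
object IdentityFor {
  // Hypothetical value; any type with a map method behaves the same way.
  val books: Option[List[String]] = Some(List("Programming in Scala"))

  // The compiler desugars this to books.map(b => b).
  val viaFor: Option[List[String]] = for (b <- books) yield b

  // Mapping with the identity function returns an equal value.
  val viaMap: Option[List[String]] = books.map(identity)

  def main(args: Array[String]): Unit =
    println(viaFor == books && viaMap == books)
}
```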

It's actually quite beautiful because it shows the usefulness of chaining fors in a nutshell. And sometimes reaching for more complicated abstractions makes us blind to obvious and elegant simpler solutions.

Note: I don't doubt that there is code that can be simplified using monad transformers. Any abstraction has its place somewhere. But I believe that for monad transformers that place is actually quite limited and Scala 3 will introduce better solutions for many of the remaining use cases around effects.

Caveat: If you look at the types, they are actually different. The nested `for` solution gives a `Future[Either[String, Future[...]]]`, which is the same as the three-stage for you provide, but probably not what you want. But the following solution is just as simple, and gives the right type:

give us a Future[Either[String, Future[Either[String, List[String]]]]] instead of the required Future[Either[String, List[String]]]? I think the example in the article with the chained for comprehensions would have the same problem.

I can't think of another solution that doesn't involve some sort of explicit handling of success/failure between the calls to getIdByEmail and getPostsById (which I suppose is what EitherT is effectively doing for us). It would be interesting to see other possible approaches.

Thanks @Martin for stopping by to chime in and for offering your suggestion. It’s appreciated!

That being said, I can see some issues in the approach you suggested.

I’ll explain:

The first issue is regarding the return type of your suggested approach. As noticed by @jamest in his comment, the return type ends up being: Future[Either[String, Future[Either[String, List[String]]]]]

See https://scastie.scala-lang.org/dadepo/Dyb36sGXS9W8wOzfjFDjEA

That is quite horrendous! By the way, this issue also applies to the more verbose version included in the post.

So even though the "for yield for yield" expression is compact, not as verbose as what is in the post, and might be said to look beautiful, the same cannot be said of its output.

This, right here, exemplifies some of the main issues my team and I have been having with Scala. That is, how easy it is to create something as unwieldy as a Future[Either[String, Future[Either[String, List[String]]]]]. And before you know it, instead of spending brain cycles on implementing the needed business logic, you are wasting time playing type-tetris and trying to wrest values out of these nested contexts.

The second issue regarding your suggestion is about having readable code that clearly reflects the business logic. I'll explain.

The problem statement in the example can be outlined as:

1. look up an Id by email
2. use the found Id to lookup associated posts

The "business logic" using Monad Transformers directly maps to this:

for {
  // look up an Id by email
  id <- EitherT(getIdByEmail("mee@gmail.com"))
  // use the found Id to lookup associated posts
  posts <- EitherT(getPostsById(id))
} yield posts

This, in my opinion, makes for code that is clear to read and easy to discuss with colleagues.

This cannot be said of the "for yield for yield" approach you suggested.

// look up an Id by email
for (idOpt <- getIdByEmail("mee@gmail.com"))
yield for (id <- idOpt) // what is this doing here?
// use the found Id to lookup associated posts
yield getPostsById(id:Int)

What exactly is the second line doing there?

That line exposes the machinery of the embedded monadic context we are dealing with and obscures the business logic we are trying to express crisply in code. This kind of clutter multiplies as the intermediate steps in the business logic add up. In this example we have just two steps; in real-life scenarios, it is not uncommon to have logic that spans more steps.

These are the two issues I could identify in your suggested approach.

Now to your rebuttal of the Monad Transformer approach. In your criticism, you cited the presence of boilerplate as one of its drawbacks. But I do not see any boilerplate being required in the approach. What was needed was to add a library dependency and import EitherT. If that is boilerplate, then almost every single line of Scala code ever written suffers from this kind of boilerplate.

Yes, the actual machinery for implementing the transformers might require some boilerplate, but the user of the library is shielded from this. The price has already been paid by the library, saving the user from having to incur this boilerplate cost themselves.

So in conclusion, as it stands, I would still reach for Monad Transformers. Yes, they do come with some performance cost, but I would rather pay for whatever performance hit they bring and have clear, crisp code that easily maps to the business logic than, in an attempt to eschew abstractions, end up with code that is unwieldy, either due to nested flatMap/map/pattern matching or due to nested for comprehensions that lead to even more deeply nested return types.

I am continually learning and on the lookout for best practices on how to deal with scenarios like this, so if I find a better approach, which might be based on machinery provided in the upcoming Scala 3 or on another abstraction or pattern, I will definitely update this post.

As we discussed after your talk, it is also possible to do the Monad Transformation by hand. True, this must be done for every pair of nested types, but if there are not many of them this is doable. See https://gist.github.com/devlaam/bdce0b64fe3c03254a481309a86dfd43 as an example for Future[Option[A]].
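One way such a hand-written transformation could look for Future[Option[A]] is sketched below. This is not the code from the linked gist; FutureOption, findId and findPosts are hypothetical names, and the lookups are in-memory stand-ins:

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object HandRolledTransformer {
  // A hypothetical hand-written wrapper for Future[Option[A]].
  // Defining map and flatMap is enough to use it in a for comprehension.
  final case class FutureOption[A](value: Future[Option[A]]) {
    def map[B](f: A => B)(implicit ec: ExecutionContext): FutureOption[B] =
      FutureOption[B](value.map(_.map(f)))

    def flatMap[B](f: A => FutureOption[B])(implicit ec: ExecutionContext): FutureOption[B] =
      FutureOption[B](value.flatMap {
        case Some(a) => f(a).value                          // continue with the next step
        case None    => Future.successful(Option.empty[B])  // short-circuit on None
      })
  }

  // Imported after the wrapper so it does not clash with the implicit `ec` parameters.
  import ExecutionContext.Implicits.global

  // Hypothetical in-memory data standing in for a real database or HTTP call.
  private val idsByEmail = Map("mee@gmail.com" -> 1)
  private val postsById  = Map(1 -> List("first post", "second post"))

  def findId(email: String): Future[Option[Int]] = Future(idsByEmail.get(email))
  def findPosts(id: Int): Future[Option[List[String]]] = Future(postsById.get(id))

  def postsByEmail(email: String): Future[Option[List[String]]] =
    (for {
      id    <- FutureOption(findId(email))
      posts <- FutureOption(findPosts(id))
    } yield posts).value

  def main(args: Array[String]): Unit =
    println(Await.result(postsByEmail("mee@gmail.com"), 2.seconds))
}
```

The trade-off is the one mentioned above: a wrapper like this has to be written once per pair of nested types.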

I demonstrate a possible other solution that should require less machinery, while still reducing some of the boilerplate. Sadly it doesn't allow usage in a for comprehension though: https://github.com/jheijkoop/nested-monad-helper

We've been trying to tackle this very same problem at work but we got stuck when combining errors using ADTs. Basically, the problem we have is about how Either is covariant while EitherT is not, so that for example if we have an error ADT such as: