std::optional<T>: a T That Might Not Be There

Suppose you’re dealing with functions that might or might not be able to give you an object in return.
With std::optional that’s easy to model:

/// Does a database lookup, returns `std::nullopt` if it wasn't found.
template <typename T>
std::optional<T> lookup(const database& db, std::string name);

/// Calls the function if the condition is `true` and returns the result,
/// `std::nullopt` if the condition was false.
template <typename T>
std::optional<T> call_if(bool condition, std::function<T()> func);

std::optional<T> means “either a T or nothing”.
In that sense it is like std::variant<T, std::monostate>.
That also means “either a T or nothing”.
Yet std::optional<T> is preferred as it has a more convenient interface.
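
To make the difference in convenience concrete, here is a minimal sketch that spells the same contract both ways. The functions `half_opt`, `half_var`, `value_or_zero_opt` and `value_or_zero_var` are hypothetical names made up for this illustration.

```cpp
#include <cassert>
#include <optional>
#include <variant>

// Hypothetical example: half of an even number, or nothing for an odd one.
std::optional<int> half_opt(int n) {
    if (n % 2 != 0)
        return std::nullopt;
    return n / 2;
}

// The same contract spelled with std::variant and std::monostate.
std::variant<int, std::monostate> half_var(int n) {
    if (n % 2 != 0)
        return std::monostate{};
    return n / 2;
}

// With std::optional the emptiness check and the access read naturally.
int value_or_zero_opt(const std::optional<int>& o) {
    return o ? *o : 0;
}

// With std::variant the same logic has to go through std::get_if.
int value_or_zero_var(const std::variant<int, std::monostate>& v) {
    if (const int* p = std::get_if<int>(&v))
        return *p;
    return 0;
}
```

Both versions carry exactly the same information; only the interface differs.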

But note that both just mean “or nothing”.
Not “or not found” or “or function wasn’t called”.
std::nullopt has no inherent semantic meaning; the meaning is provided by context:

auto value = lookup<my_type>(db, "foo");
if (!value)
    // optional is empty, this means the value wasn't there
…
auto result = call_if(condition, some_function);
if (!result)
    // optional is empty, this means the condition was false

Here an empty optional means something different depending on the source of that optional.
Just by themselves all std::nullopt values are equal; only context gives them different meaning:

template <typename T>
void process(std::optional<T> value)
{
    if (!value)
        // we don't know *why* the `T` isn't there, it just isn't
}

std::expected<T, E>: a T or an Error

If you want to provide additional information about why the T isn’t there, you can use the proposed std::expected<T, E>.
It means “either a T or the error E that prevented its existence”.

The canonical example would be something like this:

/// Opens the file or returns an error code if it was unable to do so.
std::expected<file, std::error_code> open_file(const fs::path& p);

If the function could not return a file, it returns a std::error_code instead.
As such, std::expected<T, E> is like std::variant<T, E>, just with a nicer interface and more defined semantics.
std::variant<T, E> just means T or E, whereas std::expected<T, E> gives the E a special meaning.

But something interesting happens when E is an empty type with a single state:
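
As a sketch of what that can look like: the block below uses a tiny hand-written `expected_sketch` as a stand-in for the proposed std::expected so that it compiles as plain C++17. The stand-in, the toy `database` alias and the restriction to `int` values are all assumptions made for this illustration.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <variant>

// Hypothetical stand-in for the proposed std::expected<T, E>, just enough for
// this sketch. It assumes T and E are different types.
template <typename T, typename E>
class expected_sketch {
    std::variant<T, E> storage_;

public:
    expected_sketch(T value) : storage_(std::in_place_index<0>, std::move(value)) {}
    expected_sketch(E error) : storage_(std::in_place_index<1>, std::move(error)) {}

    explicit operator bool() const { return storage_.index() == 0; }
    const T& value() const { return std::get<0>(storage_); }
    const E& error() const { return std::get<1>(storage_); }
};

// An empty type with a single state: it carries no data, only meaning.
struct value_not_found {};

// Assumed toy database holding ints; the real `database` type is not shown.
using database = std::map<std::string, int>;

/// Does a database lookup, returns `value_not_found` if it wasn't found.
expected_sketch<int, value_not_found> lookup(const database& db, const std::string& name) {
    auto iter = db.find(name);
    if (iter == db.end())
        return value_not_found{};
    return iter->second;
}
```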

This lookup() implementation also returns a T or nothing if it wasn’t found.
But “nothing” has a well-defined meaning encoded in the type — value_not_found.

This is different from std::optional<T>:
In that case the meaning was only present given the context/origin of the optional.
Now the meaning is encoded into the type itself:

template <typename T>
void process(std::expected<T, value_not_found> value)
{
    if (!value)
        // ah, the `T` wasn't found in the database
}

This is an important distinction as we’ll see later.

Recap: std::optional<T>, std::expected<T, E> and std::variant<T, E>

So to recap:

std::optional<T> is a nicer std::variant<T, std::monostate>

std::expected<T, E> is a nicer std::variant<T, E>

std::nullopt_t and std::monostate are both generic types meaning “empty”; any special meaning is only imbued by context

other empty types such as value_not_found are specialised with meaning without any context, just by themselves

std::optional<T> and std::expected<T, std::monostate> both mean the same thing: either a T is there or it isn’t, and if it isn’t, nothing says why

std::expected<T, empty_type> has more semantic meaning than std::optional<T>: the empty_type gives the error more information

Note that I’m making an important assumption here:
std::optional<T> and std::expected<T, E> should be used in the same places.
You’d use std::optional<T> if the reason why you didn’t have the T isn’t important enough,
you’d use std::expected<T, E> if the reason is.
Both types are fine for different APIs.

I repeat the assumption again, because if you don’t agree with that, you won’t agree with the rest of the post:

std::optional<T> and std::expected<T, E> both model the same thing: “a T that might not be there”.
std::expected just stores additional information why it isn’t there.

There are other situations where you might want to use std::optional<T> but I consider those more or less problematic.
I’ll elaborate in more detail in a follow-up post, for now, just consider the situations where my assumption holds.

You might initially be irritated by the use of the “error” terminology in std::expected<T, E>.
Is it really an “error” if the key isn’t found in some dictionary?

But don’t confuse “error” with “exception”.
It is not some unexpected, fatal problem.
Just some failure to produce a proper value.

Nesting Optional and Expected

Let’s consider our two APIs again:

/// Does a database lookup, returns `std::nullopt` if it wasn't found.
template <typename T>
std::optional<T> lookup(const database& db, std::string name);

/// Calls the function if the condition is `true` and returns the result,
/// `std::nullopt` if the condition was false.
template <typename T>
std::optional<T> call_if(bool condition, std::function<T()> func);

There are two interesting situations with those APIs.

The first happens when we want to do a database lookup of a value that might be null in itself.

auto result = lookup<std::optional<my_type>>(db, name);
if (!result)
    // not found in database
else if (!result.value())
    // found in database, but `null`
else
{
    // found and not null
    auto value = result.value().value();
}

We end up with a std::optional<std::optional<my_type>>.
If the outer optional is empty that means the value was not stored in the database.
If the inner optional is empty that means the value was stored in the database but it was null.
If both are non-empty the value was stored and non-null.
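
The three states can be told apart in code like this. This is only a sketch: the `nested` alias and the `describe` function are hypothetical, and `int` stands in for `my_type`.

```cpp
#include <cassert>
#include <optional>
#include <string>

using nested = std::optional<std::optional<int>>;

// Reports which of the three states a nested optional is in.
std::string describe(const nested& result) {
    if (!result)
        return "outer empty";   // not stored in the database at all
    if (!result.value())
        return "inner empty";   // stored, but null
    return "value";             // stored and non-null
}
```

For example, `describe(nested{})` yields "outer empty", while `describe(nested{std::in_place, std::nullopt})` yields "inner empty".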

The second situation happens when we simply combine the two functions:

auto lambda = [&] { return lookup<my_type>(db, name); };
auto result = call_if(condition, lambda);
if (!result)
    // condition was false
else if (!result.value())
    // condition was true, but the lookup failed
else
{
    // condition was true and the lookup succeeded
    auto actual_value = result.value().value();
}

Again, we have a nested optional.
And again it means something different depending on which optional is empty.

But just a std::optional<std::optional<T>> by itself doesn’t have that information!
An empty optional means nothing by itself, and neither does an optional containing an empty optional.

void process(std::optional<std::optional<my_type>> result)
{
    if (!result)
        // ah, the result was not found in the database
        // or the condition was false
        // or the value was null?
    else if (!result.value())
        // was found, but `null`
        // or the condition was true but not found?
    else
        …
}

Context, and now even the order of operations, gives it the meaning.

With a std::expected API on the other hand, the information is clear:

void process(std::expected<std::expected<my_type, value_not_found>, func_not_called> result)
{
    if (!result)
        // function wasn't called
    else if (!result.value())
        // value not found
}

Note that I am not saying that the std::expected API is better:
It is awkward to have call_if() return a std::expected; std::optional is clearly the better choice for that function.
And I’d also argue that lookup() should use std::optional unless there are multiple reasons why a value isn’t there.

I’m merely demonstrating that std::expected preserves information about the empty state while std::optional does not.

Flattening Optional and Expected

We can hopefully all agree that both situations above are not ideal.
Working with nested std::optional or std::expected is weird.

If you want to process a value you would probably do it like so:

auto result = lookup<std::optional<my_type>>(db, name);
if (!result)
    process(std::nullopt);
else if (!result.value())
    process(std::nullopt);
else
    process(result.value().value());

void process(const std::optional<my_type>& result)
{
    if (!result)
        // wasn't there, for whatever reason
    else
        // it was there, go further
}

That is, you’d combine the two different empty states of the std::optional into just one.
You flatten the std::optional<std::optional<T>> into a std::optional<T>.

Flattening a nested std::optional loses information:
We’re squashing two distinct empty states into one.
But without additional context the two empty states are the same anyway: a process() called from multiple places can’t distinguish between them.
All it cares about is whether or not it actually has a value.
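
Written as a generic helper, this lossy flattening could look like the following sketch. The name `flatten` is a hypothetical free function, not something the standard library provides.

```cpp
#include <cassert>
#include <optional>
#include <utility>

// Hypothetical helper: collapses both empty states into a single one.
template <typename T>
std::optional<T> flatten(std::optional<std::optional<T>> nested) {
    if (!nested)
        return std::nullopt;       // outer optional was empty
    return std::move(*nested);     // inner optional, whether empty or not
}
```

Note that it needs no knowledge of where the optional came from, which is exactly why it cannot keep the two empty states apart.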

If it does care about the reason, the std::expected API might be better.
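
One way this might look, as a sketch: `expected_sketch` is a hand-written stand-in for the proposed std::expected so the block compiles as C++17, and `lookup_error`, `to_expected` and the string-returning `process` are hypothetical names chosen for the illustration, with `int` standing in for `my_type`.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>
#include <variant>

// Hypothetical stand-in for the proposed std::expected<T, E>.
// It assumes T and E are different types.
template <typename T, typename E>
class expected_sketch {
    std::variant<T, E> storage_;

public:
    expected_sketch(T value) : storage_(std::in_place_index<0>, std::move(value)) {}
    expected_sketch(E error) : storage_(std::in_place_index<1>, std::move(error)) {}

    explicit operator bool() const { return storage_.index() == 0; }
    const T& value() const { return std::get<0>(storage_); }
    const E& error() const { return std::get<1>(storage_); }
};

// Hypothetical error type that names the two empty states.
enum class lookup_error { name_not_found, value_null };

using lookup_result = expected_sketch<int, lookup_error>;

// Flattens the nested optional but keeps the reason. This only works because
// we know the optional came from a database lookup of a nullable value.
lookup_result to_expected(const std::optional<std::optional<int>>& result) {
    if (!result)
        return lookup_error::name_not_found;
    if (!result.value())
        return lookup_error::value_null;
    return result.value().value();
}

// process() receives the reason along with the failure.
std::string process(const lookup_result& result) {
    if (result)
        return "has a value";
    switch (result.error()) {
    case lookup_error::name_not_found:
        return "name not in database";
    case lookup_error::value_null:
        return "stored, but null";
    }
    return "unknown";
}
```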

Now we’re passing distinct error information to process() which is actually usable information.
In a sense, that is also a flattening.
But a flattening that preserves information.
Such a preserving flattening needs the context, the meaning of std::nullopt, so it can’t be done in a generic way.

With a combination of std::expected based APIs we can also end up with a nested std::expected<std::expected<T, E1>, E2>.
How would we flatten that?

Well, we either have a T or failed to do so.
When we failed we either failed because of E1 or because of E2.
That is: std::expected<std::expected<T, E1>, E2> flattens to std::expected<T, std::variant<E1, E2>>.
This flattening preserves all information.

Note that if E1 and E2 are empty types, std::variant<E1, E2> is analogous to an error code enum with two possible values.
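
The rule can be sketched generically. As before, `expected_sketch` is a hand-written C++17 stand-in for the proposed std::expected, and this `flatten` overload is a hypothetical helper; the sketch assumes T, E1 and E2 are all distinct types.

```cpp
#include <cassert>
#include <utility>
#include <variant>

// Hypothetical stand-in for the proposed std::expected<T, E>.
// It assumes T and E are different types.
template <typename T, typename E>
class expected_sketch {
    std::variant<T, E> storage_;

public:
    expected_sketch(T value) : storage_(std::in_place_index<0>, std::move(value)) {}
    expected_sketch(E error) : storage_(std::in_place_index<1>, std::move(error)) {}

    explicit operator bool() const { return storage_.index() == 0; }
    const T& value() const { return std::get<0>(storage_); }
    const E& error() const { return std::get<1>(storage_); }
};

struct value_not_found {};
struct func_not_called {};

// Flattens expected<expected<T, E1>, E2> into expected<T, variant<E1, E2>>.
// Every state of the input maps to exactly one state of the output.
template <typename T, typename E1, typename E2>
expected_sketch<T, std::variant<E1, E2>>
flatten(const expected_sketch<expected_sketch<T, E1>, E2>& nested) {
    using error = std::variant<E1, E2>;
    if (!nested)
        return error(std::in_place_index<1>, nested.error());          // failed because of E2
    if (!nested.value())
        return error(std::in_place_index<0>, nested.value().error());  // failed because of E1
    return nested.value().value();
}
```

After flattening, `std::holds_alternative` on the error tells the two failure reasons apart, so nothing is lost.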

It is important to point out that this is not actually the flattening from the M-word.

Just for the sake of completeness: what happens when we mix std::expected and std::optional?

If we remember that std::optional<T> is std::expected<T, std::monostate>, the flattening rules follow naturally:
std::optional<std::expected<T, E>> is std::expected<T, std::variant<E, std::monostate>> is std::expected<T, std::optional<E>>.
And std::expected<std::optional<T>, E> is std::expected<std::expected<T, std::monostate>, E> is std::expected<T, std::optional<E>>.

If you think about them, this makes sense.
In both cases we have three states: a T, a failure to produce one because of E, or a failure to produce one for generic reasons.

You might argue that we’re losing information because the generic failure happens in a different order,
but that isn’t really usable information anyway.
It is just a “generic failure”.

We know that the std::expected flattening rules are well-formed because std::optional<std::optional<T>> is std::expected<std::expected<T, std::monostate>, std::monostate> is std::expected<T, std::variant<std::monostate, std::monostate>> is std::expected<T, std::monostate> is std::optional<T>.
The optional flattening rules simply follow!

std::optional<std::optional<T>> flattens to std::optional<T>, losing some information, but that information wasn’t really there in the first place

other flattening rules follow from treating std::optional<T> as std::expected<T, std::monostate>

You Don’t Want Nested Optionals or Expecteds

Dealing with nested optionals and expecteds is awkward: you have to check multiple layers, write .value().value().value(), and so on.
So in real code you would avoid them: as soon as you have them, you’d flatten them, possibly manually.

And again, flattening nested optionals does not lose you any usable information by itself.
The empty states only gain semantic meaning from context.
If the context isn’t there, they’re equivalent.

So if you are writing a user-facing, high-level API you would never return a nested optional or expected on purpose!
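
Take the generic lookup() declaration from earlier as an example. The sketch below repeats it with a placeholder `database` type and a hypothetical `lookup_result_t` alias, and uses `static_assert` to show which return types come out of it.

```cpp
#include <optional>
#include <string>
#include <type_traits>
#include <utility>

struct database {}; // placeholder; the real type is not shown

/// Does a database lookup, returns `std::nullopt` if it wasn't found.
template <typename T>
std::optional<T> lookup(const database& db, std::string name);

// Hypothetical alias: the type lookup<T>() returns (never actually called).
template <typename T>
using lookup_result_t = decltype(lookup<T>(std::declval<const database&>(), std::string{}));

// As written, the signature has a single layer of optional ...
static_assert(std::is_same_v<lookup_result_t<int>, std::optional<int>>);
// ... the nesting only appears once a caller picks an optional for T.
static_assert(std::is_same_v<lookup_result_t<std::optional<int>>,
                             std::optional<std::optional<int>>>);
```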

Just looking at it, this API doesn’t return a nested optional.
But as we’ve seen a nested optional appears if T is an optional itself.

Yet this API has done nothing wrong.
For its intents and purposes, T is just some opaque generic type.
It doesn’t really concern itself with the exact details.
All generic code using that API will never realize that it is in fact a nested optional, it just deals with a std::optional<T> where T is “something”.

Only the final user that explicitly passed a std::optional<T> to it will end up with a nested optional.
But the API itself didn’t create one “on purpose”; it happened “accidentally”, so to speak.

Once you write std::optional<std::optional<T>> you should flatten it.
If you just write std::optional<U> where U might be a std::optional<T> but you don’t care, you’re good.

Automatic Flattening?

So when we immediately flatten nested optionals once we got them, why not do that automatically?
Why not make std::optional<std::optional<T>> and std::optional<T> the same type?

I proposed that on Twitter without thinking too much about the consequences and without this 2800 word essay to back up my justifications,
so it just seemed harmful and weird to do.

Of course a std::optional<std::optional<T>> and std::optional<T> are different things:
One is a T that might not be there, the other is a std::optional<T> that might not be there.
But as I might have convinced you, the distinction isn’t really usable without any context.
Both just model a T that might not be there.

So I think I’m justified in wanting to do that, but sadly it is still impractical.

Implementation for std::optional or std::expected is left as an exercise for the reader.
Note that for std::expected there are two implementations: one on the value and one on the error.
And the flatten I’ve described doesn’t really match the flatten expected here (no pun intended).

Note that map() and and_then() are really similar.
In one case the function transforms every element individually, yielding a single element.
In the other case the function transforms every element into a container again.

You can even implement and_then() by calling map() and then flatten() on the result.
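
For std::optional that relationship can be sketched as follows. The free functions live in a hypothetical `sketch` namespace and are assumptions for this illustration; they are not the standard library's interface.

```cpp
#include <cassert>
#include <optional>
#include <type_traits>
#include <utility>

namespace sketch {

// map(): applies f to the contained value and wraps the result in an optional.
template <typename T, typename F>
auto map(const std::optional<T>& opt, F f) {
    using result = std::invoke_result_t<F&, const T&>;
    if (!opt)
        return std::optional<result>{};
    return std::optional<result>(std::in_place, f(*opt));
}

// flatten(): removes one layer of nesting.
template <typename T>
std::optional<T> flatten(const std::optional<std::optional<T>>& nested) {
    if (!nested)
        return std::nullopt;
    return *nested;
}

// and_then(): f itself returns an optional, so map() produces a nested
// optional and flatten() removes the extra layer again.
template <typename T, typename F>
auto and_then(const std::optional<T>& opt, F f) {
    return sketch::flatten(sketch::map(opt, std::move(f)));
}

} // namespace sketch
```

Calling `sketch::map()` with a function that returns a `std::optional<int>` yields a `std::optional<std::optional<int>>`; `sketch::and_then()` with the same function yields a plain `std::optional<int>`.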

And clearly for std::vector there is a huge difference between a std::vector<T> and std::vector<std::vector<T>>.

But for std::optional?

I’ve argued, not really.
Yet still you’d have to think about which one you do:
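
A sketch of such a situation, assuming hypothetical free functions `sketch::map()` and `sketch::and_then()` for std::optional and a made-up `double_then_divide` function:

```cpp
#include <cassert>
#include <optional>
#include <type_traits>
#include <utility>

namespace sketch {

// map(): wraps the result of f in an optional.
template <typename T, typename F>
auto map(const std::optional<T>& opt, F f) {
    using result = std::invoke_result_t<F&, const T&>;
    if (!opt)
        return std::optional<result>{};
    return std::optional<result>(std::in_place, f(*opt));
}

// and_then(): f already returns an optional, which is passed through as-is.
template <typename T, typename F>
auto and_then(const std::optional<T>& opt, F f) {
    using result = std::invoke_result_t<F&, const T&>;
    if (!opt)
        return result{};
    return f(*opt);
}

} // namespace sketch

std::optional<int> double_then_divide(std::optional<int> opt) {
    // First lambda: int -> int.
    opt = sketch::map(opt, [](int i) { return 2 * i; });
    // Second lambda: int -> std::optional<int>.
    opt = sketch::and_then(opt, [](int i) -> std::optional<int> {
        if (i == 0)
            return std::nullopt;
        return 4 / i;
    });
    return opt;
}
```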

The first lambda returns an int, so you use map().
The second returns a std::optional<int>, so you use and_then().
If you accidentally use map() you have a std::optional<std::optional<int>>.

Thinking about that distinction is annoying:
Composing optionals is awkward enough already in C++, such differences shouldn’t matter.

A single function should just do the right thing, no matter what you throw at it.
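
Such a function might be sketched like this. The name `sketch::then` and the `is_optional` trait are hypothetical; the idea is simply to pick map() or and_then() behaviour at compile time based on what the callable returns.

```cpp
#include <cassert>
#include <optional>
#include <type_traits>
#include <utility>

namespace sketch {

// Detects whether a type is some std::optional<U>.
template <typename T>
struct is_optional : std::false_type {};
template <typename U>
struct is_optional<std::optional<U>> : std::true_type {};

// Hypothetical flattening composition: acts like map() for plain results and
// like and_then() when f already returns an optional.
template <typename T, typename F>
auto then(const std::optional<T>& opt, F f) {
    using result = std::invoke_result_t<F&, const T&>;
    if constexpr (is_optional<result>::value) {
        return opt ? f(*opt) : result{};
    } else {
        return opt ? std::optional<result>(std::in_place, f(*opt))
                   : std::optional<result>{};
    }
}

} // namespace sketch
```

Either kind of lambda now produces a single-layer `std::optional<int>` when the input is a `std::optional<int>`.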

Yes, this is mathematically impure and doesn’t really implement a monad for std::optional.
But C++ isn’t category theory, it’s fine to be pragmatic.
You wouldn’t really have templates taking “monads” anyway: while they are mathematically similar, their actual usage and performance characteristics are too different.

Note that I am not saying that monads should automatically flatten in general.
Just std::optional.

And you can still have the proper monadic functions if you want.
They just shouldn’t be the default.

Similarly, composing multiple functions returning expecteds should flatten in a similar way.
You wouldn’t want a nested std::expected, you want a single std::expected combining all errors.

Note that this automatic flattening on composition has precedent:
Rust’s counterpart to expected, Result<T, E>, flattens in a similar way to what I’ve described.
If you’re composing functions returning Result<T, E1> in a function returning Result<T, E2>,
the ? operator converts the errors automatically, provided E2 implements From<E1>.

Conclusion

The empty state of std::optional<T> does not have any inherent meaning.
It just means “empty”.
Only the origin gives it meaning such as “not found”.

As such a std::optional<std::optional<T>> only means T or empty or really empty.
Without additional context that is the same as std::optional<T>.
Flattening a nested optional does lose information, but not usable information.

If you want to give special meaning to the empty state use std::expected<T, E> where E is that special meaning.
Flattening a nested expected preserves all information.

As working with nested optionals or expecteds is awkward, they want to be flattened.
Flattening automatically every time breaks in generic code; flattening on composition is a bit mathematically impure, but it works.

With that information we can also answer the comparison problem outlined in Barry’s blog post.
What should f6(std::nullopt, std::nullopt) return?

As std::nullopt doesn’t have any special meaning on its own, all instances are equal.
It does not matter how many nested optionals we have.
