Last year I stumbled across a simple representation for partial information about values, and wrote about it in two posts, A type for
partial values and Implementing a type for partial values.
Of particular interest is the ability to merge two partial values into one that carries the information present in each.

This post combines partial values with unambiguous choice (unamb).
It describes how to work with partial values in Haskell natively, i.e., without using any special representation and without the use
restrictions of unambiguous choice.
I got inspired to try removing those restrictions during stimulating discussions with Thomas Davie, Russell O’Connor, and others in the #haskell gang.

You can download and play with the library described here.
There are links and a bit more info on the lub wiki page.

Information, more or less

The meanings of programming languages are often defined via a technique called “denotational semantics”.
In this style, one specifies a mathematical model, or semantic domain, for the meanings of utterances in the language, and then writes what looks
like a recursive functional program that maps from syntax to semantic domain.
That “program” defines the semantic function.
Really, there’s one domain and one semantic function for each syntactic category.
In typed languages like Haskell, every type has an associated semantic domain.
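
As a tiny illustration of the style, here's a toy of my own making, not the semantics of any real language. The syntax is a made-up expression type, the semantic domain is simply Integer, and the semantic function is an ordinary recursive definition. The names Expr, Lit, Add, Mul, and meaning are all hypothetical.

```haskell
-- Syntax: a tiny expression language (hypothetical, for illustration only).
data Expr
  = Lit Integer
  | Add Expr Expr
  | Mul Expr Expr

-- Semantic function: maps each piece of syntax to its meaning in the
-- semantic domain, which here is just Integer.
meaning :: Expr -> Integer
meaning (Lit n)   = n
meaning (Add a b) = meaning a + meaning b
meaning (Mul a b) = meaning a * meaning b
```

The meaning of a compound expression is built only from the meanings of its parts, which is what makes the definition look like a recursive functional program.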

One of the clever ideas of the theory of semantic domains (“domain theory”) is to place a partial ordering on values (domain members),
based on information content.
Values can not only be equal and unequal, they can also have more or less information content than each other.
The value with the least information is at the bottom of this ordering, and so is called “bottom”, often written as “⊥”.
In Haskell, ⊥ is the meaning of “undefined”.
For succinctness below, I’ll write “⊥” instead of “undefined” in Haskell code.

Many types have corresponding flat domains, meaning that the only values are either completely undefined or completely defined.
For instance, the Haskell type Integer is (semantically) flat.
Its values are all either ⊥ or integers.

Structured types are not flat.
For instance, the meaning of (i.e., the domain corresponding to) the Haskell type (Bool,Integer) contains five different kinds of
values, as shown in the figure.
Each arrow leads from a less-defined (less informative) value to a more-defined value.
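
To make the five kinds concrete, here's a small sketch with one sample value of each kind. I spell ⊥ as undefined so the code compiles as is, and the names are mine, chosen just for illustration.

```haskell
-- One sample value for each of the five kinds in the domain of
-- (Bool, Integer). `undefined` plays the role of ⊥.
bottomPair, pairOfBottoms, leftOnly, rightOnly, full :: (Bool, Integer)
bottomPair    = undefined               -- ⊥
pairOfBottoms = (undefined, undefined)  -- (⊥, ⊥)
leftOnly      = (True, undefined)       -- (b, ⊥)
rightOnly     = (undefined, 3)          -- (⊥, n)
full          = (True, 3)               -- (b, n)

-- | Succeeds on any value that is at least (⊥, ⊥); fails only on ⊥ itself.
-- Matching the pair constructor does not force either component.
isPair :: (a, b) -> Bool
isPair (_, _) = True
```

In GHCi, isPair pairOfBottoms and fst leftOnly both give answers, while isPair bottomPair and snd leftOnly fail, which is one way to observe that these values really do differ in information content.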

To handle the diversity of Haskell types, define a class of types for which we know how to compute lubs.

Almost

We can fix the too-lazy version by checking that one of the arguments is non-bottom, which is what seq does.
Which one do we check? The one that isn’t ⊥, or either one if they’re both defined.
Our friend unamb can manage this task:
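
Here is a self-contained sketch of how that fix might look. Two loud assumptions: unambSketch is a rough stand-in I wrote so the example runs with base alone (the real unamb lives in Data.Unamb in the unamb package and is more careful about threads and exceptions), and definedPair is an illustrative helper name. The class and the flat instances are included only so the pair instance has something to stand on.

```haskell
import Control.Concurrent (forkIO, killThread)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)
import Control.Exception (SomeException, catch, evaluate)
import System.IO.Unsafe (unsafePerformIO)

-- | Rough stand-in for unamb (NOT the library's implementation): race the
-- two arguments to weak head normal form and return whichever finishes
-- first. Only meaningful when the arguments are equal unless one is ⊥.
unambSketch :: a -> a -> a
unambSketch a b = unsafePerformIO $ do
  v  <- newEmptyMVar
  ta <- forkIO (attempt v a)
  tb <- forkIO (attempt v b)
  x  <- takeMVar v
  killThread ta
  killThread tb
  return x
{-# NOINLINE unambSketch #-}

-- | Evaluate a value and report it; a failing evaluation reports nothing.
attempt :: MVar a -> a -> IO ()
attempt v x = (evaluate x >>= putMVar v) `catch` ignore
  where
    ignore :: SomeException -> IO ()
    ignore _ = return ()

class HasLub a where
  (⊔) :: a -> a -> a

-- Flat types: any defined argument is already the whole answer.
instance HasLub Integer where (⊔) = unambSketch
instance HasLub Bool    where (⊔) = unambSketch

-- | True for any pair that is not ⊥ (its components may still be ⊥).
definedPair :: (a, b) -> Bool
definedPair (_, _) = True

instance (HasLub a, HasLub b) => HasLub (a, b) where
  -- Check that at least one argument is a pair, then merge lazily.
  p ⊔ p' = (definedPair p `unambSketch` definedPair p') `seq` (a ⊔ a', b ⊔ b')
    where
      (a , b ) = p
      (a', b') = p'
```

The seq keeps ⊥ ⊔ ⊥ from turning into (⊥, ⊥), while the lazy bindings in the where clause keep the merge from demanding that both arguments be pairs.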

Functions

A function f is said to be less (or equally) defined than a function g when f is less (or equally) defined than g for every argument
value.
Consequently,

instance HasLub b => HasLub (a -> b) where
  f ⊔ g = \ a -> f a ⊔ g a

More succinctly:

instance HasLub b => HasLub (a -> b) where (⊔) = liftA2 (⊔)
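
Why the two definitions agree: on functions, liftA2 applies both functions to the same argument and combines the results. The sketch below shows this with an ordinary binary operation standing in for (⊔), so it needs nothing beyond base. The names pointwise and viaLiftA2 are mine.

```haskell
import Control.Applicative (liftA2)

-- | The explicit, pointwise form of combining two functions.
pointwise :: (b -> b -> b) -> (a -> b) -> (a -> b) -> (a -> b)
pointwise op f g = \a -> f a `op` g a

-- | The same thing via the Applicative instance for functions, where
-- liftA2 op f g = \a -> op (f a) (g a).
viaLiftA2 :: (b -> b -> b) -> (a -> b) -> (a -> b) -> (a -> b)
viaLiftA2 = liftA2
```

For example, pointwise max (*2) (+10) and viaLiftA2 max (*2) (+10) give the same result at every argument.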

Other types

We’ve already handled the unit type (), other flat types, pairs, sums, and functions.
Algebraic data types can be modeled via this standard set, with a technique from generic programming.
Define methods that map to and from a type in the standard set.

However, this rule would overlap with all other HasLub instances, because Haskell instance selection is based only on the head of an instance definition, i.e., the part after the “=>”.
Instead, we’ll define a HasLub instance per HasRepr instance.
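
A minimal sketch of the representation idea, under some assumptions of mine: I use an associated type family to name the representation type (the library may well do this differently), the method names repr and unrepr are guesses, and viaRepr is a hypothetical helper showing how a per-type instance could delegate to the representation.

```haskell
{-# LANGUAGE TypeFamilies #-}

-- | Types that convert to and from a representation built from the
-- standard set: unit, pairs, and sums (Either).
class HasRepr t where
  type Repr t
  repr   :: t -> Repr t
  unrepr :: Repr t -> t

instance HasRepr (Maybe a) where
  type Repr (Maybe a) = Either () a
  repr Nothing  = Left ()
  repr (Just a) = Right a
  unrepr (Left ()) = Nothing
  unrepr (Right a) = Just a

instance HasRepr [a] where
  type Repr [a] = Either () (a, [a])
  repr []         = Left ()
  repr (a : rest) = Right (a, rest)
  unrepr (Left ())         = []
  unrepr (Right (a, rest)) = a : rest

-- | Lift a binary operation on representations to the original type:
-- convert both arguments, combine, and convert back.
viaRepr :: HasRepr t => (Repr t -> Repr t -> Repr t) -> t -> t -> t
viaRepr op a b = unrepr (repr a `op` repr b)
```

With a lub on the representation type in hand, each per-type HasLub instance could then be a one-liner along the lines of (⊔) = viaRepr (⊔).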


Your use of unamb in defining if' does not meet the required precondition (of information-compatible arguments). From a conversation on #haskell, I know you’ve come up with a really beautiful correct definition. I’d love to see you post the correct version for all to admire.
