The Expression Problem in Rust

Oct 30, 2018

Eli Bendersky’s website is fantastic and you should read it. In one
series of posts he discusses the expression problem: a
software design problem where in some languages it is easy to add new data types
and in others it is easy to add new operations on data types. I like thinking
about the expression problem because similar concepts appear in the design of
so-called “deep learning” frameworks like MXNet and Tensorflow where operations
and data types are abstracted out and treated symbolically.

To summarize his excellent post, in object-oriented languages it is easy to add
new data types. However, to add a new operation you need to modify the object
interface and revisit each data type definition to implement the new operation.
(The key word here, I believe, is revisit in the sense that the modification
violates the open-closed principle.) Alternately, in functional
programming languages the situation is reversed: adding a new data type means
having to modify the definition of each operation, also violating the
open-closed principle.

I believe that Rust solves this problem using traits. In this post I
will demonstrate how this is so. My knowledge of Rust is really elementary at
the time of writing this so hopefully I haven’t violated any major Rust
principles. My code may not be of top-level Rust design but it works.

A Motivating Example

Let’s replicate Eli’s example in Rust: a simple expression evaluator. Although
he brings up this example in the context of compilers and interpreters
expression construction and evaluation is also the bread and butter of deep
learning frameworks.

Eli’s examples use a base class (in OOP) or enum (in FP) called Expr from
which all expression types are derived. From my reading of Rust such a notion
may not be necessary, or perhaps even appropriate, if the goal is to perform
operations on data types. Therefore, in my definition of Constant I will omit
any “inheritance” from a parent type. See my final thoughts at the end of this
post for more about design decision.

We begin with the definition of a Constant data type,

#[derive(Debug)]structConstant<T>{value:T}

We include the derive(Debug) macro to auto-magically make the data type
printable for debugging purposes. Below is some code using this struct,

Let’s now make Constant expressions “evaluatable”. To do so, we’ll create a
new trait Eval which implements an eval() method. For Constant expressions
eval() should simply return a copy of stored value. Note that we could
implement eval() as a struct method but the end goal, here, is to have shared
operation functionality across multiple data types.

Because of Rust’s ownership model we return a copy/clone of Constant.value so
that the struct can retain ownership of its value. Otherwise, the contents of
value are moved out of the Constant struct to the caller. (Aside: is there
a better way to do this?) Let’s test that (a) the eval() implementation works
and (b) that the ownership properties are indeed satisfied.

Adding a New Type

Again, I am not yet confident in my Rust skills but my understanding is that the
BinaryPlus type should not actually own the data stored in its left and
right attributes. In particular, we want the lifetime of BinaryPlus’s left
attribute to last at least as long if not longer than the BinaryPlus instance,
itself. Hence, the lifetime specifications. Regardless, trying to compile this
data type without lifetime parameters results in a compile error.

Note the requirements on the generic data types. Whatever type T is must be
add-able with output type also T. The left and right arguments must be
Eval-able; after all, we call eval() on them. What’s interesting is that
these restrictions are not present in the actual definition of the BinaryPlus
type, above. Only if you want to use the Eval functionality do you have to
narrow down the traits of the arguments.

Note that in adding a new type we did not have to modify and previous
definitions of the Eval trait. Instead, we just specified how the Eval trait
behaves with this new type. Therefore the FP version of the expression problem
is resolved; we haven’t violated the open-closed principle.

Adding a New Operation

Time to add the second example operation to_string() as a method of the trait
Stringify and implement it for both the Constant and BinaryPlus data
types.

Nice. Again we see that adding the Stringify trait does not require
modifications to the previously-written code for the Constant and BinaryPlus
data types. Instead, we modify the capabilities of these types with new trait
implementation code. Therefore, Rust also resolves the OOP version of the
expression problem.

I suppose the caveat, here, is if you wanted to modify the definition of trait
Stringify to include a new method. Here is one way to do it without having to
return to the definition of Stringify,

Once again, no existing code was modified. I added the “extension trait” and
implemented it for any type which implements Stringify. This is not quite the
same pattern as I read in this article but, then again,
Option<T> is a type (enum), not a trait. Regardless, this technique is useful
if you want to extend traits that don’t belong to you.

Final Thoughts

Learning Rust has been educational. Just like when I played around with Haskell
I came away looking at code design a little bit differently in other languages.
(My post-Haskell Python code used more functional than iterative styles.)

Earlier I mentioned that using an Expr enum seemed unnecessary for modeling
the expression problem in Rust. Strictly speaking, to accurately model the
problem I should have created an enum like so,

enumExpr{ConstantExpr(Constant),BinaryPlusExpr(BinaryPlus),}

The user could then create ConstantExpr and BinaryPlusExpr instances, both
of which are of type Expr. The implementation details would be “hidden”
underneath in the actual underlying structs, Constant and BinaryPlus,
respectively. My thought is the use of an Expr enum does not add anything to
resolving the expression problem: the only goal is to have a model where we can
add data types and operations on those data types independently and easily such
that the open-closed principle isn’t violated.

That being said, Rust enums cannot be extended and therefore the open-closed
principle is violated. However, there is a clunky workaround.

The role of the Expr interface in object-oriented programming languages is to
establish a shared interface across all sub-types. The role of the Expr enum
in functional programming languages is so that you can perform matching on
various sub-types. With Rust traits the data types and operations on those data
types are much more separated since you can just check if an input type
implements a trait. Therefore, for the purposes of this problem I’m comfortable
in omitting the Expr enum and calling the problem solved. Please prove me
wrong because it seems too good to be true.

This website once was my online research journal while I was working on my Ph.D. Now it is an outlet for interesting mathematics- and programming-related ideas and musings. I like to share things I find useful.