Today I hope to start a new series of posts exploring constructive abstract algebra in Haskell.

In particular, I want to talk about a novel encoding of linear functionals, polynomials and linear maps in Haskell, but first we're going to have to build up some common terminology.

Having obtained the blessing of Wolfgang Jeltsch, I replaced the algebra package on hackage with something... bigger, although still very much a work in progress.

(Infinite) Modules over Semirings

Recall that a vector space V over a field F is given by an additive Abelian group on V, and a scalar multiplication operator(.*) :: F -> V -> V subject to distributivity laws

s .* (u + v) = s .* u + s .* v
(s + t) .* v = s .* v + t .* v

and associativity laws

(s * t) .* v = s .* (t .* v)

and respect of the unit of the field.

1 .* v = v

Since multiplication on a field is commutative, we can also add

(*.) :: V -> F -> V
v *. f = f .* v

with analogous rules.

But when F is only a Ring, we call the analogous structure a module, and in a ring, we can't rely on the commutativity of multiplication, so we may have to deal left-modules and right-modules, where only one of those products is available.

We can weaken the structure still further. If we lose the negation in our Ring we and go to a Rig (often called a Semiring), now our module is an additive moniod.

If we get rid of the additive and multiplicative unit on our Rig we get down to what some authors call a Ringoid, but which we'll call a Semiring here, because it makes the connection between semiring and semigroup clearer, and the -oid suffix is dangerously overloaded due to category theory.

First we'll define additive semigroups, because I'm going to need both additive and multiplicative monoids over the same types, and Data.Monoid has simultaneously too much and too little structure.

We could go one step further and define multiplication pointwise, but while the direct product of |e| copies of a ring _does_ define a ring, and this ring is the one provided by the Conal Elliot's vector-space package, it isn't the most general ring we could construct. But we'll need to take a detour first.

Linear Functionals

A Linear functional f on a module M is a linear function from a M to its scalars R.

That is to say that, f : M -> R such that

f (a .* x + y) = a * f x + f y

Consequently linear functionals also form a module over R. We call this module the dual module M*.

Dan Piponi has blogged about these dual vectors (or covectors) in the context of trace diagrams.

If we limit our discussion to free modules, then M = E -> R, so a linear functional on M looks like (E -> R) -> Rsubject to additional linearity constraints on the result arrow.

The main thing we're not allowed to do in our function is apply our function from E -> R to two different E's and then multiply the results together. Our pointwise definitions above satisfy those linearity constraints, but for example:

bad f = f 0 * f 0

does not.

We could capture this invariant in the type by saying that instead we want

we'd have to make a new such type every time we subclassed Semiring. I'll leave further exploration of this more exotic type to another time. (Using some technically illegal module instances we can recover more structure that you'd expect.)

Now we can package up the type of covectors/linear functionals:

infixr0 $*
newtype Linear r a = Linear {($*) :: (a -> r) -> r }

The sufficiently observant may have already noticed that this type is the same as the Cont monad (subject to the linearity restriction on the result arrow).

In fact the Functor, Monad, Applicative instances for Cont all carry over, and preserve linearity.

(We lose callCC, but that is at least partially due to the fact that callCC has a less than ideal type signature.)

In addition we get a number of additional instances for Alternative, MonadPlus, by exploiting the knowledge that r is ring-like:

This has to do with the fact that when you change basis, you change map the change over covariant vectors covariantly, and map the change over vectors contravariantly. (This distinction is beautifully captured by Einstein's summation notation.)

Functions vary covariantly in their result, and contravariant in their argument. E -> R is contravariant in E. But we chose this representation for our free modules, so the vectors in our free vector space (or module) are contravariant in E.

We have lots of classes for manipulating covariant structures, and most of them apply to both (Linear r) and (Map r b).

Other Representations and Design Trade-offs

One common representation of vectors in a free vector space is as some kind of normalized list of scalars and basis vectors. In particular, David Amos's wonderful HaskellForMaths uses

newtype Vect r a = Vect { runVect :: [(r,a)]}

for free vector spaces, only considering them up to linearity, paying for normalization as it goes.

Given the insight above we can see that Vect isn't a representation of vectors in the free vector space, but instead represents the covectors of that space, quite simply because Vect r a varies covariantly with change of basis!

Now the price of using the Monad on Vect r is that the monad denormalizes the representation. In particular, you can have multiple copies of the same basis vector., so any function that uses Vect r a has to merge them together.

On the other hand with the directly encoded linear functionals we've described here, we've placed no obligations on the consumer of a linear functional. They can feed the directly encoded linear functional any vector they want!

To wax categorical for a moment, we can construct a forgetful functor U : Vect_F -> Set that takes a vector space over F to just its set of covectors.

F E = (E -> F, F,\f g x -> f x + g x ,\r f x -> r * f x)

using the pointwise constructions we built earlier.

Then in a classical setting, you can show that F is left adjoint to U.

In particular the witnesses of this adjunction provide the linear map from (E -> F) to V and the function E -> (V ~> F) giving a linear functional on V for each element of E.

In a classical setting you can go a lot farther, and show that all vector spaces (but not all modules) are free.

But in a constructive setting, such as Haskell, we need a fair bit to go back and forth, in particular we wind up need E to be finitely enumerable to go one way, and for it to have decidable equality to go in the other. The latter is fairly easy to see, because even going from E -> (E -> F) requires that we can define and partially apply something like Kronecker's delta:

You can still probe Linear r a for individual coefficients, or pass it a vector for polynomial evaluation very easily, but for instance determining a degree of a polynomial efficiently requires attaching more structure to your semiring, because the only value you can get out of Linear r a is an r.

Optimizing Linear Functionals

In both the Vect r and Linear r cases, excessive use of (>>=) without somehow normalizing or tabulating your data will cause a lot of repeated work.

This is perhaps easiest to see from the fact that Vect r never used the addition of r, so it distributed everything into a kind of disjunctive normal form. Linear r does the same thing.

If you look at the Kleisli arrows of Vect r or Linear r as linear mappings, then you can see that Kleisli composition is going to explode the number of terms.

So how can we collapse back down?

In the Kleisli (Vect r) case we usually build up a map as we walk through the list then spit the list back out in order having added up like terms.

In the Map r case, we can do better. My representable-tries package provides a readily instantiable HasTrie class, and the method:

memo :: HasTrie a => (a -> r) -> a -> r

which is responsible for providing a memoized version of the function from a -> r in a purely functional way. This is obviously a linear map!

Without commutative addition, you don’t get many normalization opportunities. So, in the interest of drawing the line somewhere I left that off. ;) I also avoided talking about left-seminnearings and right-seminearrings. =) I do enjoy playing with the individual grains of sand in my sandbox though.

You say “If we get rid of the additive and multiplicative unit on our Rig we get down to what some authors call a Ringoid, but which we’ll call a Semiring”, but earlier you “…go to a Rig (often called a Semiring)”. Are both of them Semirings?

This is what I get for trying to nod to the fact that different authors use the same words for different things.

The vocabulary I’m working with here is that a semiring is a pair of semigroups with a pair of distributive laws and that a rig adds a 0 and 1 to a semiring.

Some authors call what I am calling a rig above a semiring, which is somewhat annoying because the semi- just means ‘not-quite’ in this setting, and has no connection to the relationship between group and semigroup. Moreover it leaves the question of what to call the pair-of-semigroups construction that I’m calling semiring above. The name those authors often use is “ringoid”, but that conflicts with the much more useful use of ringoid for referring to Ab-enriched categories, which is likely to wind up in my semigroupoid package.

If a group is a groupoid with one object then a ringoid is a ringoid with one object.

I suppose that “replicate1p” and “pow1p” both define repetition. IMHO you can name them with “+” and “*”, adding some symbol to denote repetition, maybe “…”? (And with the right order of arguments in “pow1p”. :) )

@beroal: i didn’t make them symbolic because they have the problem that what do you name the three different operators for ((N+1)*), (N*) and (Z*) ?

whereas the precedent of log1p at least gives some intuition to the name replicate1p, and the action of replicate on lists if the repetition of a monoidal value. I had, however originally started using (#*) and (*#) for times and may revert. They are mostly provided as reasonable default implementations for the natural number multiplication and integer multiplication requirements placed on you by the module superclasses.

As for the free module, technically I’m permitting the infinite free module. Haskell provides me a largely coinductive universe, I’d be remiss in not taking advantage of it. I’ll throw an (infinite) in there. ;) But unlike the classical case I can’t prove the duality in most cases to begin with and I’m not able to enumerate the set of all e in E such that E -> R is non-zero except when the set E is finitely enumerable itself, because I can’t intensionally inspect the function I’m given. Effectively the constraint moves from requiring the function to have a finite number of non-zeros to requiring that any linear functional can only inspect a finite number of vectors, which being constructive is all it can do anyways.

The invariant in question is linearity. One (potentially conservative) way to enforce that would be to never apply our continuation vector twice and take the product of the results (or add it to some value in the ring), but instead only multiply it by values we have lying around in the ring or add and subtract them. If you think about it, quantifying over the choice of a module then prevents violations of linearity. \k -> r .* f k is well typed, we can plumb in zeroes, etc, but \k -> f k * f k requires our module be strengthened to a semiring, and the unmentioned but also non-linear \k -> f k + r also would fail to type check, since f k :: m, and under quantification over the module all we’re permitted to do is use the (.*) and the (+) from our module. the reason this is a problem is if you want to use the zero from a semigroup-with-zero module, you’d need a new type.

There is also a performance impact from the fact that we effectively use the LeftModule dictionary as an interpreter, and most damningly it costs us the ability to use representable-tries for cheap memoization.