Lane's Blog » Haskell — An Acceptable Blog
Christopher Lane Hinson (http://clanehin.wordpress.com)

After about five years programming in Haskell, I think we need a rule: only put one function in a typeclass.

Why? Because inevitably someone comes along with a data type for which one or the other function of a typeclass is perfectly suited, and yet another function of the same typeclass is not implementable.

All of the abstract ways to construct Nothing: fail (Monad), mempty (Monoid), mzero (MonadPlus), empty (Alternative). Not surprisingly, all of these typeclasses are subtly related.
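All four agree wherever they overlap; a quick check at Maybe (note that the Monad fail mentioned here moved to the MonadFail class in modern GHC, which is what this snippet uses):

```haskell
import Control.Applicative (Alternative (empty))
import Control.Monad (MonadPlus (mzero))

-- The four abstract "Nothings", all instantiated at Maybe [Int]:
nothings :: [Maybe [Int]]
nothings =
  [ fail "no parse" -- Monad (today: MonadFail)
  , mempty          -- Monoid
  , mzero           -- MonadPlus
  , empty           -- Alternative
  ]
```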

The natural numbers, which have a minBound, but not a maxBound (Bounded) . . .

. . . and which support addition and multiplication, but aren’t closed under subtraction and for which the concept of a sign does not exist (Num).
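A sketch of the point with Peano naturals (names here are mine, not from any library): addition and multiplication are total, and a lawful minimum exists, but subtraction is only partial, so a full Num instance (with (-), negate, and signum) cannot be written honestly:

```haskell
-- Peano naturals: closed under addition and multiplication,
-- but not under subtraction, and with no sensible sign.
data Nat = Z | S Nat deriving (Eq, Show)

add :: Nat -> Nat -> Nat
add Z     n = n
add (S m) n = S (add m n)

mul :: Nat -> Nat -> Nat
mul Z     _ = Z
mul (S m) n = add n (mul m n)

-- A lawful minBound exists...
minNat :: Nat
minNat = Z

-- ...but there is no maxBound, and subtraction is partial:
-- 'sub Z (S _)' has no answer among the naturals.
sub :: Nat -> Nat -> Maybe Nat
sub n     Z     = Just n
sub Z     (S _) = Nothing
sub (S m) (S n) = sub m n
```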

My own memoizable message type, which needs a monadic computation to implement pure (Applicative).

It’s a little extra typing to write multiple “class . . . where” clauses for each type that needs to implement a large number of type-indexed functions, but it’s quite easy to combine related typeclasses when appropriate, as follows:
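A minimal sketch of the pattern (the primed names are mine): each function gets its own class, and the "combined" typeclass is nothing but a superclass constraint, so types that can only support one of the functions are never forced to lie about the other.

```haskell
-- One function per typeclass:
class HasEmpty a where
  empty' :: a

class HasAppend a where
  append' :: a -> a -> a

-- Combining related single-function classes when appropriate:
class (HasEmpty a, HasAppend a) => Monoid' a

instance HasEmpty [b] where
  empty' = []

instance HasAppend [b] where
  append' = (++)

instance Monoid' [b]

-- A type like the naturals could take HasAppend (as addition)
-- without being obliged to invent an ill-fitting sibling function.
```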

In conclusion, you should definitely follow this rule if I have convinced you that it is a good idea to follow it.

In reactive programming we can choose between two models: “pull,” in which we run a computation each time output is required, and “push,” in which we run the computation each time input arrives.

Which model we use depends on whether we are working with high-frequency or low-frequency data. If we are writing a piece of avionics software that measures pitch, yaw, and roll, then we need to constantly adjust the plane’s aerodynamic surfaces based on those variables. We don’t need a notification when these variables change, because they change constantly. The pull model would be perfect in this case.

On the other hand, engine temperature is every bit as critical to the health of the vehicle, but presumably that variable remains in equilibrium for long stretches of time, and small variations aren’t important. We don’t want to waste CPU time monitoring temperature 100 cycles per second. We might simply want to receive a notification whenever the engine temperature changes by 1 degree or more. The push model works better here.

The problem: How do I efficiently embed a low-frequency signal in a high-frequency channel? If I pass the low-frequency signal naively, it will work, but entail much redundant computation.

When we want to avoid recomputing a value, we often use a memoization strategy. However, in this case we need to memoize a data stream, not a function.

In the engine temperature example, it would be easy to memoize a function of type Int -> a. But we want to compose this function as part of a signal. After all, if the engine temperature is low frequency, then so is any signal derived from the engine temperature, and the chain of transformations should be memoized along its entire length. Further, rounding to an Int imposes an arbitrary boundary that can artificially amplify a meaningless low-amplitude, high-frequency component of an otherwise low-frequency signal: suppose some engine vibration causes the temperature to oscillate rapidly between 198.9 and 199.05 degrees, which truncate to 198 and 199. That is not the notification heuristic we are looking for.
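The boundary problem is easy to demonstrate with the numbers above: a 0.15-degree vibration that happens to straddle 199.0 flips the truncated reading on every single sample, so a "notify when the Int changes" rule fires constantly.

```haskell
-- A tiny vibration around 199.0 degrees...
samples :: [Double]
samples = take 6 (cycle [198.9, 199.05])

-- ...becomes a full-sized oscillation after truncation,
-- even though the underlying signal is low frequency.
truncated :: [Int]
truncated = map truncate samples
```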

The solution: Tag information with a unique signature at its point of departure, and then memoize it at the point of arrival. Transformations of the data stream also need to be tagged: a source signature is either a unique integer, or a record of applying one signature to another. There is some overhead associated with comparing signatures, but this overhead should never exceed the cost of performing the underlying operations.

Memoizable messages are very similar to applicative functors. They cannot, however, implement the Control.Applicative interface, because any pure constructor would be unsigned and would therefore destroy memoization.

This memoization scheme requires three operations:

Transmit: Sign a message with a unique signature, indicating its source. If a subsequent signal is sufficiently similar, reuse the same signature.

Transform: Apply one signed signal to another, combining their signatures, so that derived signals remain memoizable.

Receive: Unpack a signal, memoizing against the signature of the previous input.
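A minimal sketch of the data involved (all names are mine; this is my reconstruction, not the post's actual implementation): a signature is either a fresh integer or a record of one signature applied to another, and the receiver caches the last (signature, value) pair.

```haskell
-- A source signature: a unique integer, or a record of
-- applying one signed computation to another.
data Sig = Unique Integer
         | Apply Sig Sig
         deriving (Eq, Show)

-- A signed message: a payload tagged with its provenance.
data Msg a = Msg Sig a

-- Transmit: sign a payload at its source.
transmit :: Integer -> a -> Msg a
transmit n x = Msg (Unique n) x

-- Transform: applying a signed function to a signed value
-- combines their signatures, so derived signals stay memoizable.
transform :: Msg (a -> b) -> Msg a -> Msg b
transform (Msg sf f) (Msg sx x) = Msg (Apply sf sx) (f x)

-- Receive: recompute only when the signature changes;
-- otherwise reuse the previously memoized result.
receive :: Maybe (Sig, b) -> Msg a -> (a -> b) -> (Sig, b)
receive (Just (oldSig, cached)) (Msg sig _) _
  | oldSig == sig = (sig, cached)
receive _ (Msg sig x) k = (sig, k x)
```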

Imagine being killed by a bow and arrow. That would suck, an arrow killed you? They would never solve the crime. "Look at that dead guy. Let’s go that way." — Mitch Hedberg

I seem to be one of the few people who absolutely adores arrows. I thought it might be helpful if I provided some insight to the advanced-level newbies regarding the practical use of arrows in Haskell. Plenty has already been written about what arrows are in category theory, how to implement the Arrow typeclass, and how to use the special arrow syntax. I want to talk about why I occasionally wake up in the morning thinking, "Maybe I could solve this problem using arrows!"

This article is just to share an extremely simple, intuitive and concrete example of an arrow. I’m not going to get into all the crazy amazing things arrows can do. I just want to show that they can do at least one cool useful thing.

Arrows are closely related to monads. For example, both arrows and monads can be used to capture side-effecting operations.

Monads have the kind * -> *, indicating a side-effecting operation with a single output type, and a binding operation m a -> (a -> m b) -> m b. The input to a monadic operation is provided via a pure function. Arrows, on the other hand, have a kind * -> * -> *, indicating a side-effecting operation with a single input type and a single output type, and a binding operation a b c -> a c d -> a b d.
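Concretely, using Kleisli arrows (the standard wrapper from Control.Arrow that turns any monad into an arrow), both halves of an arrow pipeline exist and are composed before any output is available:

```haskell
import Control.Arrow (Kleisli (..), (>>>))

-- Monadic bind: the next operation is *constructed* from the
-- previous output via a pure function:
--   (>>=) :: m a -> (a -> m b) -> m b
-- Arrow composition: both operations exist before any output does:
--   (>>>) :: a b c -> a c d -> a b d

double, addTen :: Kleisli IO Int Int
double = Kleisli (\x -> return (x * 2))
addTen = Kleisli (\x -> return (x + 10))

-- The whole pipeline is assembled statically, before it runs.
pipeline :: Kleisli IO Int Int
pipeline = double >>> addTen
```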

When binding monads, it’s obvious from the type signature that the subsequent side-effecting operation does not even exist until after the output of the previous operation becomes available. It cannot, because it depends on the previous output and because we can inject any pure function to construct the subsequent operation based on completely arbitrary criteria.

All monads are arrows, but not all arrows are monads. This observation is pertinent to their implementation in Haskell, but I'm going to restrict this article's discussion to "interesting arrows": specifically, arrows that aren't monads because they don't implement ArrowApply. Interesting arrows are basically monads without flow control: you can't generally choose which side-effecting actions to perform based on things you learn during the execution of the arrow.

This is why arrow notation creates two scopes. Between the <- and -< symbols (the arrow expression itself), only values that were in scope before execution of the arrow are available. Outside the <- and -< (the argument expression and everything downstream), values bound during the execution of the arrow are also available.
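The two scopes are easy to see in a small example (requires GHC's Arrows extension; the names are mine):

```haskell
{-# LANGUAGE Arrows #-}

import Control.Arrow (Arrow, arr, returnA)

scaleBy :: Arrow a => Int -> a Int Int
scaleBy k = arr (* k)

example :: Arrow a => a Int Int
example = proc x -> do
  -- Between <- and -<: 'scaleBy 2' must exist before the
  -- arrow runs, so it may not mention x...
  y <- scaleBy 2 -< x + 1  -- ...but the argument 'x + 1' may.
  -- Outside <- and -<, values bound during execution (y) are in scope.
  returnA -< x + y
```

Instantiating the arrow as an ordinary function, example 3 computes 3 + (3 + 1) * 2.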

For example, we (you and I) might have a monad that allows us to perform certain dangerous operations, like overwriting files. In a monadic context, we cannot anticipate what any particular instance of the monad will choose to do. We might write a very complicated installation script that accesses many files. Do we have permission to write to all of those files? Do the files we want to read even exist? Do we access an infinite number of files?

Do we ever write to /dev/nuclear_missiles?

We would like to know the answers to these questions before running the installation script, otherwise we could be interrupted and leave the system in a chaotic state. Even if we could recover from an error, we would still be wasting the user’s time, which is the opposite of the thing that computers are for.

But if we implemented our IO environment using an arrow, we could anticipate all of the side-effecting operations, even have a list of files to be overwritten before the operation begins.

In our new file IO arrow, it will be impossible to read the name of a file from a file, and then write to that file dynamically, because all file names must be specified at the time the arrow is bound. That’s a pretty onerous restriction, but we can always add new operations later, if we need them.

Our arrow needs a list of accessed files and an IO action. The list of file paths is going to take the form of a monoid, while the sequence of IO actions will take the form of a Kleisli arrow.

data IORWA a b = IORWA [FilePath] (a -> IO b)

We need a category instance.

instance Category IORWA where

Implementing id is easy: id accesses no files, so we give an empty file list, and return is the simplest monadic action that type checks.

id = IORWA [] return

The bind operation requires that we concatenate two lists of file paths, and bind the IO actions. (This is a little annoying, but note that the (.) operator specifies the preceding action second and the subsequent action first.)
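A sketch of that composition (my reconstruction, with the surrounding declarations repeated so the snippet stands alone):

```haskell
import Control.Category (Category (..))
import Control.Monad ((>=>))
import Prelude hiding (id, (.))

data IORWA a b = IORWA [FilePath] (a -> IO b)

instance Category IORWA where
  id = IORWA [] return
  -- (.) takes the subsequent action first and the preceding action
  -- second, so the file lists concatenate "backwards" while the
  -- IO actions compose with Kleisli composition (>=>).
  IORWA files2 g . IORWA files1 f = IORWA (files1 ++ files2) (f >=> g)
```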

And we need read/write operations, in which we simply pack the file path parameter into the file list. Notice that we take the file path as a static parameter, but we take the data to write as an input to the arrow.

writeFileA :: FilePath -> IORWA String ()

writeFileA path = IORWA [path] $ \s -> writeFile path s

readFileA :: FilePath -> IORWA () String

readFileA path = IORWA [path] $ \_ -> readFile path
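The post doesn't show the Arrow instance itself, but a plausible sketch follows directly (again repeating the declarations so the snippet stands alone): pure computations access no files, and first threads the unused pair component through.

```haskell
import Control.Arrow (Arrow (..))
import Control.Category (Category (..))
import Control.Monad ((>=>))
import Prelude hiding (id, (.))

data IORWA a b = IORWA [FilePath] (a -> IO b)

instance Category IORWA where
  id = IORWA [] return
  IORWA files2 g . IORWA files1 f = IORWA (files1 ++ files2) (f >=> g)

instance Arrow IORWA where
  -- Pure computations access no files.
  arr f = IORWA [] (return . f)
  -- 'first' runs the action on the first component of a pair and
  -- threads the second component through untouched.
  first (IORWA files f) = IORWA files $ \(x, y) -> do
    x' <- f x
    return (x', y)
```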

Using our arrow is as simple as exporting accessor functions for the accessed file list and the IO action, as long as we refuse to export any way to corrupt the synchronization between the two fields.
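The exported interface might be just these two accessors (names mine), with the IORWA constructor itself kept hidden:

```haskell
data IORWA a b = IORWA [FilePath] (a -> IO b)

-- Inspect every file the arrow may touch, before any IO runs.
accessedFiles :: IORWA a b -> [FilePath]
accessedFiles (IORWA files _) = files

-- Actually perform the side effects.
runIORWA :: IORWA a b -> a -> IO b
runIORWA (IORWA _ act) = act
```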

We could implement ArrowChoice. This would allow us to choose at runtime between accessing two different sets of files. Both possibilities would appear in our static accessible file list, but only one would actually be accessed.
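A sketch of that instance (my reconstruction, self-contained): left keeps the full static file list, even though at runtime only one branch executes its action.

```haskell
import Control.Arrow (Arrow (..), ArrowChoice (..))
import Control.Category (Category (..))
import Control.Monad ((>=>))
import Prelude hiding (id, (.))

data IORWA a b = IORWA [FilePath] (a -> IO b)

instance Category IORWA where
  id = IORWA [] return
  IORWA files2 g . IORWA files1 f = IORWA (files1 ++ files2) (f >=> g)

instance Arrow IORWA where
  arr f = IORWA [] (return . f)
  first (IORWA files f) = IORWA files $ \(x, y) -> fmap (\x' -> (x', y)) (f x)

instance ArrowChoice IORWA where
  -- The static file list is kept whole, even though at runtime
  -- only the Left branch ever runs this action.
  left (IORWA files f) = IORWA files $ \e -> case e of
    Left x  -> fmap Left (f x)
    Right y -> return (Right y)
```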

We could use a modified ReaderArrow to capture rewriting rules for file paths, e.g., to specify a current working directory. We can’t use ReaderArrow directly, because it would route information through the monadic component of the computation.

We could use a WriterArrow to retain a log of all of the data we actually write.

We could use an ErrorArrow to recover from file system errors.

We could implement ArrowLoop based on the MonadFix instance of IO.

We could use the Automaton arrow to implement multi-phase read/write cycles. Perhaps the first phase would be read-only, then we could check the file list again before proceeding to the second phase.

We could re-implement what we just wrote in terms of the StaticArrow and Kleisli arrows,
and get a metric ton of the above for free.
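A sketch of that refactoring, using a hand-rolled Static wrapper rather than any particular arrow-transformers package, so the API here is illustrative: pairing a monoid of file paths (the static part) with a Kleisli arrow over IO (the dynamic part) yields the Category and Arrow instances almost for free.

```haskell
import Control.Arrow (Arrow (..), Kleisli (..))
import Control.Category (Category (..))
import Prelude hiding (id, (.))

-- A static component f wrapped around an arrow a.
newtype Static f a b c = Static (f (a b c))

-- Any Applicative static part lifts composition straight through.
instance (Applicative f, Category a) => Category (Static f a) where
  id = Static (pure id)
  Static g . Static f = Static ((.) <$> g <*> f)

instance (Applicative f, Arrow a) => Arrow (Static f a) where
  arr f = Static (pure (arr f))
  first (Static f) = Static (fmap first f)

-- The static part is the monoid of accessed paths; the dynamic
-- part is an ordinary Kleisli arrow over IO.
type IORWA = Static ((,) [FilePath]) (Kleisli IO)

readFileA :: FilePath -> IORWA () String
readFileA path = Static ([path], Kleisli (\_ -> readFile path))
```

The writer-like Applicative of ((,) [FilePath]) accumulates the file lists while Kleisli composition sequences the IO, which is exactly the bookkeeping the hand-written instance did by hand.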