Notes (HPFP 02/31): Hello Haskell

2 Hello Haskell!

2.1 Hello, Haskell

GHC: The Glasgow Haskell Compiler is a program (written in Haskell and C) turns Haskell code into x86 or ARM executables. There are many other Haskell compilers, but GHC is the de facto standard. Some Haskell compilers, such as Hugs, UHC and Yhc are no longer actively developed, but are notable for historic reasons, others, such as ghcjs and Frege (technically a separate dialect) are also important, since they target different platforms than GHC (Javascript and the JVM, respectively).

GHCi: GHC’s interactive mode, or REPL. A REPL, or read-eval-print-loop, reads code that you type into a command line shell, evaluates it, prints the results, and then loops so you can type in more code.

Cabal: Haskell’s Common Architecture for Building Applications and Libraries. When we write software, we often want to include external libraries, Cabal is a tool that enables us to easily include those libraries in our own software, as well as to publish libraries that can be used by others.

Hackage: A central package archive which contains most of the Haskell libraries you’re ever going to want to use. Cabal installs packages from Hackage by default.

Cabal hell: When we install external libraries using cabal, those libraries are installed globally on our machine by default. This means that if we want to build two different projects that depend upon the same external library, the naive way to do so is to install that library on our machine with cabal and then include it in our projects. This seems innocuous, but is actually very bad, because it creates a hidden interdependence between our two projects.

Suppose the library author added some new features in the library that we want to use in one of our projects. It seems obvious that we could just update the library and modify our project to use the new features, right? Except if we do that, we might break every other project on our machine that depends on that library, if the updated version wasn’t backwards compatible. And if the project that we’re currently working on itself depends on one of the other now-broken projects, we have to dive into a tangled web of hidden interdependencies to get anything working. This pattern, is called dependency hell, in general and is certainly not unique to cabal. There are a number of different ways to manage this problem, but the general approach is to use some other tool that allows you to install multiple copies of the same library on a per project basis so that changes in one project don’t propagate to other projects. Cabal itself provides a mechanism for doing this which allows you to install packages into self-contained “sandboxes,” but there are other tools, such as Stack (see below), which is what HPFP uses.

One extremely interesting way to solve the problem of dependency management that I should note is to use a system package manager like Nix or Guix that supports this type of package sandboxing and referentially transparent builds for anything you install on your machine, not just Haskell packages (which is all Cabal or Stack deals with). And you can even extend this to running an entire OS, like NixOs or GuixSD that manages the whole OS this way. That is, you can install the Nix package manager on e.g. macOS, but macOS packages or applications won’t be self-contained. In NixOs however, everything, including OS libraries have isolated dependencies, which makes it radically more difficult to break things when you, for example, update your operating system.

Stack: A tool for building Haskell projects and manage dependencies. I like to think of Stack as a really nice user interface over a lot of common Haskell development functionality like installing packages with cabal, or managing different versions of GHC. Stack also installs packages from Stackage which is a mirror of Hackage that makes some guarantees about packages being compatible. Libraries on Hackage are more like nightly or unstable builds, up to date, but possibly brittle. Libraries on Stackage are farther behind, but stable.

Installing Stack:

I started the book with Arch Linux as my operating system, but I switched to NixOS part of the way through. Setting up Stack on Arch was really easy, but maintaining it proved fairly nontrivial. Apparently, just naively updating GHC and Stack with pacman (Arch’s package manager) can result in some epic dependency hell.

When I investigated, I found a post somewhere (which I can’t seem to find now) that proposed the following:

Install stack with pacman

Install stack again with stack.

Use pacman to remove the system package of stack that you installed with pacman.

Use the stack that stack installed

This was too silly for me to even try, but the fact that it seemed almost plausible told me that I should probably do my Haskell work on a different OS. So I added a NixOS partition on my machine on the theory that it’s better to have a system that’s difficult to learn and easy to debug than the other way round.

Accordingly, installing Stack on NixOS in a way that things seem to mostly behave was non-trivial. But it’s been pretty smooth sailing ever since (knock on wood). Here’s a rough outline of what I did:

Installed Stack by adding stack to my system packages in my configuration.nix

Enabled Stack’s Nix integration by adding

nix:
enable: true
shell-file: shell.nix

to the stack.yaml of any project I’m using stack with. You can also add

nix:
enable: true

to your .stack/config.yaml if you want to globally enable stack’s nix integration.

The main thing this does is it allows stack to use the versions of GHC installed via nix from the nixpkgs repository. It also brings the zlib libray into scope and fixes an environment bug that prevents programs run by stack (like GHCi) from correctly displaying Unicode. There’s a lot of ways to extend this on a per-project basis, but I’ve found the above to be a good framework.

This is a pretty boilerplatey solution, but considering this book really depends on Stack I thought it better to stick to the text rather going the cabal2nix route and having a more idiosyncratic build process.

2.5 Understanding expressions

There’s a lot of inferential distance between the previous chapter and this one. I think it’s important to point out that even though Haskell is based on the lambda calculus, they are by no means the same thing. Lambda Calculus is the theoretical foundation of a lot of languages, Haskell being only one of them. But even if the foundations are the same, the structure of the languages might be very different, based on the decisions of its designers. Start with Lambda Calculus, build one kind of structure, and you get Haskell, but start over and build a different kind of structure and you get Lisp.

The book skips over a lot of those design decisions and rightly so. Better to get started playing with the language as soon as possible. But you should at least be aware that there is some hidden complexity here. And if you find yourself stuck on some concept, it’s probably because some implicit detail, which the author knows but you don’t, is blocking you.

This happens to everyone. Just remember that every big complicated concept in computer science is made up of little obvious steps. Computers can only do little obvious steps, but they do very many of them very very fast. Whenever you’re stuck, go back to the last step that was obvious to you, and try to figure out what the next obvious step is.

In this case, I highly recommend the follow-up resources in the previous chapter, which are a great place to start at trying to close the complexity gap between Haskell and the lambda calculus.

Exercises: Comprehension Check

let half x = x /2let square x = x * x

let area x = 3.14 * (square x)

let area x = pi * (square x)

2.6 Infix Operators

It may be helpful to point out that all infix operators are just syntactic sugar over functions.

syntactic sugar: Syntax within a programming language that is designed to make things easier to read or to express. It makes the language “sweeter” for human use: things can be expressed more clearly, more concisely, or in an alternative style that some may prefer. (from Wikipedia)

Exercises: Parentheses and Association

Precedence: Where the implicit parentheses are. Precedence of operators is only relevant if you leave out parentheses. If you can explicitly mark what operations you want evaluated in what order, precedence doesn’t matter. Or, in other words, parentheses have the highest precedence, just like in grade school (the P in PEMDAS stands for parentheses).

Don’t worry, GHC isn’t scolding you (much). You shouldn’t want to run redundant or contradictory code either. Much better to find out that your code is broken in a message from the compiler than, for example, when your Mars Lander crashes into the surface because your code had a type error and mixed up feet and meters (this actually happened).

As for why GHC is perfectly happy to accept contradictory declarations in GHCi let expressions but not in a source file, this is because everything in GHCi happens inside the IO () type (IO meaning input/output). IO let expressions have different scope than declarations in a source file.

This seems mysterious, but will make sense in later chapters.

Exercises: Heal the Sick

2.8 Arithmetic functions in Haskell

Oh, gosh. Modular arithmetic.

My recommendation is to read the chapter and read this Wikipedia page on the Modulo operation. You’ll see there that pretty much every programming language has its own idiosyncratic way of doing the modulo operation, and there isn’t a lot of consistency between them.

By the way, divMod and quotRem are two useful functions which return pairs of div and mod, and quot and rem, respectively, applied to the same arguments:

In my opinion I wouldn’t stress too much about all the detail here. In practice, div and mod are almost always what you need, and that the distinction between divMod and quotRem only appears when you’re dealing with negative integers. Really the difference is just what “round down” means.

Both arguments are positive, so there is no difference. 13 divided by 4 the ordinary way is three and a fourth, 3.25. But since this is integer division we round down to 3. Since 3 * 4 is 12 the remainder is one (13 - 12 = 1).

But what if we do (-13) divided by 4. If we were doing ordinary division we would get negative (-3.25).

Now this is the question: What does it mean to “round down” (-3.25)?

The two possible options are (-3) and (-4).

If we say that (-3.25) rounded down is (-3), then we run into the difficulty that (-3) is actually greater than (-3.25):

So we have a choice to make. We want integer division to “round down”, but when one of the arguments is negative, “down” can mean two different things. We have to decide if we want “down” to be toward 0, in which case we’ll choose (-3) in the above example, or towards \(-\infty\), negative infinity, in which case we choose (-4).

But I think this is less clear. In general I like to use parentheses and $ to better illuminate the structure of whatever the expression is, which depending, on the specific expression that might require slightly different styles. let and where expressions can also be a great help.