What it was like writing my first "real" Haskell program

I spent much of my spare time in 2018 studying Haskell. There are a number of
good resources, but it was the hardest self-teaching effort I've made.

Did I learn anything? Was it worth it?

In order to find out, I spent the first
weeks of 2019 writing two variations of a "real" program, one in Python and one
in Haskell. The program is a web scraper that collects artist and track
information from the Kahvi Collective. No, it does not
download all their music. That would be legal but really rude. (You
should check out their collection though; it's great listening when you need
to stay creative and focused.) I chose to write a scraper because I've written
a few dozen in Python for clients, and because they cover several important
areas such as data storage, network IO and parsing potentially messy input.

Python

For the Python version, I used Scrapy to
manage scraping. Scrapy is a well-established framework. It includes
classes, functions, and documented best practices for extracting and transforming
scraped data, plus a custom object type to represent it. It supports
plugins to persist data, crawl pages, avoid duplicate page requests, and,
oh yes, a whole lot more. I've used Scrapy on a number of projects, and it
has proven to be effective and flexible.

The biggest source of bugs I've come across with Scrapy is parsing HTML. It's
common to review a site and anticipate that the data will be presented
with specific tags. You may collect several sample pages and build tests
around them, only to find out during a live run that you missed an edge case,
or that the site itself is inconsistent.

For example, if you look at the
first release on Kahvi,
you'll see that the artist name links to an artist page. The
eighth release has no such link.
That changes how you need to access the artist's name, and what information can be
retrieved about a given artist. Mistakes here can lead to exceptions being
thrown in production, or to missing data when Scrapy successfully plows through.

Overall, the fact that I cited bad input as the biggest source of trouble
with Scrapy should tell you that it's a solid framework that does the job well.

Haskell

For Haskell I used the Scalpel
library to manage scraping. Scalpel is a library rather than a framework:
it provides functions to retrieve HTML pages and extract tags and text.
Anything beyond that (storage, parsing, transformations) is left to the
programmer.
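
To give a rough sense of the scale, here's the shape of a small Scalpel
scraper. This is a sketch, not code from my project: the URL, the CSS
classes, and the Release record are invented for illustration, and the real
Kahvi pages are structured differently.

    import Text.HTML.Scalpel

    -- A hypothetical record for one scraped release.
    data Release = Release
      { releaseArtist :: String
      , releaseTitle  :: String
      } deriving Show

    -- Scrape every release listed on a (made-up) index page.
    releases :: Scraper String [Release]
    releases = chroots ("div" @: [hasClass "release"]) $ do
      artist <- text ("span" @: [hasClass "artist"])
      title  <- text ("span" @: [hasClass "title"])
      return (Release artist title)

    main :: IO ()
    main = do
      -- scrapeURL returns Nothing if the request or the scraper fails.
      result <- scrapeURL "http://example.com/releases" releases
      maybe (putStrLn "scrape failed") print result

Everything around that core (writing results to disk, cleaning up the text,
deciding which pages to visit) is plain Haskell you write yourself.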

This was my first time writing any program in Haskell. Getting it to work, at
least this first time, was a challenge.

The biggest show-stopper happened when I got stuck trying to work with
Haskell's strict type system.
I had a hard time correctly combining pieces of data scraped from the HTML
while simultaneously addressing the possibility of data being unavailable.

If you think about that for a moment, you may realize that
this is the exact same problem I had with the Python scraper! The difference
was that Haskell's type system simply would not let the program compile
until I fully addressed how to handle a Nothing result. Combining
possible Nothings was tough for me to get right. I eventually e-mailed
Scalpel's author to ask for a hand. He was able to help me work through it.
Thanks!

The first time it compiled, it worked.

One thing I've noticed with Haskell is that even though the type system has a
steep learning curve, it's extremely consistent. The solution I was taught
for dealing with Nothing values did not rely on Scalpel at all,
and could readily be applied to almost any scenario where one needs to build
up a piece of data from multiple inputs in Haskell.
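
The idea, roughly, is that Maybe values can be combined applicatively: if
every piece is present you get the assembled value, and if any piece is
Nothing the whole result is Nothing. A minimal sketch, with a Track type and
field names invented just for this example:

    -- A hypothetical record built from several optional pieces.
    data Track = Track
      { trackArtist :: String
      , trackTitle  :: String
      } deriving Show

    -- If either piece is missing, the whole Track is missing.
    mkTrack :: Maybe String -> Maybe String -> Maybe Track
    mkTrack mArtist mTitle = Track <$> mArtist <*> mTitle

    -- mkTrack (Just "Jim Black") (Just "Elysian Underground")
    --   == Just (Track "Jim Black" "Elysian Underground")
    -- mkTrack Nothing (Just "Elysian Underground")
    --   == Nothing

Nothing about that depends on scraping; the same pattern works anywhere you
assemble a value from inputs that might not be there.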

Summary

Python's framework was a more complete, batteries-included solution. The
learning curve to start applying Python felt much less steep than it did with
Haskell.

Haskell's libraries were more than adequate to get the job done, and less
"opinionated" about how you could put the parts together. Haskell was also
much less prone to unexpected run-time errors. It would be possible to
compile a Haskell program that looked for an HTML comment and got Nothing,
but it would be tough to trick Haskell into compiling a program that
didn't have a specific, intentional way to handle the Nothing case.

Was it worth it?

Pragmatically... maybe?

It seems to me that the practice of thinking about types and
separating pure functionality from side effects could be helpful and
transferable while working in other languages. It's also very clear Haskell
has some real advantages in terms of preventing runtime bugs. Both of those
ought to be worth something, but I don't know how hard it would be to land a
Haskell job or whether the benefits justify a pay bump. I might not suggest
learning Haskell if your biggest concerns are getting started quickly or making
money, unless you know it's necessary for a particular industry. I hope to
complete more Haskell projects and may have more to say about the pragmatics
then.

Does fun count?

If so, then Haskell was absolutely worth it. Before I learned Haskell I'd
seen the term composability tossed around, but didn't have a practical sense
of it. In practice, it means being able to string together numerous operations
on a single piece of data. Here's an easy snippet I found fun:

strip . takeWhile ('/' /=) . drop 1 . dropWhile (':' /=)

That's four functions strung together that would take "#413: Jim Black /
Elysian Underground," and return "Jim Black." Not counting my English
punctuation...! Haskell uses techniques like this all over the place, not just for
strings and not just for doing one thing after another. Because the language
is so consistent and composable, overcoming the challenges and getting to the
fun parts is a real joy.
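
Here's that snippet as a small, self-contained program. It uses Data.Text,
which provides a strip function; my actual scraper differed in the details,
so treat the names here as illustrative.

    {-# LANGUAGE OverloadedStrings #-}

    import qualified Data.Text as T
    import qualified Data.Text.IO as TIO

    -- Pull the artist out of a title like "#413: Jim Black / Elysian Underground".
    artistName :: T.Text -> T.Text
    artistName = T.strip . T.takeWhile ('/' /=) . T.drop 1 . T.dropWhile (':' /=)

    main :: IO ()
    main = TIO.putStrLn (artistName "#413: Jim Black / Elysian Underground")
    -- prints "Jim Black"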