Thursday, December 19, 2013

Learning typed functional programming: obstacles & inroads

Yesterday, there was a discussion on a mailing list I'm on about a perception of gender diversity problems in the communities around functional programming, type theory, and programming language theory, even relative to other areas of computer science. After some speculation about education barriers and community approachability, I decided to conduct an informal survey on Twitter:

if you think typed/functional programming is harder/more intimidating than other kinds, can you say why in a tweet? won't argue just curious
— chrisamaphone (@chrisamaphone) December 17, 2013

You can read several of the collected responses (filtered to leave out sarcasm and criticisms of languages) on Storify. I left "typed", "functional", and the combination thereof intentionally ambiguous because I was interested in how people interpreted those words as well as their reactions to whatever programming constructs they associated with them.

Because I promised not to argue with anyone who replied, a bunch of thoughts have been tumbling around in my head about how different my experiences have been from the ones described in the responses. In general, I agree about the cultural tendencies and have observed plenty of that myself.

But I think what a lot of the non-people-focused responses seem to be telling me is that we're doing a terrible job of advertising, explaining, and demonstrating what typed-functional languages do, and especially what types are good for.

The reason I'm excited about types now, 8 or 9 years after using a Hindley-Milner type-inferred language for the first time, is that there's a correspondence with logic that gives rise to all kinds of useful and fascinating research. (By the way, this is why that one response saying "I have a logic brain but not maths" kind of broke my heart!) But if I think back to why learning ML felt like a godsend after a few years of Java, C, and C++, I remember a few different things:

- SML/NJ had a REPL. I could experiment with tiny pieces of the language and the code I was trying to write before putting them all together in a big file with a top-level entry point.

- Signatures (declarations of new types and functions) were separate from structures (the modules that implement them). I could think about the interface I wanted to program to without actually writing the code to do it.

- Algebraic/inductive datatypes and pattern matching. Ok, so this is secretly very related to Curry-Howard and the things that excite me about types now, but at the time the ability to write

datatype 'a tree = Leaf | Node of 'a * 'a tree * 'a tree

...instead of implementing a collection of functions to do pointer or field manipulation, and then immediately being able to write traversal & search functions (again without manipulating pointers or fields), just felt like a huge practical advantage. An entire data structure definition in one line! These building blocks -- type variables, recursive definitions, disjunction (|) and tupling (*) -- felt like they could make Lego-like structures which were so much more tangible to me than the way I would encode the same thing in C or Java. (Later I would learn that this is the difference between positive and negative types, both of which have their uses, but half of which are missing or at least highly impoverished in most OO settings.)

- Highly related to the above point, the ability to "make illegal states unrepresentable", & thus not have to write code to handle ill-formed cases when deconstructing data passed into a function.

- The thing Tim said:

@chrisamaphone Even early on, I found *typed* programming easier because I was getting feedback from the computer...
— Tim Chevalier (@eassumption) December 19, 2013

@chrisamaphone ...doing math (on paper) makes me anxious because what if I'm doing it wrong and I'd never know? So it doesn't really...

@chrisamaphone ...matter to me whether it's functional or not, but the two tend to go together.
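To make a few of the points in the list above concrete, here is a rough sketch in Standard ML. It is an illustration under assumptions, not code from the original experience: the names INT_SET, TreeSet, nonempty, and maximum are all hypothetical. It shows a signature written separately from its implementation, search and traversal over the tree datatype by pattern matching alone, and a type for which the "empty" case simply cannot be constructed.

```sml
(* All names here (INT_SET, TreeSet, nonempty, maximum) are hypothetical. *)

(* The interface, which can be written before any implementation exists. *)
signature INT_SET =
sig
  type set
  val empty  : set
  val insert : int * set -> set
  val member : int * set -> bool
  val toList : set -> int list
end

(* One possible implementation: an unbalanced binary search tree. *)
structure TreeSet :> INT_SET =
struct
  datatype 'a tree = Leaf | Node of 'a * 'a tree * 'a tree
  type set = int tree

  val empty = Leaf

  (* Insert by pattern matching; duplicates leave the tree unchanged. *)
  fun insert (x, Leaf) = Node (x, Leaf, Leaf)
    | insert (x, t as Node (y, l, r)) =
        if x < y then Node (y, insert (x, l), r)
        else if x > y then Node (y, l, insert (x, r))
        else t

  (* Search: no pointers or fields, just the two shapes a tree can have. *)
  fun member (_, Leaf) = false
    | member (x, Node (y, l, r)) =
        if x < y then member (x, l)
        else if x > y then member (x, r)
        else true

  (* In-order traversal yields the elements in ascending order. *)
  fun toList Leaf = []
    | toList (Node (x, l, r)) = toList l @ (x :: toList r)
end

val s = TreeSet.insert (2, TreeSet.insert (3, TreeSet.insert (1, TreeSet.empty)))
val sorted  = TreeSet.toList s
val found   = TreeSet.member (3, s)
val missing = TreeSet.member (7, s)

(* A list that cannot be empty, so maximum needs no error case. *)
datatype 'a nonempty = One of 'a | More of 'a * 'a nonempty

fun maximum (One x) = x
  | maximum (More (x, rest)) = Int.max (x, maximum rest)

val biggest = maximum (More (4, More (9, One 2)))
```

Pasting these declarations into a REPL one at a time prints the inferred type of each, which is one form of the feedback described in the tweets above. Because the structure is sealed with `:>`, code outside it can only use the four operations named in the signature.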

Over a couple of years, I found that I was less drawn to the "safety net" features of type systems and more to their utility as design tools. In other words, I found myself discovering that types let you carve out linguistic forms that map directly onto your problem rather than maintaining in your head whatever subset of the language you're using actually corresponds to the problem domain. A super basic example of this is just the ability to create a finite enumeration of, say, colors, rather than using integers, a subset of which you're using to represent different colors. But this goes way beyond base types. I think of every program I write as coming with its own DSL, and I want the language I'm using to support me programming in that DSL as directly as possible, with a type-checker to let me know when I've stepped outside my design constraints. Dependent types especially have a lot to offer in that arena.
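The color example might look something like the following in Standard ML. This is only a sketch under assumptions: nameOfCode, color, shape, drawing, and areaOf are hypothetical names chosen for illustration. The first function shows the integer encoding, where a catch-all case has to exist; the rest shows a small problem-specific vocabulary built from datatypes.

```sml
(* All names here are hypothetical, chosen for illustration. *)

(* Integer encoding: only 0, 1, and 2 mean anything, and only by
   convention, so a catch-all case for every other int is required. *)
fun nameOfCode 0 = "red"
  | nameOfCode 1 = "green"
  | nameOfCode 2 = "blue"
  | nameOfCode _ = raise Fail "not a color"

(* Finite enumeration: the three colors are the only values of the type. *)
datatype color = Red | Green | Blue

fun name Red   = "red"
  | name Green = "green"
  | name Blue  = "blue"

(* A slightly larger vocabulary built from the same pieces. *)
datatype shape = Circle of real | Rect of real * real
type drawing = (color * shape) list

fun area (Circle r)    = Math.pi * r * r
  | area (Rect (w, h)) = w * h

(* Total area of all shapes of a given color in a drawing. *)
fun areaOf c (d : drawing) =
  List.foldl (fn ((c', s), acc) => if c' = c then acc + area s else acc) 0.0 d

val picture = [(Red, Rect (2.0, 3.0)), (Blue, Circle 1.0), (Red, Rect (1.0, 1.0))]
val redArea = areaOf Red picture
```

If a fourth color is later added to the datatype, a compiler such as SML/NJ would typically warn that the match in `name` is no longer exhaustive, which is the kind of "stepped outside my design constraints" signal the paragraph above describes.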

Anyway, the point of this is not to tell folks that have struggled with learning type systems that you're wrong or that you just didn't get it. I hear "this was too hard for me" and then that you can write web code in JavaScript or a raytracer in C, and that boggles me, because that's so much harder for me. So I thought it might be elucidating to share my experience, which seems very different from yours.

I'm also directing some frustration at educators (including myself) for being apparently so bad at getting these kinds of things across, and at adapting typed languages & research projects to the needs of people who aren't already entrenched in them, that any programmer could think they are not smart enough to use them. Personally, I don't feel smart enough to do without them.

6 comments:

I feel I have not done a good job at all in effectively promoting typed functional programming, which I myself find tremendously fun, joyful, liberating, and practical. I used to think "it's not my job to market", but I don't have that view of the world any more. I would definitely like to collaborate with others who also want to spread the word in a friendly, effective way!

I am very curious what people mean when they talk about "math" or not being "good" at it, because this is sounding like an important barrier to overcome, opening up a lot of questions about the nature of early educational experiences outside of computing that have an impact on whether people even enter the field or what they choose to gravitate to once in it.

Rank speculation: A lot of people have traumatic experiences associated with math, because math is frequently taught in elementary school (computer science rarely is). In particular, math teachers at that level are usually poorly trained (due to the structural disincentives for people with math education to enter K-12 teaching) and/or lack enthusiasm for the subject.

Moreover, at that time in a person's schooling, it's common for a student to be shamed (publicly or privately) and told they're "not good at math". Because socially, math isn't considered a necessary skill (unlike reading), it's easy for a student to deal with this kind of treatment through avoidance rather than mastery. This is, by the way, completely understandable for a child who has never been told why math is worth doing and has only been taught that it's a tool that will be used to humiliate them and demonstrate their inadequacy.

So when many adults -- even adults who have enough analytical reasoning ability to be programmers -- hear the word "math", they think back to those experiences, to the time when they were told "you're no good at this", and they freeze up, or else feel the need to prove why math is some useless ivory-tower theory garbage, because of their own feelings of insecurity to do with the disservice that their school system did them.

This is rank speculation because I didn't go to school until college, but I did tutor high school dropouts for a brief period of time, and over and over I'd run into a student who kept saying "I'm not good at math" even though I was there to help them be better at it.

this may come as a restatement of replies you have already received, but: when I program by using types and signatures to define the "DSL" of my codebase, it takes extra effort, and can't happen as a simple two-stage process of "first define the boundaries/interfaces, then write your program logic, with the second stage much easier thanks to the first". rather, I often don't know exactly what datatypes I'll need until much of my program logic is already written. so, taking advantage of the type system as a design tool requires jumping back and forth between hacking on expressions and polishing types. I am willing to do that extra overhead because I know in advance that it will help my code quality in the long run; that is, because I am already practiced at using the type system for this. plus, this was a skill that I slowly got better at (by doing two compilers courses in functional PLs) once I already learned functional programming. so I can see how people who aren't familiar get the perspective that "types get in the way because you have to get them completely right before you can get any real programming done".

This brings up a topic that may be relevant: module systems, and interface vs. implementation. Languages such as Standard ML and OCaml have for decades promoted "programming in the large" by means of abstract data types through signatures, so that you can actually code against the signatures before you've implemented the types at all. Did you use ML languages, and if so, how did you incorporate this into your workflow?