Paul Chiusano's blog - FP
Posts categorized as 'fp', which relate to functional programming
http://pchiusano.github.io
If Haskell is so great, why hasn’t it taken over the world? And the curious case of Go.

<p>Programming is all about managing complexity. Without good tools to manage it, the complexity of programs becomes mentally intractable for our limited brains, and we lose control and understanding of our programs (imagine writing a big software system entirely in assembly language).</p>
<p>And so the history of programming has been a series of advancements in both <em>removing barriers to composability</em>, and <em>building new programming technologies that better facilitate composition</em>. To the extent that software has <em>compositional structure</em> (as opposed to <em>monolithic structure</em>), it can be understood and managed by our limited brains, and we can build more complex software via composition of smaller pieces. Also very important is that composable artifacts can be assembled by thousands of people in loose communication, often working in parallel, whereas monolithic artifacts require small teams in close communication, often working sequentially.</p>
<p>So we’ve proceeded in stages:</p>
<ul>
<li>Stage 0: Composability is the most important thing; <em>without it, complexity swallows us</em></li>
<li>Stage 1: Composability requires atomic units of composition with means of combination, <em>therefore functions</em></li>
<li>Stage 2: Composability is limited by side effects, <em>therefore pure functions and functional programming</em></li>
<li>Stage 3: Composability without mechanized reasoning becomes difficult for humans to track, <em>therefore static types</em></li>
<li>Stage 4: Composability is destroyed at program boundaries, <em>therefore extend these boundaries outward, until all the computational resources of civilization are joined in a single planetary-scale computer… this is the idea of <a href="http://unisonweb.org">Unison</a> and <a href="http://unison.cloud">unison.cloud</a></em>.</li>
<li>…</li>
<li>Stage N: We’ll get back to this at the end of this post</li>
</ul>
<p>Pause here. While you can definitely still compose programs from impure functions, doing so is less flexible and also more complicated for the programmer. I’m not saying any form of composition is impossible with impure functions; I’m saying it’s more difficult (for instance, it requires non-local reasoning). Likewise for the other stages. I’m not saying you have no composability without static types; I’m saying that static types more easily facilitate composition given our limited brains. (Also see: <a href="#turing-tarpit">Turing tarpit arguments</a>)</p>
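<p>As a tiny illustration of the stage 2 point (the function and its name here are invented for the example): a pure function can be understood and composed from its type alone, with no thought about hidden state at call sites:</p>

<pre><code class="language-Haskell">import Data.Char (toUpper)

-- Each piece is pure, so the whole composition can be understood by
-- reading the parts in isolation; no call-site context is needed.
normalize :: String -&gt; String
normalize = map toUpper . filter (/= ' ')

main :: IO ()
main = putStrLn (normalize "unison web")  -- prints UNISONWEB
</code></pre>

<p>An impure version of either step (say, one that also logged to a file) would force every caller to reason about ordering and hidden effects, which is exactly the non-local reasoning mentioned above.</p>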
<h2 id="if-haskell-or-xyz-is-so-great-why-hasnt-it-taken-over">If Haskell (or XYZ) is so great, why hasn’t it taken over?</h2>
<p>I’ve <a href="/2016-02-25/tech-adoption.html">written some posts about tech adoption generally</a>. Now I’ll offer a thesis that addresses a nettlesome question:</p>
<blockquote>
<p>If Haskell is so great, why hasn’t it taken over the world?</p>
</blockquote>
<p>But pick any non-mainstream tech that you think is better; it doesn’t have to be Haskell. You can invent all kinds of responses:</p>
<ul>
<li>“It really IS taking over the world, even Java has lambdas (or <code class="highlighter-rouge">&lt;feature-related-to-my-pet-tech&gt;</code>) now!!” (okay, but let’s be real, pure FP at large scale is still not that common)</li>
<li>“It really IS taking over the world, just very slowly…” (I mean, maybe, but you could have said the same thing 10 years ago; how do you know you aren’t fooling yourself?)</li>
<li>“Everyone else outside my little tribe is a moron!” (You really shouldn’t think this… but even if you did think it, you should ask yourself: if Haskell were really that much better, wouldn’t eventually these supposed “morons” be convinced by the mountain of incontrovertible evidence that accumulated in favor of your preferred technology’s vast superiority??)</li>
<li>“It’s too hard to learn” (If your pet technology were 1000x more productive than, say, Java, would this learning curve really be a substantive barrier? Though this question is more complicated than you think—see <a href="/2016-02-25/tech-adoption.html">the tech adoption post</a>—if the multiplier is big enough, will these complexities matter? Imagine if DVORAK or some other keyboard layout were 1000x faster than QWERTY for typing.)</li>
<li>“It’s all about MARKETING. JavaScript has better marketing!! And its projects have cooler logos on their landing pages!” (Okay, now you’re really reaching…)</li>
</ul>
<p>The simplest explanation is probably that Haskell is not <em>that</em> much better than, say, Java, <em>for many of the software systems people write today</em>. Why might this be?</p>
<p>The reason I’ll give is that Haskell’s otherwise excellent composability is destroyed at I/O boundaries, <em>just like every other language</em>. That is, we are at stage 4 above, where the bottleneck to further composition is these program boundaries. Since most software systems (especially those that span multiple nodes) have a large surface area in contact with the outside world, the code devoted to merely getting information at these boundaries into some more computable form is often the bulk of the work; once the data is in computable form, the actual computation needing to be done is easy.</p>
<p>David MacIver has <a href="http://www.drmaciver.com/2015/04/on-haskell-ruby-and-cards-against-humanity/">this quip about early Haskell enthusiasm</a>:</p>
<blockquote>
<p>“Look, I used a monad! And defined my own type class for custom folding of data! Isn’t that amazing?” “What does it do?” “It’s a CRUD app.”</p>
</blockquote>
<p>If you’re writing a CRUD app, or some other computationally boring system that has a large, complex surface area in contact with the outside world, writing code to deal with that program boundary often dominates the codebase.</p>
<p>Where we see Haskell (or more generally, typed FP) excel is for programs that have minimal surface area in contact with the outside world, but with a large amount of interesting computation happening internally. A good example: compilers. Compilers don’t have much interaction with the outside world—just reading some files—but have lots of interesting computation happening internally, for things like typechecking, code generation, and so on. Haskell excels here; I would not be surprised if Haskell were 100x better than Java for writing compilers. Writing CRUD apps? Haskell isn’t as much of a win.</p>
<p>I think this hypothesis also offers an explanation for why Go is popular, even though the language is “boring” and could have been designed in the 1970s. Go has found a niche as basically “a better C” or “a better Java” for writing high-performance servers that do lots of I/O. Unlike C or Java, it has a much more high-level I/O and concurrency story, but the language itself is otherwise very familiar to people with a background in these and other mainstream languages. Thus it serves a niche that wasn’t previously well-covered.</p>
<p>As soon as you need to be defining lots of complex or interesting computations, you start needing languages with good support for composability to manage that complexity. Here Go fails, for all the reasons that people have criticized it. But there’s still a good chunk of services where Go can do quite well!</p>
<p>Haskell programmers might object that, well, Haskell has its own very nice I/O and concurrency story, in many ways more sophisticated than Go (things like software transactional memory, which make writing highly concurrent data structures and algorithms much simpler). But Haskell is “weird”. A C, Java, Python, or Ruby programmer can pick up Go easily. They can’t pick up Haskell so easily, as even in beginner Haskell, you are immediately confronted with lots of unfamiliar concepts. And since Haskell isn’t enough of a win for these “boring” services, Go can still make sense.</p>
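<p>For the curious, here is a minimal sketch of the STM point, assuming the <code class="highlighter-rouge">stm</code> package (the <code class="highlighter-rouge">transfer</code> example itself is invented; the library functions are standard):</p>

<pre><code class="language-Haskell">import Control.Concurrent.STM

-- Move funds atomically; if the source is short, `retry` blocks this
-- transaction until another transaction changes the TVars it read.
transfer :: TVar Int -&gt; TVar Int -&gt; Int -&gt; STM ()
transfer from to amount = do
  balance &lt;- readTVar from
  if balance &lt; amount
    then retry
    else do
      writeTVar from (balance - amount)
      modifyTVar' to (+ amount)

main :: IO ()
main = do
  a &lt;- newTVarIO 100
  b &lt;- newTVarIO 0
  atomically (transfer a b 30)
  balances &lt;- atomically ((,) &lt;$&gt; readTVar a &lt;*&gt; readTVar b)
  print balances  -- (70,30)
</code></pre>

<p>Composing two transfers with <code class="highlighter-rouge">&gt;&gt;</code> inside a single <code class="highlighter-rouge">atomically</code> yields one atomic transaction—the kind of composition that hand-rolled lock-based code famously does not have.</p>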
<h2 id="whats-next">What’s next?</h2>
<p>The <a href="http://unisonweb.org">Unison programming language</a>, and the <a href="http://unison.cloud">unison.cloud</a> platform I’d like to build around it, is my effort to move programming beyond Stage 4. By removing any friction and non-uniformity when programming multi-node software systems, such systems can once again be assembled in a compositional fashion. The better composability of typed, pure FP once again becomes a significant lever, because process boundaries no longer destroy composition.</p>
<p>I wonder what comes after that? When all the obvious barriers to composability have been removed, the ‘composability bottleneck’ must move somewhere else, somewhere that might not be obvious. Like where? Time for some vague speculation…</p>
<p>One other problem we have today is that composability is destroyed at “application” boundaries, at the interface between humans and our programs. We write a bunch of “backend code” in a compositional fashion, then build a bespoke, single purpose UI for interacting with some ad hoc subset of this, which is a dead end for further composition. (<a href="https://www.youtube.com/watch?v=faJ8N0giqzw">See this Conal Elliott talk on this</a>) This is a problem, and it can be solved.</p>
<p>But even if we move beyond that, there will be other composability bottlenecks, and when those are removed, there will be others, and on and on, and in the end… well, I don’t know.</p>
<p>What do you think?</p>
<h2 id="appendix-turing-tarpit-arguments"><a id="turing-tarpit"></a>Appendix: Turing tarpit arguments</h2>
<p>There’s a kind of argument that comes up a lot in discussion of programming languages. I call it a “<a href="https://en.wikipedia.org/wiki/Turing_tarpit">Turing Tarpit</a>” argument: programming tech A isn’t really better than tech B, it’s just that A is a bit more convenient than B for a few hand-picked little examples (implicitly: “Big deal, who cares?”).</p>
<blockquote>
<p>Beware of the Turing tar-pit in which everything is possible but nothing of interest is easy. (Alan Perlis)</p>
</blockquote>
<p>The trouble is that very often, the sorts of examples that are easy to discuss aren’t of sufficient scale to reveal any major differences between A and B. It’s only when building much larger systems that the differences become more than “a little convenience”. That is, Turing tarpit arguments skip any analysis of how or whether the “little more convenience” might grow as N gets larger, and tacitly assume that any language that’s Turing complete is just as good as any other. It’s a bit like saying: “Oh, geez, this heapsort algorithm you’ve got seems rather baroque and complicated. My insertion sort algorithm runs just as fast on this 10-element list.”</p>
<p>Imagine traveling back in time to the days before C, and trying to convince an assembly language programmer that C was a massive step forward for programming. <em>In principle</em>, you could build arbitrary programs by gluing together hand-written fragments of assembly language. In practice, fragments of assembly language aren’t very composable <em>given the limitations of our brains</em>. But you might have a hard time convincing the assembly language programmer of this, because toy examples of the sort that are easy to discuss would not reveal any major differences.</p>
<p>What WAS likely convincing to assembly language programmers was the idea of not having to write the same program 5 times, for each different hardware architecture. This was a clear productivity boost that was immediately understandable to anyone who wrote assembly language and needed to target different architectures. And this huge advantage was enough to get “high-level” languages like C in the door. With time and experience using C, the more subtle, abstract benefits of increased composability of C over assembly language would become more apparent.</p>
Fri, 20 Jan 2017 00:00:00 +0000
http://pchiusano.github.io/2017-01-20/why-not-haskell.html

The advantages of static typing, simply stated

<p>This post summarizes the advantages (and drawbacks) of static typing, as simply as possible. None of these arguments are new or my own, but I wanted to consolidate them in one place.</p>
<p>First, the advantages:</p>
<ul>
<li>
<p><em>A large class of errors is caught, earlier in the development process, closer to the location where they are introduced.</em> <strong>Example:</strong> you have a map whose keys are strings and whose values you expect (elsewhere in your program) to be functions from <code class="highlighter-rouge">Int -&gt; Int</code>, and you accidentally insert a function that returns a <em>list</em> of integers. Oops! With static typing, this error would be caught by the typechecker and pointed out to the programmer at the location where the erroneous value is inserted. In a dynamic language, the error could go unnoticed until much later, when that value gets pulled out of the map and the function applied. As another example of this sort of thing, many languages have a concept of <code class="highlighter-rouge">null</code> (the <a href="https://www.infoq.com/presentations/Null-References-The-Billion-Dollar-Mistake-Tony-Hoare">‘billion dollar mistake’</a>). Even static languages like Java allow <code class="highlighter-rouge">null</code> to be used in place of any type, which leads to situations where unexpected <code class="highlighter-rouge">null</code> values pop up in your program very far from the place where they were erroneously introduced.</p>
</li>
<li>
<p><em>The types guide development and can even write code for you—the programmer ends up with less to specify.</em> <strong>Example:</strong> Haskell has a feature called typeclasses, which can derive aspects of your program purely from the types, which are often inferred. The resulting code can be noticeably shorter than dynamic languages which cannot disambiguate programmer intent via types. More advanced dependently typed languages routinely derive large amounts of boring code with guidance from the programmer and the types specified. <a href="http://strictlypositive.org/">Conor McBride</a> likes to emphasize that types are not merely about preventing errors—they are about declaring more of your intent to the machine so the machine can do more work on your behalf.</p>
</li>
<li>
<p><em>One can refactor with greater confidence</em>, since a large class of errors introduced during refactoring will end up as type errors. <strong>Example:</strong> you have a function that currently returns a single number, and wish to modify it to return a list of numbers. In a static language, updating the declared type signature and fixing any compile errors will catch most, if not all places that need updating.</p>
</li>
<li>
<p><em>Static types can ease the mental burden of writing programs</em>, by automatically tracking information the programmer would otherwise have to track mentally in some fashion. <strong>Example:</strong> you are writing a tricky function, and have two values in scope, <code class="highlighter-rouge">f</code> and <code class="highlighter-rouge">x</code>. The <code class="highlighter-rouge">f</code> you pulled out of a dictionary, the <code class="highlighter-rouge">x</code> was the result of calling <code class="highlighter-rouge">foo(23)</code>. Is it safe to call <code class="highlighter-rouge">f(x)</code>? Types provide an automated, precise answer to this question.</p>
</li>
<li>
<p><em>Types serve as documentation for yourself and other programmers and provide a ‘gradient’ that tells you what terms make sense to write.</em> For someone trained, the types give a sense of what is expressible using any API. Like puzzle pieces with shapes that we can observe fit together, we can think of types as specifying a grammar for programs that ‘make sense’. In dynamic languages (what Bob Harper somewhat trollishly refers to as <a href="https://existentialtype.wordpress.com/2011/03/19/dynamic-languages-are-static-languages/">‘singly-typed’ languages</a>), information about what programs make sense needs to be communicated in other ways. <strong>Example:</strong> many libraries have documentation with examples. Great. But if one of the examples has the line <code class="highlighter-rouge">read(message, peer)</code>, it’s often difficult to ascertain whether a related expression, like <code class="highlighter-rouge">read(fileChunkStream(file1), peer)</code>, is also valid. Types provide immediate answers to these questions, especially for someone trained at reading typeful APIs.</p>
</li>
</ul>
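<p>The first example above can be made concrete with nothing but the Prelude (the names are invented for illustration). The offending insertion is rejected at the site where it occurs, and <code class="highlighter-rouge">Maybe</code> shows the typed alternative to <code class="highlighter-rouge">null</code>:</p>

<pre><code class="language-Haskell">handlers :: [(String, Int -&gt; Int)]
handlers = [("double", (* 2)), ("negate", negate)]

-- The erroneous value from the first bullet: a function returning a
-- *list* of Ints. Uncommenting this makes GHC reject the program at
-- this very line, not later when the function is finally applied:
-- bad = ("explode", \n -&gt; [n, n]) : handlers

main :: IO ()
main = do
  -- And in place of null: a missing key is an explicit Maybe, right
  -- at the point of use.
  print (fmap ($ 21) (lookup "double" handlers))   -- Just 42
  print (fmap ($ 21) (lookup "missing" handlers))  -- Nothing
</code></pre>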
<p>I haven’t mentioned performance, but I’d say it’s easier to efficiently execute programs that are statically typed. Dynamic languages executed naively will have much more dynamic dispatch and runtime checks happening. And while it’s possible to execute dynamic languages efficiently, it’s more complicated. You often need some sort of JIT to get really good performance, whereas static languages can do very well with ahead-of-time compilation. Note that I don’t think performance alone is a great reason to prefer static to dynamic—you can get very good performance from dynamic languages with a good JIT (see <a href="http://luajit.org/">LuaJIT</a>), and that development cost only needs to be paid once. In contrast, the above bullet points affect everyone using the language, pretty much all the time.</p>
<p>Now for some drawbacks to static typing:</p>
<ul>
<li>Like any formalism, types require some investment up front to become fluent in.</li>
<li>Type error messages are frequently poor. Thus, even though errors are caught sooner, the way they are reported to the user can be <a href="/2015-03-26/type-errors.html">frustratingly opaque</a>, which isn’t good for programmer motivation. In the world of static typing enthusiasts, there’s also a segment of people who will state or imply that anyone who gets frustrated by <a href="/2015-03-26/type-errors.html">the user experience of fixing type errors</a> is dumb, not a “real programmer”, etc. There are also <a href="/2016-02-25/tech-adoption.html">complexity apologists</a> who will defend the “type error user experience”.</li>
<li>Static typing is a constraint on your program’s structure. How limiting or liberating these constraints are is up for debate, but some people will argue it’s a big deal.</li>
<li>Some tasks, especially around generic programming, can be very easily expressed in a dynamic language, but require more machinery in a static language. For instance, a generic serialization library can be written in a dynamic language, without anything fancy, but providing the same thing in a static language requires more machinery (see for instance <a href="https://wiki.haskell.org/Generics">Haskell’s generics support</a>), and is sometimes more complicated to use.</li>
<li>It can be difficult to assign static types to some programs, and learning how best to carve a program up into types is a skill that takes years to master. This is sort of a subtle point: when programming in a static language, you always have a choice about what information you encode in the types, and how you encode things. For instance, you can create and use a nonempty list type and have the compiler check it statically or you can let nonemptiness be a dynamic property that you have to check for at runtime. You can have lazy, potentially infinite strings be a distinct type from strict, finite strings, or you can lump both concepts into the same type. And so on. In a static language, you do always have the option of building a less typeful API, where less is enforced by the types, but it is often tempting to spend more time encoding things statically (and then proving things to the typechecker) than would be saved by avoidance of potential future bugs. With experience, you develop a good sense for what is worth tracking statically and what to keep dynamic, but newcomers to static languages can make bad tradeoffs here, which in turn contributes to needless complexity in the language’s library ecosystem. Haskell has this problem IMO, even though on the whole, I find Haskell has some very high quality libraries.</li>
<li>To put this last point another way, with static types, you have to make <em>more choices</em>, and you may have to make these choices at a point in time where you’re unsure how to decide, or when the choice feels distracting.</li>
</ul>
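<p>The nonempty-list example from the list above is cheap to demonstrate: <code class="highlighter-rouge">Data.List.NonEmpty</code> ships with GHC’s base library, and it trades a runtime emptiness check for a static guarantee:</p>

<pre><code class="language-Haskell">import Data.List.NonEmpty (NonEmpty(..))
import qualified Data.List.NonEmpty as NE

-- Total by construction: the type rules out the empty case, so no
-- runtime check (and no possible head-of-empty exception) is needed.
first :: NonEmpty a -&gt; a
first = NE.head

main :: IO ()
main = print (first (1 :| [2, 3]))  -- prints 1
</code></pre>

<p>Whether that guarantee is worth carrying a second list type through your API is exactly the kind of tradeoff the bullet describes.</p>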
<p>I think there is room for reasonable people to differ about the <em>degree</em> to which these advantages and drawbacks matter, and it probably can depend on the person. It does seem, though, that once you learn typeful programming in a language with a very good type system, it’s very unlikely you’ll go back to dynamic languages. Why? With more expertise, the drawbacks are minimized, and you better leverage the advantages. On the other hand, some people never get over the hump. Are there things we can do about that? More on that in a minute.</p>
<h3 id="a-personal-story">A personal story</h3>
<p>I learned programming with C++, in college. Later I learned Java. I didn’t have much of an opinion of static vs dynamic typing until I was introduced to Python, which felt like the breath of fresh air I didn’t know I needed. Until you are introduced to an alternative, it can be difficult to see the things that are problematic with the status quo. After learning Python, all the <em>syntax</em> devoted to types in C++ and Java seemed superfluous. DHH had some quip about lines like <code class="highlighter-rouge">Employee employee = new Employee("Bob", "Smith");</code> in Java that made me go “yeah, totally!!!” I was done with static types for a while.</p>
<p>Fast forward a couple years. I’d learned some other languages: Lua, Mathematica, a bit of Scheme, some Prolog (all dynamically typed, notice!). If you’d asked me back then, I’d have said static languages were crap. Java was a punchline. There was a bit of exposure to functional programming via those languages, which led to me hearing about Haskell.</p>
<p>Haskell, and more generally, typeful languages with good type inference really changed the cost/benefit ratio of static typing.
Most of the dynamic language programs I’d written could be given Haskell types, with zero type annotations if I wanted! This largely eliminates any syntactic advantage to dynamic typing.
Of course, over time, you learn that it’s often <em>helpful</em> to declare your types to the compiler, so it can better tell you how you’ve screwed up.</p>
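<p>A small (made-up) example of what “zero type annotations” means in practice: the definition below carries no signature, yet GHC infers the fully general type and rejects misuse just as if the types had been written out:</p>

<pre><code class="language-Haskell">-- No annotation anywhere on this definition; GHC infers the fully
-- general type: average :: Fractional a =&gt; [a] -&gt; a
average xs = sum xs / fromIntegral (length xs)

main :: IO ()
main = print (average [1, 2, 3, 4])  -- prints 2.5
</code></pre>

<p>Applying <code class="highlighter-rouge">average</code> to a list of strings is still a compile-time error, annotation or not.</p>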
<p>These days, while I do still think there is room for improvement in the ‘typeful programming’ user experience (which is part of why I’m working on <a href="http://unisonweb.org">Unison</a>), at this point, I’d never go back to dynamic typing. I’m much more excited to improve the user experience for programming in a statically-typed language, so we get all the great benefits, and less of the drawbacks!</p>
<p>One final observation: when I first started doing statically-typed FP, in Haskell and Scala, it took sometimes great effort to figure out how to express programs in a way that would satisfy the typechecker. Type errors were often bewildering to me. And the act of carving up a program into types took effort. But it felt like the good kind of effort, and I kept at it.</p>
<p>Now, these things are almost like breathing. I know what is easy to express in the types. I know lots of ways of encoding things and which ones are likely to work nicely for different scenarios. I don’t struggle with it. Picking types isn’t separate from the act of design. Instead, the design process is intermingled with picking types, and the types help guide the design.</p>
<p><a href="http://blog.higher-order.com/">Rúnar Bjarnason</a> made a remark about functional programming that applies pretty well to static vs dynamic typing: he compared functional programming (programming without side effects) to driving on roads. Yes, it can be a constraint. You can’t just drive through a river and over a mountain in a straight line from point A to point B. You have to stay on the roads, learn the traffic laws, and so on. But you still get to your destination, and you probably get there faster.</p>
<p><em>If you liked this, also check out:</em></p>
<ul>
<li><a href="/2016-02-25/tech-adoption.html">Design for experts; accommodate beginners</a> talks about tech adoption, and how better tech can adapt and thrive in the real world</li>
<li><a href="http://pchiusano.github.io/2015-04-23/unison-update7.html#usability">What is usability?</a> discusses usability vs expressiveness</li>
<li><a href="/2014-10-23/learning.html">Good teaching makes long-term investments in the learner</a> talks about learning and teaching as an investment in future productivity</li>
<li><a href="http://unisonweb.org/about">The Unison project</a>: a statically-typed language with a richly-interactive semantic editor</li>
</ul>
Thu, 15 Sep 2016 00:00:00 +0000
http://pchiusano.github.io/2016-09-15/static-vs-dynamic.html

Needlessly confusing names

<p>Names and the little details of an API matter. I’m going to pick on one small example I came across recently. In <a href="https://hackage.haskell.org/package/unagi-chan-0.4.0.0/docs/Control-Concurrent-Chan-Unagi.html">unagi-chan</a>, a very fast lock-free queue implementation, we have the following API:</p>
<pre><code class="language-Haskell">newChan :: IO (InChan a, OutChan a)
</code></pre>
<p>Here’s me:</p>
<p><em>Huh, okay, they split the queue into separate read and write access. Geez, was that really necessary? If I wanted to partially apply <code class="highlighter-rouge">readChan</code> or <code class="highlighter-rouge">tryReadChan</code> and pass around an <code class="highlighter-rouge">IO a</code> or <code class="highlighter-rouge">IO (Maybe a)</code>, I could do that, you know. Now if I need both read and write access, you’ve forced me to pass around a pair… ugly. But okay, okay, fine, I can deal with this, this thing’s supposed to be screaming fast, I can put up with a few admittedly subjective API warts, and honestly I’m being pretty ungrateful to the people who wrote this nice code and made it freely available.</em></p>
<p>Two minutes later: a type error when trying to read from an <code class="highlighter-rouge">InChan</code>. <em>Wait, what? Let me re-check hackage docs here… WAT? <code class="highlighter-rouge">readChan</code> takes an <code class="highlighter-rouge">OutChan</code>??</em></p>
<pre><code class="language-Haskell">readChan :: OutChan a -&gt; IO a
</code></pre>
<p><em>WTF!!?! What kind of idiot would—oh, wait, maybe that does make sense. I’m probably just being dense here. Calm down, Paul. Let’s think about this, do you read from the <code class="highlighter-rouge">InChan</code> or the <code class="highlighter-rouge">OutChan</code>? Obviously I should read from the <code class="highlighter-rouge">InChan</code>… but AHA! From the perspective of the producer, the <code class="highlighter-rouge">InChan</code> IS the <code class="highlighter-rouge">OutChan</code>. But from the perspective of the consumer, it’s the reverse! This is so fucking confusing! What the hell was I even doing? I’ve forgotten by now…</em></p>
<p><em>(After a moment of realization). You know, it’s completely arbitrary whether the <code class="highlighter-rouge">InChan</code> is from the perspective of a queue writer or a queue reader, so these names are just confusing as hell. Calling these two types <code class="highlighter-rouge">WootChan</code> and <code class="highlighter-rouge">Woot2Chan</code> would have been MORE INFORMATIVE, since it would not have tempted me to start writing code based on my assumed understanding of the meaning of <code class="highlighter-rouge">In</code> and <code class="highlighter-rouge">Out</code> without checking the type signatures!</em></p>
<p><em>Not to mention, while I’m trashing this code, who decided to call those functions <code class="highlighter-rouge">readChan</code> and <code class="highlighter-rouge">writeChan</code>? Where exactly are we writing to / reading from? Is this a stack or a queue? How about <code class="highlighter-rouge">enqueue</code> and <code class="highlighter-rouge">dequeue</code>, which is more appropriately suggestive of what’s happening?</em></p>
<p>Five minutes later, I’d settled on this: if you really insist on splitting up the queue into two types, how about these names:</p>
<pre><code class="language-Haskell">newQueue :: IO (Enqueue a, Dequeue a)
enqueue :: Enqueue a -&gt; a -&gt; IO ()
dequeue :: Dequeue a -&gt; IO a
</code></pre>
<p>Now THIS is the sort of API I like. I can use it without switching on random parts of my brain that have nothing to do with what I’m trying to accomplish.</p>
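<p>For illustration, here is one way the renamed API might be sketched as newtype wrappers over the ordinary <code class="highlighter-rouge">Chan</code> from base (this is my own illustrative wrapper, not unagi-chan’s implementation; unagi-chan’s <code class="highlighter-rouge">InChan</code>/<code class="highlighter-rouge">OutChan</code> pair could hide behind the same two types):</p>

<pre><code class="language-Haskell">import Control.Monad (replicateM)
import qualified Control.Concurrent.Chan as C

-- Two views of the same underlying channel, named by capability.
newtype Enqueue a = Enqueue (C.Chan a)
newtype Dequeue a = Dequeue (C.Chan a)

newQueue :: IO (Enqueue a, Dequeue a)
newQueue = do
  ch &lt;- C.newChan
  pure (Enqueue ch, Dequeue ch)

enqueue :: Enqueue a -&gt; a -&gt; IO ()
enqueue (Enqueue ch) = C.writeChan ch

dequeue :: Dequeue a -&gt; IO a
dequeue (Dequeue ch) = C.readChan ch

main :: IO ()
main = do
  (inQ, outQ) &lt;- newQueue
  mapM_ (enqueue inQ) [1 :: Int, 2, 3]
  xs &lt;- replicateM 3 (dequeue outQ)
  print xs  -- [1,2,3]
</code></pre>

<p>The names now answer their own question: you can only <code class="highlighter-rouge">enqueue</code> to an <code class="highlighter-rouge">Enqueue</code>, no perspective-taking required.</p>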
<p>Summary:</p>
<ul>
<li><em>Don’t make me think (about the wrong stuff).</em></li>
<li>Don’t make things needlessly complicated or confusing.</li>
<li>Don’t tolerate the complexity apologists who claim these mental speed bumps are “no big deal”, or who say things like:
<ul>
<li>“just RTFM”</li>
<li>“fool, I just read the type signatures, ignore all names, and am enlightened”</li>
<li>“we’re all professionals here; stop whining and get on with writing MOAR codez!!1”</li>
</ul>
</li>
</ul>
<p>Also see this section of <a href="http://pchiusano.github.io/2016-02-25/tech-adoption.html#alternatives">design for experts; accommodate beginners</a>.</p>
Mon, 18 Jul 2016 00:00:00 +0000
http://pchiusano.github.io/2016-07-18/needlessly-complicated.html

Upcoming release of FS2 (formerly scalaz-stream)

<p><a href="http://fs2.co">FS2: Functional Streams for Scala</a> is finally nearing its 0.9 release, and the first 0.9 milestone release came out this week!</p>
<p>This release was a near total redesign and reimplementation, drawing on all the good ideas (and learning from past mistakes) from years of work on streaming libraries. Some highlights:</p>
<ul>
<li>Zero dependencies</li>
<li>Hugely increased expressiveness when pulling from one or more streams, including better support for concurrency and asynchronous processing</li>
<li>Much better support for chunking, support for working with unboxed chunks of primitives and preserving chunkiness throughout a multistage pipeline</li>
<li>A simpler core set of primitives, in terms of which all library functionality is defined as regular ‘userspace’ code</li>
<li>Careful attention to preserving resource safety, even in code that allocates resources asynchronously. This is extremely tricky to support, but FS2 takes care of all the heavy lifting for you!</li>
<li>Lots more</li>
</ul>
<p><a href="http://fs2.co">The README</a> has more info, and be sure to check out <a href="https://github.com/functional-streams-for-scala/fs2/blob/series/0.9/docs/guide.md">the new user guide</a>. I expect the library API to be quite stable going forward and hope it will become one of the bedrock libraries of the Scala ecosystem.</p>
<p>Lastly, a quick shoutout: In addition to the other FS2 contributors, I’ve been working closely with <a href="http://mpilquist.github.io/">Mike Pilquist</a> on this. Mike’s company, CCAD, has very generously sponsored a lot of my work on it and is using the library heavily. Mike and the rest of the team there have been great to work with, so if you like this sort of thing and are looking for a job, <a href="mailto:mpilquist@ccadllc.us">get in touch with Mike</a>.</p>
Tue, 17 May 2016 00:00:00 +0000
http://pchiusano.github.io/2016-05-17/fs2-release.html

Design for experts; accommodate beginners

<p>Technologists like to think about tools from the perspective of technical merits (“Lisp is better than C—it can do XYZ and C can’t!”). Lots of arguments take place at this level. Here, I want to consider another perspective. Let’s think of new technology like we would a new species entering an ecosystem. From this perspective, what matters is whether the species has attributes that allow it to survive and propagate itself. For new “technology species”, surviving means attracting and retaining development resources (people, money, time, etc), and propagating means increasing adoption.</p>
<p>It’s a harsh, cruel world. A world where technological species emerge and die off, seemingly at random—technical features are relevant only insofar as they help with survival. A world where <a href="/2014-10-13/worseisworse.html">worse is worse</a> but “worse” (in terms of technical capabilities) can often survive better than better. A world where <a href="http://www.quotationspage.com/quote/38206.html">“the market can remain irrational far longer than you can remain solvent”</a>.</p>
<p>So for now let us put aside whatever personal feelings we might have about what technology is better (in our imaginary world where everyone adopted the tech we think is best) and consider this real world. It’s ugly and brutal. But it’s reality. How can new technology (and we <em>would</em> really prefer it be good technology) adapt and thrive in this reality?</p>
<p>This post looks at the question from just one angle. Is it better for survival that a technology be designed for beginners or experts? Note that I am <em>not</em> asking “do you personally prefer tools designed for experts or for beginners?” I am asking what is better for <em>survival and propagation</em>.</p>
<p>The situation is pretty nuanced. Let’s put aside any talk of beginners or experts for a moment and think in terms of <em>learning curves</em>. Every technology has a learning curve. The x-axis is “time spent learning”. Let’s call the y-axis “productivity”, basically tracking how much you can accomplish with the tool. Disclaimer: I won’t claim any of these are new ideas; this is more an exercise in clarifying how we think about things.</p>
<p>A tool you already know how to use has a learning curve which is a flat line. You already know everything about the tool, and your productivity with the tool doesn’t go up with time: (DANGER: bad ascii art graphs to follow!!)</p>
<div class="highlighter-rouge"><pre class="highlight"><code> |
|
productivity |_____________ 20p
|
|
-------------
time
</code></pre>
</div>
<p>The <code class="highlighter-rouge">20p</code> is short for “20 productivity units”. Yes, that sounds silly. Stay with me. A tool which is <em>limited in its capabilities</em> might have a very short period of learning, then a flat curve:</p>
<div class="highlighter-rouge"><pre class="highlight"><code> limited-tech
|
|
productivity |
| ____________ 10p
|/
-------------
time
</code></pre>
</div>
<p>When the line slopes upward, you are learning.</p>
<p>A tool which is very powerful may have a very long, steep learning curve. People use the phrase “steep learning curve” a lot as if it were pejorative, but steepness is not a bad thing at all. Steepness in the curve could mean you are learning a lot very quickly:</p>
<div class="highlighter-rouge"><pre class="highlight"><code> powerful-tech
| _ 100p
| /
| /
| /
| /
| /
productivity | /
| /
|/
-------------
time
</code></pre>
</div>
<p>Notice that productivity of <code class="highlighter-rouge">powerful-tech</code> matches <code class="highlighter-rouge">limited-tech</code> at each point in time, then blows past it. The “final” productivity is 10x more. Yay! Since there is never a point in time where one is more productive with <code class="highlighter-rouge">limited-tech</code>, all else being equal, from the perspective of adoptability, <code class="highlighter-rouge">powerful-tech</code> completely dominates <code class="highlighter-rouge">limited-tech</code>.</p>
<p>What’s bad for adoption is not steepness, it’s <em>nonlinearity in the learning curve</em>. Here’s a hypothetical learning curve for some powerful tool:</p>
<div class="highlighter-rouge"><pre class="highlight"><code> difficult-powerful-tech
| _ 1000p
| /
| /
| /
| ... /
| /
|_ _/
| \ /
| \____________/
----------------------------------
time
</code></pre>
</div>
<p>Obviously, these numbers and even the shape of the curve are completely made up. Again, bear with me. With a curve like this, your productivity might go <em>down</em> compared to the usual way you might do things. You then do lots of learning for a while, but the learning doesn’t manifest as increased productivity. You’re laying the foundations of a huge skyscraper and the work is mostly invisible. Eventually, you reach a point where your foundation is complete, and you can actually start building. Your productivity rises rapidly and you also have the mental tools needed to absorb new concepts very easily, so the slope of your learning increases as well. You zip past the productivity you had before you started learning the tech and eventually reach a point where you’re 10 or 100x more productive than you were previously. Awesome! Unfortunately, it’s taken a long time to get there. And in “microbenchmarks” or toy problems of the sort that are easy to discuss, the less productive tool seems to win out, break even, or be only marginally worse, so you have a hard time concocting simple, compelling examples to convince anyone else that this <code class="highlighter-rouge">difficult-powerful-tech</code> is really worth learning. People on the outside start to think of you as some sort of irrational zealot, weirdly attached to your pet technology. Sounding familiar?</p>
<p>Lots of technologies have learning curves like this and they often don’t amount to more than a niche technology unless the final productivity multiplier is really huge. There are lots of factors in play here, and how they interact is interesting.</p>
<p>First, people like getting <em>feedback</em> when they are learning. With a highly nonlinear learning curve, feedback is much more indirect. Someone who is willing to deal with a highly nonlinear learning curve copes by A) believing strongly in the end results they’ll eventually be able to achieve and B) enjoying learning for its own sake, even when it doesn’t immediately lead to enhanced productivity. Let’s face it—such people are a minority. Perhaps this is a sad failure of our education system, but it’s also the current reality.</p>
<p>Next, if the final multiplier is huge, then we might be tempted to conclude that regardless of whether a technology is difficult to learn, people and businesses who wish to remain competitive will learn it. But, maybe not. Why is that? Well, a business can often achieve the same net productivity by employing <em>more people using less powerful tools</em>. Each individual is less productive, but throw enough people at the problem and stuff gets accomplished! Alan Kay quipped that <a href="https://en.wikiquote.org/wiki/Alan_Kay#A_Conversation_with_Alan_Kay.2C_2004-05">“most software today is very much like an Egyptian pyramid with millions of bricks piled on top of each other, with no structural integrity, but just done by brute force and thousands of slaves.”</a></p>
<p>Look at companies like Google and Facebook. They are building software systems largely using tools that were or could have been written 30 years ago (PHP, C++, Java, Python, Go). Are they just acting irrationally? Why don’t they get with the program and use some modern tech?? But it’s more complicated than that. Even if we ignore the massive switching costs they’d face in migrating to some alternatives, it isn’t even clear that these companies should just use the tech that is “the most powerful” (after acquiring deep expertise):</p>
<ul>
<li>On the one hand, using less powerful technology means they have to hire more programmers to accomplish the same tasks. There’s additional communication overhead to having more programmers, and more technical debt (technical debt is much easier to create with more limited technology, has a higher “interest rate”, and is harder to pay down). Score one point in favor of more powerful tools.</li>
<li>But in using less powerful tools with a “better” learning curve (in the sense of at least having higher <em>short-term</em> productivity and less nonlinearity, as discussed above), they also have lower training costs, a much larger (and basically fungible) pool of qualified workers, and can probably get away with paying less in salary than they might have to otherwise. This is mostly a function of that nonlinear learning curve.</li>
<li>They also might have more concurrency in development. Perhaps there is more duplication and technical debt by hiring more programmers to accomplish the same thing, but how does overall productivity compare?</li>
<li>That is, the objective of companies like Google and Facebook is not to maximize what individual programmers can accomplish, it’s closer to: maximize overall productivity of the organization in comparison to the competition, while keeping costs low enough to pay expenses and/or raise money.</li>
</ul>
<p>To make this a bit more concrete, consider two businesses, Business A and Business B, both competing to build some massive software system for a new niche. Business A decides to use Tech A with a highly nonlinear learning curve which we’ll assume to be more productive given sufficient expertise but <em>less</em> productive at first, while Business B decides to use Java. Due to the nonlinear learning curve, there’s a much smaller pool of existing experts whose productivity with Tech A exceeds average productivity with Java. So Business A either needs to allocate a lot more capital to attracting such candidates, or they need to invest a lot of time in training programmers whose productivity is initially less than Java. So Business A has more up-front capital requirements and/or less velocity in the short term as they invest in training non-experts.</p>
<p>Here are some scenarios:</p>
<ul>
<li>If the project can be done at maximal velocity by 5 people using the best tools, the winning strategy is Business A. Even if the pool of experts is small, as long as Business A has sufficient capital to get just 5 of these experts to join, they start with higher productivity and continue to have higher productivity. In fact, this is probably among the best uses of capital for the business.</li>
<li>If the project can’t be done at maximal velocity given the number of people Business A can feasibly hire, then Business A is trading off short-term velocity (they have to train) in exchange for future productivity. Whether this is a winning strategy depends on whether the market being targeted has network effects and switching costs. If it has network effects and switching costs, being the initial leader is an advantage. Even if Business A later catches up to Business B in terms of functionality, if everyone is locked into Business B (whether explicitly or just via network effects), Business A can still lose!</li>
</ul>
<p>The bottom line is that even a business which is being purely rational about tech decisions has a lot of difficult-to-estimate factors to consider. How much communication overhead will there be with more programmers and more limited tech? How much worse will the technical debt be? How high are training costs? Rather than a clear victory strategy, we have difficult tradeoffs. Are companies like Google and Facebook making the right choice? Does anyone really know?</p>
<p>This leads me to a design principle which might be summarized:</p>
<blockquote>
<p>Design for experts; accommodate beginners</p>
</blockquote>
<p>That is, design powerful tools, but make the learning curve as linear as possible, and try to match or exceed the productivity of less powerful alternatives as early as possible in the learning timeline. In other words, it shouldn’t take 6 months to match the productivity of more limited alternatives. The goal here is to eliminate a situation in which we are tempted to settle for more limited technology as a hack to (possibly!) improve adoption. For example:</p>
<div class="highlighter-rouge"><pre class="highlight"><code> easy-powerful-tech-1
| /
| /
| /
| /
| /
| /
| / &lt;- 10p
| /
|/
----------------------
time
easy-limited-tech
|
|
|
|
|
|
 | ______ 10p
|_/
|
---------
time
</code></pre>
</div>
<p>Here, <code class="highlighter-rouge">easy-powerful-tech-1</code> starts out with lower productivity than <code class="highlighter-rouge">easy-limited-tech</code>, but quickly exceeds it through linear productivity growth.</p>
<p>Another strategy that can work is to start out with <em>higher productivity</em> (often in the form of powerful features that can be used without deep understanding), and hit any nonlinear portion of the learning curve <em>after</em> already eclipsing more limited tech:</p>
<div class="highlighter-rouge"><pre class="highlight"><code> easy-powerful-tech-2
|
| /
| /
| /
| ____20p_______/
| /
|__/ 10p
|
|
----------------------
time
</code></pre>
</div>
<p>Notice that we have nonlinearity, but it’s past the point where we’re more productive than <code class="highlighter-rouge">easy-limited-tech</code>, so we’re still better suited for adoption!</p>
<h2 id="alternatives">Alternatives</h2>
<p>With this in mind, I’m going to give names to some alternative adoption strategies one sees in the wild:</p>
<p><em>Design for beginners (alienate experts):</em> Build technology that’s as approachable as possible for beginners, at the cost of <em>alienating experts</em>. While you win on adoption in the short term, in the long term, your beginners become experts and start to grow frustrated with your tool. Mindshare of experts starts migrating elsewhere, which is not good for competitiveness of your technology. At this point, it’s only switching costs and network effects, and the gradual influx of new learners that keep the technology around. Is this enough to ensure survival? Maybe, maybe not. As an example, consider <em>spreadsheets</em>, <a href="/2015-03-17/unison-update5.html#why-spreadsheets">a limited technology that is easy to learn</a>. Spreadsheets are a programming environment with poor capacity for abstraction or really any of the other tools programmers use to manage complexity (the ability to define new types, for instance). In the finance industry, spreadsheets get used pervasively for lots of interactive programming tasks and there are <em>expert beginners</em> who create extremely complex spreadsheets. In one sense, the results are impressive. And yet… it’s a bit like building a skyscraper out of toothpicks and marshmallows. Can it be done? Yes, perhaps. Is it impressive? Yes, in a way. But investing in learning and using better technology (like, say, steel) would have paid for itself many times over.</p>
<p>We’re now in a state where lots of people have recognized the problems with spreadsheets and there’s a cottage industry of companies competing to replace or augment spreadsheets with more powerful alternatives. It’s a difficult market though, because of just how deeply embedded spreadsheets are in organizational workflows, and how high the switching costs and network effects now are.</p>
<p>Next up is <em>design for experts (alienate beginners):</em> Build technology that’s extremely powerful, giving no thought to how newcomers to the technology might become interested in it and come up to speed. On the one hand, the experts who bother to learn it are quite productive. On the other hand, the technology lacks attributes that facilitate anything more than a niche following.</p>
<p>And there’s something else, a bit more subtle. This strategy tends to attract the wrong sort of people. People who are tacitly okay with technology being <em>needlessly difficult</em> for newcomers can tend to give off an unwelcoming vibe. Perhaps they view it as a sort of badge of honor or a proof of how smart they are that they’re able to deal with this difficulty and now get to use a technology that’s more productive. Perhaps they are even actively hostile to beginners. They make it personal and either directly or indirectly suggest people are stupid for not using their “expert-level” technology. It’s often an RTFM, “toughen up”, hazing culture, not a helpful one. Maybe the community leaders are unpleasant or rude, and no one seems to have much of a problem with it. The community starts to take a kind of pride in its niche status and acts more like a secret club. Members start to <em>like</em> being small and different from the mainstream. Any of this sounding familiar?</p>
<p>Ironically, these factors can actually <em>get in the way of building better technology</em>. When part of the appeal of using some technology is getting to feel like you’re part of your own little tribe, there’s often a tacit (or explicit) rejection or disinterest among community members in making technical changes that could eliminate needless difficulty and make the technology more accessible to “outsiders”. I’m not going to name names but you can probably think of lots of examples of this phenomenon…</p>
<h2 id="conclusion">Conclusion</h2>
<p>With new tools, adoption matters. And what drives adoption and flourishing isn’t always technical quality. If we want high-quality, powerful tech to flourish in the world, we can start by <em>hacking the learning curve</em>. Design for experts, but accommodate beginners by eliminating needless difficulty and incidental complexity. Provide <em>easy wins</em> that people can see and use without requiring deep understanding, but also provide hooks in the right places to guide beginners toward further learning.</p>
<p>This isn’t compromising on power or principles. And many experts can appreciate and get on board with these changes too. We <em>can</em> have it both ways. We just need to know how the world really works, and adapt our survival strategy accordingly.</p>
<p>Also see:</p>
<ul>
<li><a href="/2015-04-23/unison-update7.html#usability">What is usability?</a> dives into the concept of usability and usability comparisons.</li>
<li>Want specifics? The <a href="http://unisonweb.org">Unison project</a> is a new programming platform I’m working on that adopts this philosophy. Powerful programming models can be made much more accessible with a richer, more structured editor, and some radical assumptions that lets us solve <a href="http://unisonweb.org/2015-06-02/distributed-evaluation.html">big problems once, at the level of the whole platform</a>.</li>
<li><a href="/2014-10-23/learning.html">Good teaching makes long-term investments in the learner</a>, talks about the importance of investing in future productivity, not just the immediate task facing a learner.</li>
<li><a href="http://pchiusano.github.io/2015-03-17/unison-update5.html#why-spreadsheets">Why spreadsheets</a> explains why spreadsheets are so popular, despite being much-maligned by programmers.</li>
</ul>
Thu, 25 Feb 2016 00:00:00 +0000
http://pchiusano.github.io/2016-02-25/tech-adoption.html
http://pchiusano.github.io/2016-02-25/tech-adoption.html
When is it NOT preferable to specify your types first?
<p>Very often when functional programming, we specify the types first. Then once that’s done, we implement the term. Writing some tricky code? First, write out the type! By announcing (some of) our intent to the compiler, we get an “accountability partner” that will verify we’ve remained true to our declared intent. That’s one part of it. But as Conor McBride likes to emphasize, types aren’t just about checking and policing errors:</p>
<ul>
<li>Haskell’s typeclasses are essentially a limited form of program inference, directed by type information. They automatically derive rather uninteresting plumbing code that would otherwise require manual specification by the programmer.</li>
<li>In better programming environments (like what many dependently-typed languages offer), type information we specify gets “pushed down” into our expression, and the editor uses it to <em>do more of our work for us</em> or <em>offer more helpful suggestions and information</em>.</li>
</ul>
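<p>As a minimal illustration of the first point (this example is mine, not the author’s): given <code class="highlighter-rouge">Monoid</code> instances for the components, GHC assembles the <code class="highlighter-rouge">Monoid</code> instance for a pair automatically, deriving the plumbing from the type rather than from anything we write by hand:</p>
<pre><code class="language-Haskell">-- Typeclass resolution as type-directed program inference.
-- The (Monoid a, Monoid b) =&gt; Monoid (a, b) instance is selected
-- and assembled by the compiler from the component instances.
combineAll :: [(String, [Int])] -&gt; (String, [Int])
combineAll = mconcat

-- combineAll [("ab", [1]), ("cd", [2, 3])] == ("abcd", [1, 2, 3])
</code></pre>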
<p>I’m going to call this mode of programming, writing types first and then specifying terms, “top-down”. I like this modality, but here I want to discuss situations where it is preferable to specify terms “bottom-up”—by writing the term <em>first</em>, and letting <em>type inference</em> work out the type.</p>
<p><em>Sometimes the type is more information than the term it characterizes!</em></p>
<p>Specifying the type first, then specifying the “remaining information” left unspecified by the type, is a kind of compression scheme. Like any compression scheme, there will be inputs for which the “compressed representation” requires more bits than some more naive, direct encoding. Actually, the right way to think of this is that <em>all</em> ways of specifying a program are a compression scheme / encoding (including writing out the text of the term we want in a text editor), and there will never be a single way of specifying programs that is optimal for every case!</p>
<p>Thus, forcing the user to specify all type information first is sometimes less efficient than just letting them write the term they want, which they can check has a reasonable or expected inferred type.</p>
<p>Let’s see an amusing real-world example:</p>
<pre><code class="language-Haskell">wrapDomEvent' :: forall e event a (m :: * -&gt; *) t (h :: * -&gt; *).
(Reflex t, MonadSample t m, HasPostGui t h m,
Reflex.Host.Class.MonadReflexCreateTrigger t m, MonadIO m) =&gt;
(e -&gt; EventM.EventM event e () -&gt; IO (IO ()))
-&gt; EventM.EventM event e a -&gt; e -&gt; m (Event t a)
wrapDomEvent' onEvent getResult e = wrapDomEvent e onEvent getResult
</code></pre>
<p>That type is pretty horrifying, but it makes it obvious that having to specify that type to the compiler <em>first</em> is inefficient. Look how short the term is compared to the type! This actually happens quite a lot. You’re writing a function, and the implementation of the function is already known to you, perhaps because it’s very simple or because you arrived at it via other reasoning principles (see below). You now want to specify this term to the computer, using the minimum amount of information. Writing in a bottom-up style might well be minimal, as it is here.</p>
<p>Another interesting aspect of this example is you can imagine the thought process of the author: “I just want the <code class="highlighter-rouge">wrapDomEvent</code> function with the arguments in a different order”. Programmers use this style of reasoning quite a lot, even in typeful languages. Refactorings like switching the order of arguments, pulling a subexpression out into a let binding, abstracting a subterm into a function parameter, beta or eta-reducing an expression, and many other program transformations can be conceived of without really considering types at all.</p>
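<p>As a tiny made-up illustration of this style of reasoning: reordering arguments or eta-reducing are purely mechanical operations on the term, and we can let inference recover the types afterwards:</p>
<pre><code class="language-Haskell">-- Hypothetical example: no type signatures written; all types inferred.
applyTwice f x = f (f x)          -- inferred: (a -&gt; a) -&gt; a -&gt; a

-- “the same function, with the arguments in a different order”
applyTwice' x f = applyTwice f x  -- inferred: a -&gt; (a -&gt; a) -&gt; a
</code></pre>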
<p>Let’s dive into this a bit further. Suppose you’ve just written a function, specialized to some concrete types:</p>
<div class="highlighter-rouge"><pre class="highlight"><code>wrangle :: Foo f =&gt; f [Employee] -&gt; Bar x
wrangle xs = ...
</code></pre>
</div>
<p>Later, you decide to abstract over one of the concrete functions being called in <code class="highlighter-rouge">wrangle</code>, and your implementation becomes:</p>
<div class="highlighter-rouge"><pre class="highlight"><code>wrangle raiseSalary xs = ...
</code></pre>
</div>
<p>I’ve actually omitted the type here, because you can imagine performing this refactoring without thinking primarily about how it affects the type of <code class="highlighter-rouge">wrangle</code>. And in fact, it might affect the type of <code class="highlighter-rouge">wrangle</code> in complex ways—perhaps <code class="highlighter-rouge">Foo</code> is no longer the constraint, and the return type is something other than <code class="highlighter-rouge">Bar</code>. But it’s sometimes easy to conceive of the refactoring (which can be done mechanically) without having to anticipate in advance (or specify to the compiler) exactly how the type of <code class="highlighter-rouge">wrangle</code> will be affected.</p>
<p>More generally, when you <em>abstract over parts of your implementation</em>, you aren’t necessarily reasoning primarily in terms of how this will affect types. You may be primarily reasoning in terms of <em>where you want information to be specified</em>. Abstracting over a concrete value being used in a term is a way of altering where that information is specified, and the programmer conceives of it primarily as such. Yes, moving around where information is specified affects the types of functions along the chain of dependencies, but this is actually <em>less interesting to the programmer</em> and not the focus of their attention.</p>
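<p>A hypothetical before/after in the spirit of the <code class="highlighter-rouge">wrangle</code> example (the names and bodies here are invented for illustration):</p>
<pre><code class="language-Haskell">-- Before: a concrete function is baked into the implementation.
raiseSalary :: Double -&gt; Double
raiseSalary = (* 1.05)

totalPayroll :: [Double] -&gt; Double
totalPayroll salaries = sum (map raiseSalary salaries)

-- After: the concrete function is abstracted into a parameter.
-- The edit is mechanical; the new, more general type
-- (inferred: Num b =&gt; (a -&gt; b) -&gt; [a] -&gt; b) falls out afterwards.
totalPayroll' f salaries = sum (map f salaries)
</code></pre>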
<p>Neither top-down nor bottom-up is superior. I see the ideal as a kind of conversation with the language editor. The human programmer offers some information, whatever information our limited brains can muster up to narrow the scope of possibilities. Perhaps it’s a type or type fragment. Perhaps it’s a term or term fragment. Perhaps it’s a more general query: “I know the program I want uses the function <code class="highlighter-rouge">foo</code> in some way, and <code class="highlighter-rouge">[Text]</code> appears in the type <em>somewhere</em>.”</p>
<p>The editor then kindly responds by doing <em>as much as possible with this information</em>, prompting the user to specify more where needed. “Here’s some existing functions that use <code class="highlighter-rouge">foo</code> somewhere and mention <code class="highlighter-rouge">[Text]</code>, is it one of these? There are hundreds more. Maybe you could tell me some further information…”</p>
Wed, 28 Oct 2015 00:00:00 +0000
http://pchiusano.github.io/2015-10-28/top-down.html
http://pchiusano.github.io/2015-10-28/top-down.html
Unison now open source and has its own site
<p>I finally wrapped up most of the housecleaning I wanted to do before releasing the code. It’s <a href="https://github.com/unisonweb/platform">now public on GitHub</a> and there’s also a dedicated <a href="http://unisonweb.org/">project site and blog at unisonweb.org</a> and a <a href="https://twitter.com/unisonweb">Twitter account</a>. Going forward, I’ll be posting about Unison from <a href="http://unisonweb.org/">unisonweb.org</a>, and any contributors to the project will also be able to use that space for Unison-related posts.</p>
<p>There’s currently a post up with a <a href="http://unisonweb.org/2015-05-07/update.html#post-start">status update and tentative roadmap</a>.</p>
<p>I’m definitely interested in <a href="http://unisonweb.org/2015-05-07/update.html#funding">finding funding to continue work on Unison</a>, but am not really sure what is realistic or possible. I’d love to hear in <a href="http://unisonweb.org/2015-05-07/update.html#disqus_thread">the comments</a> from folks who might have ideas or thoughts about this sort of thing.</p>
Fri, 08 May 2015 00:00:00 +0000
http://pchiusano.github.io/2015-05-08/unison-update8.html
http://pchiusano.github.io/2015-05-08/unison-update8.html
Unison update 7&#58; structured refactoring sessions
<p>As I mentioned in <a href="/2015-01-30/unison-update6.html">update 6</a>, I’ve been spending the last few weeks doing some much-needed refactoring. Here’s an update on the progress:</p>
<ul>
<li>I’m about 3/4 of the way through converting the backend to use <a href="http://semantic-domain.blogspot.com/2015/03/abstract-binding-trees.html">abstract binding trees</a> (ABTs). This is working out super nicely. The code is cleaner and many of the “interesting” operations are ABT-generic, so I get to reuse lots of code for types, terms, and (later) type declarations. I expect this reuse to carry over when I add support to the editor for type and type declaration editing. ABTs also make it easy to add new binding forms like let bindings and pattern matching.</li>
<li>I rewrote a lot of the Unison editor code, which had gotten too ugly to extend further. The refactored version is pretty clean and easy to follow, however…</li>
</ul>
<h3 id="-elm-troubles"><a id="elm-troubles"></a> Elm troubles</h3>
<p>It’s become more clear that Elm isn’t working out very well as a tech choice for the Unison editor. I’ve started to consider use of Elm a placeholder until I decide on a replacement (or until Elm improves to where it is usable for my purposes). Maybe I’ll do a longer experience report later, but to summarize:</p>
<ul>
<li>Elm language limitations have become a real problem for me. The type system is F#-like: too limited to define <code class="highlighter-rouge">Applicative</code>, <code class="highlighter-rouge">Monad</code>, or any abstraction or data type that has a type parameter whose kind is not <code class="highlighter-rouge">*</code>. Also no higher-rank types or existential types. You can sometimes survive without these things, but eventually you reach a point where you need them. I’ve hit that point.</li>
<li>Elm’s version of FRP is not very expressive. Signals cannot be recursive or mutually recursive; the only way to introduce past-dependence is via a left fold (the <code class="highlighter-rouge">foldp</code> function) or by sending values to a sink and have them <a href="/2014-12-10/wormhole-antipattern.html">magically appear elsewhere in your program</a>. The oddly popular <a href="https://github.com/evancz/elm-architecture-tutorial">Elm architecture</a> is just a pattern for building up your entire app’s interactivity as the arguments to a left fold over the merged input events of your app! Because the signal world is so limited, most of your logic necessarily lives elsewhere. Not necessarily a bad thing, but the result has been that I haven’t gotten much mileage out of Elm’s version of FRP. Instead I’m doing the vast majority of the work with pure code that could easily be written in just about any other functional language.</li>
<li>There are some interesting alternatives to Elm that have arisen recently. I haven’t investigated them in detail, but there’s <a href="https://github.com/slamdata/purescript-halogen">Halogen</a> in PureScript, and for GHCJS there’s <a href="https://www.youtube.com/watch?v=mYvkcskJbc4">Reflex</a> and <a href="http://joelburget.com/react-haskell/">React-Haskell</a>. There are probably others. A GHCJS option interests me since it means I could share code between client and server. I look forward to exploring this further…
<ul>
<li>The only thing that seems missing from the alternatives is something like Elm’s <code class="highlighter-rouge">Element</code> static layout library, which I need for some of the things I’m doing. But it seems like a reasonable path forward might be to just write such a library or port Elm’s to whatever tech I end up using.</li>
</ul>
</li>
<li>At the moment, it doesn’t seem very likely that any of the issues I mention above will be addressed by Elm anytime soon. (Though I hope to be proven wrong!) It isn’t just a question of resources—Evan, the Elm creator, <a href="https://groups.google.com/d/msg/elm-discuss/oyrODCgYmQI/T2I_8L-AL6EJ">has stated</a> he’s very wary of adding “advanced” features that might scare off newcomers or change the culture around Elm. I can sort of understand where this is coming from—it is certainly true that at least in some people’s minds, Haskell has developed a reputation for being esoteric, hard to learn, and perhaps even unwelcoming or impractical. If this perception (right or wrong) stops some people from even walking in the door, then it can be good (for driving adoption) to try to address it. Unfortunately, Elm’s limitations means that in its current form, it’s not really the best match for my needs. I have stuff I need to accomplish, and Elm isn’t quite cutting it!</li>
</ul>
<h3 id="-how-to-define-usability"><a id="usability"> How to define “usability”?</a></h3>
<p>This got me thinking about the concept of usability of various tech tools like programming languages.</p>
<p>Here is a question: is a three-note keyboard <a href="http://lambda-the-ultimate.org/node/2654#comment-39872">more usable than a piano or a cello</a>? On the one hand, there’s less to learn; on the other hand, if a piece of music calls for a piano or cello, a three-note keyboard is not going to be usable at all!</p>
<p>At the same time, let’s consider the cello. Perhaps we could lessen the learning curve of cello by adding frets… but this comes at a cost of limiting vibrato, which is part of what makes the cello (or any bowed instrument) sound so beautiful! All right then, how about at least adding visual <em>markers</em> where the frets would be? That can only help, right? Not necessarily. Markers might lead learners to rely on visual cues, rather than (more rapid, accurate, scalable) use of the ear and muscle memory… but on the other hand, if a cello learner is temporarily aided by use of visual markers, and this helps them to persist in learning until they no longer need them, who can say that’s a bad thing?</p>
<p>As a further subtlety, there’s something of a <em>virtuoso culture</em> around instruments like piano and cello that have been around for a long time. The virtuoso culture prizes musicians not just (or even primarily) for their sensitive or thoughtful expression of music; it also emphasizes pure technical mastery of the instrument. And this same culture values music not just (or even mostly) for its beauty, but also for how much the music facilitates flaunting of technical mastery. If we’re being honest, we must admit that these cultural elements have some impact on who chooses to learn music, and who chooses to stick with this learning.</p>
<p>The point is, these issues are complicated, and there aren’t really easy answers. And that’s part of why debates about these things in the tech world never seem to go anywhere. But I’d like to offer a helpful way of thinking about usability that’s <a href="/2015-03-27/unison-update6.html#technical-debt">analogous to some of the ideas I posted about technical debt</a>:</p>
<blockquote>
<p>… consider the choice between receiving $500 right now or a 60% chance of $2000 a year from now. How about a million dollars now vs a 60% chance of 3 million dollars a year from now? Of course, these choices have different expected values, but also different levels of risk. As in modern portfolio theory, there is no concept of the optimal portfolio, only the optimal portfolio for a given level of risk.</p>
</blockquote>
<p>When it comes to usability, there is no such thing as a tool that is optimally usable; we can only talk about optimal usability with respect to a level of expressiveness. That is, we can only make a given tool more usable by decreasing the amount of work it takes to accomplish the same thing, not by restricting capabilities. If we change the capabilities of the tool and make it more or less expressive, usability comparisons become meaningless. We are comparing apples to oranges. Neither artifact dominates the other, and it comes down to other preferences.</p>
<p>The reason the more expressive tool doesn’t strictly dominate the less expressive one is subtle. Yes, a tool which can do less (is inexpressive) requires less learning, and a tool with more capabilities (more expressive) requires more learning. We sometimes think of this extra learning and work as only being necessary if you happen to be doing something that requires the extra capabilities. But that’s not true. <em>This holds even for simple tasks that can in principle be addressed by either tool.</em> With the more expressive artifact, the user has to do work to figure out what subset of its capabilities should even be used, and how they should be used in concert to achieve the desired effect. Sometimes this amount of work is nontrivial, and it requires experience to do well. Choosing among several possible ways of doing something (some of which may not work out well at all) requires understanding the tradeoffs of these approaches. And this decision-making isn’t a one-time event, it’s a continuous process, occurring at multiple levels of granularity. We might say the user has a greater <em>burden of choice</em>.</p>
<p>With the less expressive artifact, there are fewer options and the decision of how to do something is often made for you, by someone who has some expertise and has tailored the defaults and limitations of their tool accordingly.</p>
<p>My point is that neither option dominates the other; it depends on many factors, including one’s experience and the time horizon of one’s investment in the artifact. Here are just a few examples:</p>
<ul>
<li>Can Haskell be used for client side development? Yes, but Haskell’s status as a more general purpose, more powerful language than Elm means there are more ways of doing this, still being actively explored and developed. A front-end programmer who has only worked with Javascript before might not be in a position to even evaluate these options! The Haskell community is also spread thinner, with lots of people working in different areas. Elm is more narrow in focus, in some ways more limited, but the language works for what it is and also perhaps provides a clearer path for beginners. I can’t argue that’s a bad thing.</li>
<li>On the other hand, a programmer expecting to be solving a certain class of problems for a long time may indeed benefit from using more powerful technology for that class of problems. Learning the extra expressiveness of more powerful technology thus becomes an investment in future productivity. With experience and practice, the cost of deciding on what subset of the tech to use drops to near zero. It becomes effortless to choose an approach, and the tool never gets in the way, regardless of the approach chosen.</li>
<li>Like any investment, there’s some uncertainty about the returns, and due to different levels of risk tolerance and so on people are going to make different choices.</li>
</ul>
<p>Now then. What are the implications of all this? Well, it means that there is tremendous value in finding ways to <em>decrease the burden of choice</em> when using more expressive technology. Here are some ways of doing that:</p>
<ul>
<li>Simply organizing into smaller sub-communities of more narrow focus can have a huge benefit. With more narrow focus, there’s less uncertainty about where to turn for help and hence more positive network effects.</li>
<li>Where there are multiple ways of achieving something, having coherent, well-organized information about suggested approaches (along with mention of other options and their tradeoffs) is extremely valuable, especially for newcomers. For instance, I may have some issues with <a href="https://github.com/evancz/elm-architecture-tutorial">the Elm architecture</a>, but it is a coherent, simply explained pattern that everyone can point to and which works fine for many cases.</li>
</ul>
<h3 id="-the-limitations-of-refactoring-via-modifying-text-files-in-place-and-what-to-do-instead"><a id="refactoring-sessions"></a> The limitations of refactoring via modifying text files <em>in place</em>, and what to do instead</h3>
<p>Here’s a common situation: you realize you need to make some changes to a data type used all over the place in your codebase. How do you go about doing it?</p>
<p><em>In the trivial case:</em> It’s something as simple as an implementation change (but no change to any types), or a renaming or other transformation that can be handled via a find/replace or even the (rather limited) automated refactoring capabilities of an IDE.</p>
<p><em>In the somewhat less trivial case:</em> You make the change you want, then go fix all the compile errors. Hopefully there aren’t too many, and if you’re making good use of static types, you can have quite a bit of confidence that once you’re done fixing the errors, the new codebase will still work. For many codebase transformations, this works perfectly fine, even if it is a bit tedious and mechanical. More on that later.</p>
<p><em>In the nontrivial case:</em> For many interesting cases of codebase transformations, simply making the change and fixing the errors doesn’t scale. You have to deal with an overwhelming list of errors, many of which are misleading, and the codebase ends up living in a non-compiling state for long periods of time. You begin to feel adrift in a sea of errors. Sometimes you’ll make a change, and the error count goes down. Other times, you’ll make a change, and it goes up. <em>Hmm, I was relatively sure that was the right change, but maybe not… I’m going to just hope that was correct, and the compiler is getting a bit further now.</em></p>
<p>What’s happened? You’re in a state where you are not necessarily getting meaningful, accurate feedback from the compiler. That’s bad for two reasons. Without this feedback, you may be writing code that is making things worse, not better, building further on faulty assumptions. But more than the technical difficulties, working in this state is <em>demoralizing</em>, and it kills <a href="/2015-03-27/unison-update6.html#technical-debt">focus and productivity</a>.</p>
<p>All right, so what do we do instead? Should we just avoid even considering any codebase transformations that are intractable with the “edit and fix errors” approach? No, that’s too conservative. Instead, we just have to <em>avoid modifying our program in place</em>. This lets us make absolutely any codebase transformation while keeping the codebase compiling at all times. Here’s the procedure; it’s quite simple:</p>
<ul>
<li>Suppose the file you wish to modify is <code class="highlighter-rouge">Foo.hs</code>. Create <code class="highlighter-rouge">Foo__2.hs</code> and call the module inside it <code class="highlighter-rouge">Foo__2</code> as well. Copy over any bits of code you want from <code class="highlighter-rouge">Foo.hs</code>, then make the changes you want and get <code class="highlighter-rouge">Foo__2</code> compiling. At this point, your codebase still compiles, but nothing references the new <code class="highlighter-rouge">Foo__2</code> yet.</li>
<li>Pick one of the modules which depends on <code class="highlighter-rouge">Foo.hs</code>. Let’s say <code class="highlighter-rouge">Bar.hs</code>. Create <code class="highlighter-rouge">Bar__2.hs</code> and call the module inside it <code class="highlighter-rouge">Bar__2</code> as well. You can probably see where this is going. You are going to have <code class="highlighter-rouge">Bar__2</code> <em>depend on the newly created</em> <code class="highlighter-rouge">Foo__2</code>. You can start by copying over the existing <code class="highlighter-rouge">Bar.hs</code>, but perhaps you want to copy over bits and pieces at a time and get them each to compile against <code class="highlighter-rouge">Foo__2</code>. Or maybe you just copy all of <code class="highlighter-rouge">Bar.hs</code> over at once and crank through the errors. Whatever makes it easiest for you, just get <code class="highlighter-rouge">Bar__2</code> compiling against <code class="highlighter-rouge">Foo__2</code>.
<ul>
<li><em>Note:</em> For languages that allow circular module dependencies, the cycle acts effectively like a single module. The strategy of copying over bits at a time works well for this. And while you’re at it, how about breaking up those cycles!</li>
</ul>
</li>
<li>Now that you’re done with <code class="highlighter-rouge">Bar__2.hs</code>, pick another module which depends on either <code class="highlighter-rouge">Foo</code> or <code class="highlighter-rouge">Bar</code> and follow the same procedure. Continue doing this until you’ve updated all the <em>transitive dependents</em> of <code class="highlighter-rouge">Foo</code>. You might end up with a lot of <code class="highlighter-rouge">__2</code>-suffixed copies of files, some of which might be quite similar to their old state, and some of which might be quite different. Perhaps some modules have been made obsolete or unnecessary. In any case, if you’ve updated all the transitive dependents of your initial change, you’re ready for the final step.</li>
<li>For any file which has a corresponding <code class="highlighter-rouge">__2</code> file, delete the original, rename <code class="highlighter-rouge">Foo__2.hs</code> to <code class="highlighter-rouge">Foo.hs</code>, and so on. Also do a recursive find/replace in the text of all files, replacing <code class="highlighter-rouge">__2</code> with nothing. (Obviously, you don’t need to use <code class="highlighter-rouge">__2</code>; any prefix or suffix that is unique and unused will do fine.)</li>
<li>Voilà! Your codebase now compiles with all the changes.</li>
</ul>
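<p>That closing find/replace step is purely mechanical. Here it is sketched as a pure Haskell function (a recursive <code class="highlighter-rouge">sed -i 's/__2//g'</code> over the source tree accomplishes the same thing); the marker is a parameter since any unique, unused suffix works:</p>

```haskell
import Data.List (isPrefixOf)

-- Erase every occurrence of the marker (e.g. "__2") from a module's
-- text. This is the whole of the final cleanup step, once the files
-- themselves have been renamed.
stripMarker :: String -> String -> String
stripMarker _      []       = []
stripMarker marker s@(c:cs)
  | marker `isPrefixOf` s = stripMarker marker (drop (length marker) s)
  | otherwise             = c : stripMarker marker cs
```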
<p><em>Note</em>: I’m not claiming this is a new idea. Programmers do something like this all the time for large changes.</p>
<p>Notice that at each step, you are only dealing with errors from at most a single module and you are never confronted with a massive list of errors, many of which might be misleading or covering up <em>more</em> errors. Progress on the refactoring is measured not by the number of errors (which might not be accurate anyway), but by the number of modules updated vs the total number of modules in the set of transitive dependents of the immediate change(s). For those who like burndown charts and that sort of thing, you may want to compute this set up front and track progress as a percentage accordingly.</p>
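<p>Computing that set up front is a simple graph traversal. A sketch, assuming you can extract a reverse dependency map (module to its immediate dependents) from your build tooling:</p>

```haskell
import qualified Data.Map as Map
import qualified Data.Set as Set

-- The set of transitive dependents of the changed module. Progress on
-- the refactoring is then "modules updated / size of this set".
-- Visited tracking makes this safe even with dependency cycles.
transitiveDependents :: Ord a => Map.Map a [a] -> a -> Set.Set a
transitiveDependents graph root = Set.delete root (go Set.empty [root])
  where
    go seen []     = seen
    go seen (m:ms)
      | m `Set.member` seen = go seen ms
      | otherwise =
          go (Set.insert m seen) (Map.findWithDefault [] m graph ++ ms)
```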
<p>If we take this good idea to its logical conclusion, we end up with a model in which the codebase is represented as a purely functional data type. (In fact, the refactoring algorithm I gave above might remind you of how a functional data structure like a tree gets “modified”—we produce a new tree and the old tree sticks around, immutable, as long as we keep a reference to it.) So in this model, we never modify a definition in place, causing other code to break. When we modify some code, we are creating a new version, referenced by no one. It is up to us to then propagate that change to the transitive dependents of the old code.</p>
<p>This is the model adopted by Unison. All terms, types, and type declarations are uniquely identified by a nameless, content-based hash. In the editor, when you reference the symbol <code class="highlighter-rouge">identity</code>, you immediately resolve that to some hash, and it is the <em>hash</em>, not the name, which is stored in the syntax tree. The hash will always and forever reference the same term. We can create new terms, perhaps even based on the old term, but these will have different content and hence different hashes. We can change the name associated with a hash, but that just affects how the term is displayed, not how it behaves! And if we call something else <code class="highlighter-rouge">identity</code> (there’s no restriction of name uniqueness), all references continue to point to the previous definition. Refactoring is hence a purely functional transformation from one codebase to another.</p>
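<p>Here is a toy sketch of the model in Haskell. The <code class="highlighter-rouge">termHash</code> stand-in is nothing like Unison’s real nameless, structural hashing; the point is only the shape of the data: references inside terms are hashes, and names are metadata off to the side, so renaming can never change behaviour or break a reference:</p>

```haskell
import qualified Data.Map as Map

type Hash = Int

-- References in the syntax tree are hashes, not names.
data Term = Lit Int | Ref Hash | App Term Term
  deriving (Show, Eq)

-- Toy stand-in hash: a rolling hash of the printed form.
termHash :: Term -> Hash
termHash = foldl (\h c -> h * 33 + fromEnum c) 5381 . show

data Codebase = Codebase
  { terms :: Map.Map Hash Term    -- immutable: a hash always denotes the same term
  , names :: Map.Map Hash String  -- display names only; not part of identity
  }

addTerm :: String -> Term -> Codebase -> (Hash, Codebase)
addTerm name t (Codebase ts ns) =
  let h = termHash t
  in (h, Codebase (Map.insert h t ts) (Map.insert h name ns))

-- Renaming touches only the name metadata, never the term store.
rename :: Hash -> String -> Codebase -> Codebase
rename h newName cb = cb { names = Map.insert h newName (names cb) }
```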
<p><em>Aside:</em> One lovely consequence of this model is that incremental typechecking is trivial. Since definitions never change, we can cache the type associated with a hash, and just look it up when doing typechecking. Simple!</p>
<p>It’s a very pretty model, but it raises questions. We <em>do</em> sometimes want to make changes to some aspect of the codebase and its transitive dependents. Alright, so we aren’t going to literally modify the code <em>in place</em>, but we do still need a convenient way of creating a whole bunch of related new definitions, based in part on the old definitions. How do we do that?</p>
<p>What an interesting problem! I’ve talked to several people about it, and there’s also some interesting research on this sort of thing. Though I’m not sure what the exact form will take, it all seems very solvable. Just to sketch out some ideas:</p>
<ul>
<li>Trivial refactorings remain trivial. Renaming is as simple as updating the metadata associated with a hash, in a single location in the code datastore! Modifying a definition in a way that preserves its type (or gives the result a more general type than the old one, for instance going from <code class="highlighter-rouge">[Int] -&gt; [Int]</code> to <code class="highlighter-rouge">[a] -&gt; [a]</code>) is also trivial. Just find all the places that reference the old hash, and point them to the new hash, transitively.</li>
<li>More interesting refactorings can be organized into highly structured <em>refactoring sessions</em>. As a simple example, suppose you have a function <code class="highlighter-rouge">foo x y = blah (g 42) (x+x)</code>. You decide that rather than hardcoding <code class="highlighter-rouge">42</code> you’d like to abstract over that value, so your definition becomes <code class="highlighter-rouge">foo x y z = blah (g z) ..</code> This change now generates a set of <em>obligations</em>. The editor now walks you through the transitive dependents of <code class="highlighter-rouge">foo</code> (out to whatever scope you want), and you have the option of either <em>binding</em> the additional parameter or <em>propagating</em> the parameter out further. Perhaps you do just want to bind 42 everywhere in the existing code, and the extra abstraction is just for some new code you’re about to write. There can be an easy way to do this kind of bulk acceptance. Or perhaps you have some way of conjuring up a number from other types that are usually in scope wherever <code class="highlighter-rouge">foo</code> is used. You write a function to do this once, include it as part of the session, and then reuse it (with approval) in lots of places. Of course, the UX for this is all TBD, but the point is that it’s a very structured, guided activity, and there are lots of opportunities to reuse work. How many times have you worked through a rather mechanical refactoring, doing essentially the same thing over and over, with no real opportunities for reuse due to limitations of applying the refactoring via a process of text munging?</li>
<li>The sessions will themselves be represented as Unison terms, and will have a unique hash. You can start 5 sessions, bookmark them, work on them concurrently, abandon any that grow too big in scope, and so on. Metrics like “remaining transitive updates” and so forth can be computed automatically.</li>
<li>In this world, refactoring becomes something fun, reliable, and automated wherever possible.</li>
</ul>
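<p>To make the bind-vs-propagate choice concrete, here is the <code class="highlighter-rouge">foo</code> example from the list above written out in plain Haskell, with toy stand-ins for the hypothetical <code class="highlighter-rouge">blah</code> and <code class="highlighter-rouge">g</code>:</p>

```haskell
-- Toy stand-ins; these were hypothetical in the example above.
blah :: Int -> Int -> Int
blah a b = a + b

g :: Int -> Int
g = (* 2)

-- Before: 42 is hardcoded.
foo :: Int -> Int -> Int
foo x y = blah (g 42) (x + x)

-- After: the 42 is abstracted into a parameter z, creating an
-- obligation at every call site.
foo' :: Int -> Int -> Int -> Int
foo' x y z = blah (g z) (x + x)

-- One dependent discharges its obligation by *binding* 42,
-- preserving the old behaviour...
caller :: Int -> Int
caller n = foo' n n 42

-- ...another *propagates* the parameter out to its own callers.
caller' :: Int -> Int -> Int
caller' n z = foo' n n z
```

A refactoring session is essentially a guided tour through these obligations, where a choice like <code class="highlighter-rouge">caller</code>’s can be bulk-accepted across many sites at once.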
<p>Such exciting possibilities! I look forward to exploring them further, hopefully with help from some of you! When I get through this latest refactoring, I’ll feel like the code is in pretty decent shape and will be releasing it publicly. I look forward to developing Unison more out in the open, in collaboration with other folks who are as inspired by this project as I am.</p>
<p><em>Thu, 23 Apr 2015</em> · <a href="http://pchiusano.github.io/2015-04-23/unison-update7.html">http://pchiusano.github.io/2015-04-23/unison-update7.html</a></p>
<h3 id="unison-update-6">Unison update 6: refactoring, technical debt, and motivation</h3>
<p>No new screencasts to show this week. I’m in the middle of doing some much-needed refactoring. What happened? As of <a href="/2015-03-17/unison-update5.html">the last update</a>, I had a decent <em>expression</em> editor. The missing final piece was adding a declaration layer to the editor, allowing a Unison panel to be edited much like a module in a regular programming language.</p>
<p>Unfortunately, when I started working on this, I found I just couldn’t take another step. All the crappy Elm code I’d written out of a single-minded desire to maintain <em>velocity, velocity, velocity,</em> had finally caught up to me. I’d been ignoring what turned out to be some bad architecture choices I’d made early on, and the code was getting uglier and uglier with each feature I added (never a good sign). The thought of adding yet more ugly code to the pile seemed pointless and demotivating. It was time to pay down the technical debt.</p>
<p>So I’m currently in the process of rewriting parts of the editor. I’ve discovered a nicer way of organizing my Elm code that avoids some of the problems with <a href="https://github.com/evancz/elm-architecture-tutorial">the Elm architecture</a>. I’ll write that up in detail in a separate post.</p>
<p>There’s also one other thing I have planned which is to change the language representation to use <a href="http://semantic-domain.blogspot.co.uk/2015/03/abstract-binding-trees.html">abstract binding trees</a> (ABTs). At the moment, the Unison language has only one binding form, lambda, and I am using manual (untyped) De Bruijn indices. Besides being error prone (I have to remember to weaken variables in all the right places), this doesn’t scale very well to adding other forms of binding like pattern matching and let bindings. ABTs scale well to arbitrary binding forms, and for my use case they are a better fit than a <a href="https://hackage.haskell.org/package/bound">bound</a>-like approach. I’ll also try to do a writeup of this at some point.</p>
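<p>For a flavor of the idea, here is a minimal ABT sketch. It follows the general shape described in the linked post, not Unison’s actual implementation:</p>

```haskell
import qualified Data.Set as Set

-- Variables and binding are handled once, generically, so adding new
-- binding forms (let, pattern matching, ...) means adding signature
-- constructors that reuse `Abs`, rather than more De Bruijn bookkeeping.
data ABT f a
  = Var a            -- a variable
  | Abs a (ABT f a)  -- the single, generic binding construct
  | Tm (f (ABT f a)) -- one layer of language-specific syntax

-- A Unison-ish signature: lambda (which wraps an `Abs`) and application.
data F r = Lam r | App r r

instance Foldable F where
  foldMap k (Lam r)   = k r
  foldMap k (App a b) = k a <> k b

-- Free variables fall out generically: no index shifting, no manual
-- weakening to forget.
freeVars :: (Foldable f, Ord a) => ABT f a -> Set.Set a
freeVars (Var v)      = Set.singleton v
freeVars (Abs v body) = Set.delete v (freeVars body)
freeVars (Tm t)       = foldMap freeVars t
```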
<p>Until then, there probably won’t be much (visible) progress, but I hope to have some more interesting updates in a couple weeks.</p>
<h3 id="-when-does-it-make-sense-to-pay-down-technical-debt"><a id="technical-debt"></a> When does it make sense to pay down technical debt?</h3>
<p>This all got me thinking a bit more about the concept of technical debt. Given the choice of paying back technical debt or adding more features and functionality, which makes the most sense? That depends. In paying back debt, you (definitely) lose some velocity in the short term in exchange for an (uncertain, but presumed) longer-term increase in velocity when the code is in better shape. In effect, you are making an investment in future productivity. Whether this investment is rational very much depends on the <a href="http://en.wikipedia.org/wiki/Time_value_of_money">time value of the new functionality</a>, one’s risk tolerance, and on the expected likelihood that the refactoring will indeed increase future productivity. Best to illustrate with some examples:</p>
<ul>
<li>A startup about to run out of money looking to raise another round of funding very soon may rationally decide to accumulate more technical debt in order to get some features out the door in the window before potential funders decide whether to invest. For them at this time, there is a huge cost to any short-term delays in velocity. They can make investments in future productivity <em>after</em> getting funding, and their doors are still open.</li>
<li>An established company may also decide not to invest in paying down technical debt, because of uncertainty around whether or not these investments will in fact lead to future productivity. A programmer tells you (or you tell yourself) that refactoring or rewriting some code will increase future productivity, but there is always uncertainty around this, typically more so than just building on what’s already there. A decisionmaker with less risk tolerance may reasonably choose the surer bet. They are not exactly being irrational in doing so. As another example, consider the choice between receiving $500 right now or a 60% chance of $2000 a year from now. How about a million dollars now vs a 60% chance of 3 million dollars a year from now? Of course, these choices have different <em>expected</em> values, but also different levels of risk. As in <a href="http://en.wikipedia.org/wiki/Modern_portfolio_theory">modern portfolio theory</a>, there is no concept of <em>the optimal portfolio</em>, only the optimal portfolio <em>for a given level of risk</em>.
<ul>
<li>What tends to happen to very conservative decisionmakers is that they wait until the code has gotten so bad that the uncertainty around whether paying down the debt is justified drops below their risk tolerance. But like other forms of debt, technical debt accumulates interest—it is usually less work to pay it down early, and more work the longer you wait. Thus, perpetually delaying leads to accumulating debt and productivity on the codebase crawls to a standstill. Features that ‘seem simple’ now take weeks, months, or years. Eventually it becomes rational to simply declare bankruptcy (rewrite the code). Projects rarely have the courage or capacity to do this, but eventually a competitor will.</li>
</ul>
</li>
<li>A well-run startup or established business is planning to stick around. Investments in future productivity are thus critical to growth and survival, and any short term accumulations of technical debt will generally be paid quickly, lest they accumulate interest and hamper future productivity.</li>
</ul>
<p>Of course, the people making decisions about these things aren’t usually so rational about it. But it’s always good to have a decisionmaking framework that lets you explore aspects of a decision in a more methodical way.</p>
<p>… except it’s not that simple. There’s one factor I’ve completely ignored, and that is programmer motivation. Working on code you know is shitty is demoralizing. I would say that different programmers can tolerate different amounts of technical debt before it substantially affects motivation, but it is definitely true that working on high-quality code is exciting and empowering.</p>
<p>Pretending that engineers are emotionless machine parts that operate with the same efficiency no matter where they are plugged in doesn’t change the reality that programmers are human beings. Even if we were purely interested in the productivity of the overall business, decisions <em>must</em> factor in the happiness and motivation of the people doing the work. When you factor this in, the full costs of accumulating technical debt become more apparent:</p>
<ul>
<li>Programmers lose motivation and productivity drops, well beyond the level merely caused by the technical debt itself.</li>
<li>Hiring and turnover becomes a problem. Programmers start leaving, and perhaps the company even develops a reputation externally for having a bad codebase. The company may need to pay higher salaries to attract and retain candidates of the same quality.</li>
<li>If you’re an open source project, it becomes impossible to recruit volunteers. Who wants to work on a crappy codebase in their free time? I think this is part of the reason why open source projects tend to be higher quality in many ways—unlike a business, an open source project can’t get away with just paying people money to put up with bad code, they actually have to make the codebase something people will enjoy working on.</li>
</ul>
<p>Also see <a href="/2015-03-26/type-errors.html">I hate type errors!</a>, which talks more about programmer motivation.</p>
<p><em>Fri, 27 Mar 2015</em> · <a href="http://pchiusano.github.io/2015-03-27/unison-update6.html">http://pchiusano.github.io/2015-03-27/unison-update6.html</a></p>
<h3 id="i-hate-type-errors">I hate type errors!</h3>
<p>I hate type errors!</p>
<p>Even though the Unison editor isn’t yet suitable for real programming, my experience using it has made me even more aware of just how much time programmers waste dealing with type errors:</p>
<ul>
<li>Let’s say 95% of errors are trivial to fix. You probably don’t even bother reading the message. A quick look at the line number and ten seconds of thinking and you know the fix. Flow is maintained. Your focus stays on what matters—making the actual edits to the code. You’ve made an edit, and the compiler has helpfully pointed out a missed or incorrect step, which you immediately rectify. This is why people like type systems.</li>
<li>In 5% of the cases, the error message(s) are misleading and/or point to the wrong location(s). You take a look, scratch your head. 30 seconds go by; it becomes clear you aren’t going to intuit this one. Now you actually have to gather more information. Okay, maybe you’ll start by reading that error message more carefully. Is the information useful? Maybe. Maybe you try a change, use typed holes to gather info about the types of values in scope, etc. But recognize that you are now in full-on yak-shaving mode. You started off by conceiving of some edits to your code. You then made these edits, but now look—you aren’t productively fixing your mistakes or providing missing steps, you are engaged in <em>a very inefficient search process to narrow down and understand what the problems even are.</em></li>
</ul>
<p>Note that if we deferred our type errors until runtime, as in a dynamic language, we wouldn’t have to spend <em>any</em> time deciphering type errors, and the runtime errors would now be in terms of actual program values that we can inspect, rather than a possibly inaccurate <em>symbolic description</em> of the problem. Of course I feel this is throwing the baby out with the bath water—static types are overall a huge win, especially in languages like Haskell with nice type systems. But I point this out because I think it’s important to develop some perspective.</p>
<p>When you start out doing typeful programming, perhaps that 95% is more like 60%. A very large number of error messages you get from the compiler are frustratingly opaque. What the hell is the compiler even talking about? Some people in this situation get frustrated enough that they leave the typeful programming world in favor of some dynamic language. But if you stick with it, over time, with experience, you get better at intuiting what the issues are or avoiding them in the first place, and the 60% creeps up to 90%, 95%, 96%, 97%… But here is the problem: in getting to this point, you have brought the cost of deciphering errors down to acceptable levels but also lost perspective on the cumulative costs. The past costs are forgotten and the present costs are ingrained in your workflow and thinking, a productivity tax you don’t even really notice or think about.</p>
<p><em>Aside:</em> I consider this a problem with our industry and education system. We are raising generations of programmers implicitly taught to accept everything as a <em>given</em>, no matter how arcane or costly. But in software, nothing is a given, every single aspect of the software world is the result of <a href="http://www.gurteen.com/gurteen/gurteen.nsf/id/no-smarter-than-you"><em>choices</em> made by people “no smarter than you”</a>.</p>
<p>But even for people with experience, the costs of deciphering type errors is much higher than that 5% would indicate. The issue is not the <em>direct</em> costs due to time wasted deciphering type errors, it’s the <em>indirect</em> costs. Programming is an activity that demands focus. When focus is maintained, it is possible to accomplish tasks in hours that might otherwise take weeks. Maintaining a state of flow for longer periods of time is an almost trancelike experience, and incredibly empowering. One feels attuned to the tasks at hand, subtasks get pushed and popped from the stack, and there is a feeling of directness, control, and creative energy.</p>
<p>But maintaining this level of focus is rare. We don’t like to talk about it, but if we are being appropriately humble about our limitations, we must recognize that human focus is a very scarce and very fragile resource. Our puny little brains can only allocate so many resources, and we programmers are constantly losing focus. I don’t mean obvious distractions like wasting time checking Twitter or Reddit when you should be doing something. These things are often the symptom, rather than the disease. What I mean is the act of doing things that <em>aren’t the actions most likely to be most productive at furthering progress</em>. Everyone knows what that feels like. Perhaps you are debugging, and rather than methodically narrowing down the problem via a set of well-chosen experiments each giving maximal information, you instead meander around, perhaps adding random print statements or inspecting random values in the debugger. When you should step back and go for a walk perhaps, you instead stare at the screen. You’ve lost focus, but before you realize it and do something productive about it, you’ve lost an hour.</p>
<p>Deciphering type errors leads to exactly this same sort of inefficient, meandering, unfocused mode of work, perhaps on a smaller scale. But the costs are still there, and they are substantial, if you pay attention.</p>
<p>Also see: <a href="/2014-09-30/punchcard-era.html">Why are we still programming like it’s the punchcard era?</a></p>
<p><em>Thu, 26 Mar 2015</em> · <a href="http://pchiusano.github.io/2015-03-26/type-errors.html">http://pchiusano.github.io/2015-03-26/type-errors.html</a></p>