Paul Chiusano's blog (http://pchiusano.github.io)
Unison now open source and has its own site

<p>I finally wrapped up most of the housecleaning I wanted to do before releasing the code. It’s <a href="https://github.com/unisonweb/platform">now public on GitHub</a> and there’s also a dedicated <a href="http://unisonweb.org/">project site and blog at unisonweb.org</a> and <a href="https://twitter.com/unisonweb">Twitter account</a>. Going forward, I’ll be posting about Unison from <a href="http://unisonweb.org/">unisonweb.org</a>, and any contributors to the project will also be able to use that space for Unison-related posts.</p>
<p>There’s currently a post up with a <a href="http://unisonweb.org/2015-05-07/update.html#post-start">status update and tentative roadmap</a>.</p>
<p>I’m definitely interested in <a href="http://unisonweb.org/2015-05-07/update.html#funding">finding funding to continue work on Unison</a>, but am not really sure what is realistic or possible. I’d love to hear in <a href="http://unisonweb.org/2015-05-07/update.html#disqus_thread">the comments</a> from folks who might have ideas or thoughts about this sort of thing.</p>
Fri, 08 May 2015 00:00:00 +0000
http://pchiusano.github.io/2015-05-08/unison-update8.html
Unison update 7: structured refactoring sessions

<p>As I mentioned in <a href="/2015-03-27/unison-update6.html">update 6</a>, I’ve been spending the last few weeks doing some much-needed refactoring. Here’s an update on the progress:</p>
<ul>
<li>I’m about 3/4 of the way through converting the backend to use <a href="http://semantic-domain.blogspot.com/2015/03/abstract-binding-trees.html">abstract binding trees</a> (ABTs). This is working out super nicely. The code is cleaner and many of the “interesting” operations are ABT-generic, so I get to reuse lots of code for types, terms, and (later) type declarations. I expect this reuse to carry over when I add support to the editor for type and type declaration editing. ABTs also make it easy to add new binding forms like let bindings and pattern matching.</li>
<li>I rewrote a lot of the Unison editor code, which had gotten too ugly to extend further. The refactored version is pretty clean and easy to follow, however…</li>
</ul>
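<p><em>Aside:</em> here’s a minimal sketch of the ABT idea in Haskell, in case it’s unfamiliar. The names and the tiny signature functor are my own for illustration, not Unison’s actual types; the point is that variables and binding live in one generic layer, so operations like free-variable computation are written once and reused across terms, types, and declarations:</p>

```haskell
{-# LANGUAGE DeriveFunctor, DeriveFoldable #-}
import Data.Set (Set)
import qualified Data.Set as Set

type V = String

-- The generic ABT layer: variables and abstraction are shared;
-- everything language-specific lives in the functor f.
data ABT f
  = Var V
  | Abs V (ABT f)    -- binds a variable in the body
  | Tm (f (ABT f))   -- one layer of language-specific structure

-- A toy signature functor for a minimal lambda calculus.
data F a = Lam a | App a a
  deriving (Functor, Foldable)

type Term = ABT F

-- Written once against the ABT layer; works unchanged for any
-- other signature functor (types, type declarations, ...).
freeVars :: Foldable f => ABT f -> Set V
freeVars (Var v)   = Set.singleton v
freeVars (Abs v t) = Set.delete v (freeVars t)
freeVars (Tm f)    = foldMap freeVars f

-- \x -> y x
example :: Term
example = Tm (Lam (Abs "x" (Tm (App (Var "y") (Var "x")))))
```

<p>Adding a new binding form (a let, a pattern-match case) is just another constructor in <code>f</code> wrapping an <code>Abs</code>, which is exactly the scaling property mentioned above.</p>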
<h3 id="a-idelm-troubles-elm-troubles"><a id="elm-troubles"></a> Elm troubles</h3>
<p>It’s become increasingly clear that Elm isn’t working out very well as a tech choice for the Unison editor. I’ve started to consider use of Elm a placeholder until I decide on a replacement (or until Elm improves to where it is usable for my purposes). Maybe I’ll do a longer experience report later, but to summarize:</p>
<ul>
<li>Elm language limitations have become a real problem for me. The type system is F#-like: too limited to define <code>Applicative</code>, <code>Monad</code>, or any abstraction or data type that has a type parameter whose kind is not <code>*</code>. Also no higher-rank types or existential types. You can sometimes survive without these things, but eventually you reach a point where you need them. I’ve hit that point.</li>
<li>Elm’s version of FRP is not very expressive. Signals cannot be recursive or mutually recursive; the only way to introduce past-dependence is via a left fold (the <code>foldp</code> function) or by sending values to a sink and have them <a href="/2014-12-10/wormhole-antipattern.html">magically appear elsewhere in your program</a>. The oddly popular <a href="https://github.com/evancz/elm-architecture-tutorial">Elm architecture</a> is just a pattern for building up your entire app’s interactivity as the arguments to a left fold over the merged input events of your app! Because the signal world is so limited, most of your logic necessarily lives elsewhere. Not necessarily a bad thing, but the result has been that I haven’t gotten much mileage out of Elm’s version of FRP. Instead I’m doing the vast majority of the work with pure code that could easily be written in just about any other functional language.</li>
<li>There are some interesting alternatives to Elm that have arisen recently. I haven’t investigated them in detail, but there’s <a href="https://github.com/slamdata/purescript-halogen">Halogen</a> in PureScript, and for GHCJS there’s <a href="https://www.youtube.com/watch?v=mYvkcskJbc4">Reflex</a> and <a href="http://joelburget.com/react-haskell/">React-Haskell</a>. There are probably others. A GHCJS option interests me since it means I could share code between client and server. I look forward to exploring this further…
<ul>
<li>The only thing that seems missing from the alternatives is something like Elm’s <code>Element</code> static layout library, which I need for some of the things I’m doing. But it seems like a reasonable path forward might be to just write such a library or port Elm’s to whatever tech I end up using.</li>
</ul>
</li>
<li>At the moment, it doesn’t seem very likely that any of the issues I mention above will be addressed by Elm anytime soon. (Though I hope to be proven wrong!) It isn’t just a question of resources—Evan, the Elm creator, <a href="https://groups.google.com/d/msg/elm-discuss/oyrODCgYmQI/T2I_8L-AL6EJ">has stated</a> he’s very wary of adding “advanced” features that might scare off newcomers or change the culture around Elm. I can sort of understand where this is coming from—it is certainly true that at least in some people’s minds, Haskell has developed a reputation for being esoteric, hard to learn, and perhaps even unwelcoming or impractical. If this perception (right or wrong) stops some people from even walking in the door, then it can be good (for driving adoption) to try to address it. Unfortunately, Elm’s limitations mean that in its current form, it’s not really the best match for my needs. I have stuff I need to accomplish, and Elm isn’t quite cutting it!</li>
</ul>
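<p>To make the <code>foldp</code> point above concrete: the “Elm architecture” really is just a left fold of an update function over the app’s merged input events. Here’s a toy model in ordinary Haskell (the names <code>Event</code>, <code>Model</code>, and <code>step</code> are illustrative, not Elm’s actual API):</p>

```haskell
-- All of the app's past-dependence lives in one update function,
-- folded over the stream of input events (modeled here as a list).
data Event = Increment | Decrement | Reset

type Model = Int

step :: Event -> Model -> Model
step Increment n = n + 1
step Decrement n = n - 1
step Reset     _ = 0

-- In Elm this is "foldp step initialModel events", with events a
-- Signal arriving over time; with a list the structure is the same.
run :: Model -> [Event] -> Model
run = foldl (flip step)
```

<p>Because the signal layer reduces to this one fold, the rest of the logic necessarily lives in plain pure functions, which is the “not much mileage out of Elm’s FRP” experience described above.</p>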
<h3 id="a-idusability-how-to-define-usability"><a id="usability"></a> How to define “usability”?</h3>
<p>This got me thinking about the concept of usability of various tech tools like programming languages.</p>
<p>Here is a question: is a three-note keyboard <a href="http://lambda-the-ultimate.org/node/2654#comment-39872">more usable than a piano or a cello</a>? On the one hand, there’s less to learn; on the other hand, if a piece of music calls for a piano or cello, a three-note keyboard is not going to be usable at all!</p>
<p>At the same time, let’s consider the cello. Perhaps we could lessen the learning curve of cello by adding frets… but this comes at a cost of limiting vibrato, which is part of what makes the cello (or any bowed instrument) sound so beautiful! All right then, how about at least adding visual <em>markers</em> where the frets would be? That can only help, right? Not necessarily. Markers might lead learners to rely on visual cues, rather than (more rapid, accurate, scalable) use of the ear and muscle memory… but on the other hand, if a cello learner is temporarily aided by use of visual markers, and this helps them to persist in learning until they no longer need them, who can say that’s a bad thing?</p>
<p>As a further subtlety, there’s something of a <em>virtuoso culture</em> around instruments like piano and cello that have been around for a long time. The virtuoso culture prizes musicians not just (or even primarily) for their sensitive or thoughtful expression of music, it also emphasizes pure technical mastery of the instrument. And this same culture values music not just (or even mostly) for its beauty, but also for how much the music facilitates flaunting of technical mastery. If we’re being honest, we must admit that these cultural elements have some impact on who chooses to learn music, and who chooses to stick with this learning.</p>
<p>The point is, these issues are complicated, and there aren’t really easy answers. And that’s part of why debates about these things in the tech world never seem to go anywhere. But I’d like to offer a helpful way of thinking about usability that’s <a href="/2015-03-27/unison-update6.html#technical-debt">analogous to some of the ideas I posted about technical debt</a>:</p>
<blockquote>
<p>… consider the choice between receiving $500 right now or a 60% chance of $2000 a year from now. How about a million dollars now vs a 60% chance of 3 million dollars a year from now? Of course, these choices have different expected values, but also different levels of risk. As in modern portfolio theory, there is no concept of the optimal portfolio, only the optimal portfolio for a given level of risk.</p>
</blockquote>
<p>When it comes to usability, there is no such thing as a tool which is optimally usable; we can only talk about optimal usability with respect to a level of expressiveness. That is, we can only make a given tool more usable by decreasing the amount of work it takes to accomplish the same thing, not by restricting capabilities. If we change the capabilities of the tool and make it more or less expressive, usability comparisons become meaningless. We are comparing apples to oranges. Neither artifact dominates the other, and it comes down to other preferences.</p>
<p>The reason the more expressive tool doesn’t strictly dominate the less expressive one is subtle. Yes, a tool which can do less (is inexpressive) requires less learning, and a tool with more capabilities (more expressive) requires more learning. We sometimes think of this extra learning and work as only being necessary if you happen to be doing something that requires the extra capabilities. But that’s not true. <em>This holds even for simple tasks that can in principle be addressed by either tool.</em> With the more expressive artifact, the user has to do work to figure out what subset of its capabilities should even be used, and how they should be used in concert to achieve the desired effect. Sometimes this amount of work is nontrivial, and it requires experience to do well. Choosing among several possible ways of doing something (some of which may not work out well at all) requires understanding the tradeoffs of these approaches. And this decision-making isn’t a one-time event, it’s a continuous process, occurring at multiple levels of granularity. We might say the user has a greater <em>burden of choice</em>.</p>
<p>With the less expressive artifact, there are fewer options and the decision of how to do something is often made for you, by someone who has some expertise and has tailored the defaults and limitations of their tool accordingly.</p>
<p>My point is that neither option dominates the other; it depends on many factors, including one’s experience and the time horizons of investment in using the artifact. Here are just a few examples:</p>
<ul>
<li>Can Haskell be used for client side development? Yes, but Haskell’s status as a more general purpose, more powerful language than Elm means there are more ways of doing this, still being actively explored and developed. A front-end programmer who has only worked with Javascript before might not be in a position to even evaluate these options! The Haskell community is also spread thinner, with lots of people working in different areas. Elm is more narrow in focus, in some ways more limited, but the language works for what it is and also perhaps provides a clearer path for beginners. I can’t argue that’s a bad thing.</li>
<li>On the other hand, a programmer expecting to be solving a certain class of problems for a long time may indeed benefit from using more powerful technology for that class of problems. Learning the extra expressiveness of more powerful technology thus becomes an investment in future productivity. With experience and practice, the cost of deciding on what subset of the tech to use drops to near zero. It becomes effortless to choose an approach, and the tool never gets in the way, regardless of the approach chosen.</li>
<li>Like any investment, there’s some uncertainty about the returns, and due to different levels of risk tolerance and so on people are going to make different choices.</li>
</ul>
<p>Now then. What are the implications of all this? Well, it means that there is tremendous value in finding ways to <em>decrease the burden of choice</em> when using more expressive technology. Here are some ways of doing that:</p>
<ul>
<li>Simply organizing into smaller sub-communities of more narrow focus can have a huge benefit. With more narrow focus, there’s less uncertainty about where to turn for help and hence more positive network effects.</li>
<li>Where there are multiple ways of achieving something, having coherent, well-organized information about suggested approaches (along with mention of other options and their tradeoffs) is extremely valuable, especially for newcomers. For instance, I may have some issues with <a href="https://github.com/evancz/elm-architecture-tutorial">the Elm architecture</a>, but it is a coherent, simply explained pattern that everyone can point to and which works fine for many cases.</li>
</ul>
<h3 id="a-idrefactoring-sessions-the-limitations-of-refactoring-via-modifying-text-files-in-place-and-what-to-do-instead"><a id="refactoring-sessions"></a> The limitations of refactoring via modifying text files <em>in place</em>, and what to do instead</h3>
<p>Here’s a common situation: you realize you need to make some changes to a data type used all over the place in your codebase. How do you go about doing it?</p>
<p><em>In the trivial case:</em> It’s something as simple as an implementation change (but no change to any types), or a renaming or other transformation that can be handled via a find/replace or even the (rather limited) automated refactoring capabilities of an IDE.</p>
<p><em>In the somewhat less trivial case:</em> You make the change you want, then go fix all the compile errors. Hopefully there aren’t too many, and if you’re making good use of static types, you can have quite a bit of confidence that once you’re done fixing the errors, the new codebase will still work. For many codebase transformations, this works perfectly fine, even if it is a bit tedious and mechanical. More on that later.</p>
<p><em>In the nontrivial case:</em> For many interesting cases of codebase transformations, simply making the change and fixing the errors doesn’t scale. You have to deal with an overwhelming list of errors, many of which are misleading, and the codebase ends up living in a non-compiling state for long periods of time. You begin to feel adrift in a sea of errors. Sometimes you’ll make a change, and the error count goes down. Other times, you’ll make a change, and it goes up. <em>Hmm, I was relatively sure that was the right change, but maybe not… I’m going to just hope that was correct, and the compiler is getting a bit further now.</em></p>
<p>What’s happened? You’re in a state where you are not necessarily getting meaningful, accurate feedback from the compiler. That’s bad for two reasons. Without this feedback, you may be writing code that is making things worse, not better, building further on faulty assumptions. But more than the technical difficulties, working in this state is <em>demoralizing</em>, and it kills <a href="/2015-03-27/unison-update6.html#technical-debt">focus and productivity</a>.</p>
<p>All right, so what do we do instead? Should we just avoid even considering any codebase transformations that are intractable with the “edit and fix errors” approach? No, that’s too conservative. Instead, we just have to <em>avoid modifying our program in place</em>. This lets us make absolutely any codebase transformation while keeping the codebase compiling at all times. Here’s a procedure; it’s quite simple:</p>
<ul>
<li>Suppose the file you wish to modify is <code>Foo.hs</code>. Create <code>Foo__2.hs</code> and call the module inside it <code>Foo__2</code> as well. Copy over any bits of code you want from <code>Foo.hs</code>, then make the changes you want and get <code>Foo__2</code> compiling. At this point, your codebase still compiles, but nothing is referencing the new definition of <code>Foo</code>.</li>
<li>Pick one of the modules which depends on <code>Foo.hs</code>. Let’s say <code>Bar.hs</code>. Create <code>Bar__2.hs</code> and call the module inside it <code>Bar__2</code> as well. You can probably see where this is going. You are going to have <code>Bar__2</code> <em>depend on the newly created</em> <code>Foo__2</code>. You can start by copying over the existing <code>Bar.hs</code>, but perhaps you want to copy over bits and pieces at a time and get them each to compile against <code>Foo__2</code>. Or maybe you just copy all of <code>Bar.hs</code> over at once and crank through the errors. Whatever makes it easiest for you, just get <code>Bar__2</code> compiling against <code>Foo__2</code>.
<ul>
<li><em>Note:</em> For languages that allow circular module dependencies, the cycle acts effectively like a single module. The strategy of copying over bits at a time works well for this. And while you’re at it, how about breaking up those cycles!</li>
</ul>
</li>
<li>Now that you’re done with <code>Bar__2.hs</code>, pick another module which depends on either <code>Foo</code> or <code>Bar</code> and follow the same procedure. Continue doing this until you’ve updated all the <em>transitive dependents</em> of <code>Foo</code>. You might end up with a lot of <code>__2</code>-suffixed copies of files, some of which might be quite similar to their old state, and some of which might be quite different. Perhaps some modules have been made obsolete or unnecessary. In any case, if you’ve updated all the transitive dependents of your initial change, you’re ready for the final step.</li>
<li>For any file which has a corresponding <code>__2</code> file, delete the original, and rename the <code>Foo__2.hs</code> to <code>Foo.hs</code>, and so on. Also do a recursive find/replace in the text of all files, replacing <code>__2</code> with nothing. (Obviously, you don’t need to use <code>__2</code>, any prefix or suffix that is unique and unused will do fine.)</li>
<li>Voilà! Your codebase now compiles with all the changes.</li>
</ul>
<p><em>Note</em>: I’m not claiming this is a new idea. Programmers do something like this all the time for large changes.</p>
<p>Notice that at each step, you are only dealing with errors from at most a single module and you are never confronted with a massive list of errors, many of which might be misleading or covering up <em>more</em> errors. Progress on the refactoring is measured not by the number of errors (which might not be accurate anyway), but by the number of modules updated vs the total number of modules in the set of transitive dependents of the immediate change(s). For those who like burndown charts and that sort of thing, you may want to compute this set up front and track progress as a percentage accordingly.</p>
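<p>Computing that set of transitive dependents is a small graph traversal over a reverse dependency index. A quick Haskell sketch (the types here are hypothetical; any build tool’s import graph could supply the map):</p>

```haskell
import Data.Map (Map)
import qualified Data.Map as Map
import Data.Set (Set)
import qualified Data.Set as Set

type Module = String

-- deps maps each module to the set of modules that directly import it.
-- The traversal accumulates everything reachable through that relation.
transitiveDependents :: Map Module (Set Module) -> Module -> Set Module
transitiveDependents deps = go Set.empty
  where
    go seen m =
      let direct = Map.findWithDefault Set.empty m deps
          new    = direct `Set.difference` seen
      in Set.foldr (flip go) (seen `Set.union` new) new
```

<p>Progress on the refactoring is then simply the number of modules updated over the size of this set.</p>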
<p>If we take this good idea to its logical conclusion, we end up with a model in which the codebase is represented as a purely functional data type. (In fact, the refactoring algorithm I gave above might remind you of how a functional data structure like a tree gets “modified”—we produce a new tree and the old tree sticks around, immutable, as long as we keep a reference to it.) So in this model, we never modify a definition in place, causing other code to break. When we modify some code, we are creating a new version, referenced by no one. It is up to us to then propagate that change to the transitive dependents of the old code.</p>
<p>This is the model adopted by Unison. All terms, types, and type declarations are uniquely identified by a nameless, content-based hash. In the editor, when you reference the symbol <code>identity</code>, you immediately resolve that to some hash, and it is the <em>hash</em>, not the name, which is stored in the syntax tree. The hash will always and forever reference the same term. We can create new terms, perhaps even based on the old term, but these will have different content and hence different hashes. We can change the name associated with a hash, but that just affects how the term is displayed, not how it behaves! And if we call something else <code>identity</code> (there’s no restriction of name uniqueness), all references continue to point to the previous definition. Refactoring is hence a purely functional transformation from one codebase to another.</p>
<p><em>Aside:</em> One lovely consequence of this model is that incremental typechecking is trivial. Since definitions never change, we can cache the type associated with a hash, and just look it up when doing typechecking. Simple!</p>
<p>It’s a very pretty model, but it raises questions. We <em>do</em> sometimes want to make changes to some aspect of the codebase and its transitive dependents. Alright, so we aren’t going to literally modify the code <em>in place</em>, but we do still need to have a convenient way of creating a whole bunch of related new definitions, based in part on the old definitions. How do we do that?</p>
<p>What an interesting problem! I’ve talked to several people about it, and there’s also some interesting research on this sort of thing. Though I’m not sure what the exact form will take, it all seems very solvable. Just to sketch out some ideas:</p>
<ul>
<li>Trivial refactorings remain trivial. Renaming is as simple as updating the metadata associated with a hash, in a single location in the code datastore! Modifying a definition in a way that preserves its type (or gives the result a subtype of the old type - for instance going from <code>[Int] -&gt; [Int]</code> to <code>[a] -&gt; [a]</code>) is also trivial. Just find all the places that reference the old hash, and point them to the new hash, transitively.</li>
<li>More interesting refactorings can be organized into highly structured <em>refactoring sessions</em>. As a simple example, suppose you have a function <code>foo x y = blah (g 42) (x+x)</code>. You decide that rather than hardcoding <code>42</code> you’d like to abstract over that value, so your definition becomes <code>foo x y z = blah (g z) ..</code> This change now generates a set of <em>obligations</em>. The editor now walks you through the transitive dependents of <code>foo</code> (out to whatever scope you want), and you have the option of either <em>binding</em> the additional parameter or <em>propagating</em> the parameter out further. Perhaps you do just want to bind 42 everywhere in the existing code, and the extra abstraction is just for some new code you’re about to write. There can be an easy way to do this kind of bulk acceptance. Or perhaps you have some way of conjuring up a number from other types that are usually in scope wherever <code>foo</code> is used. You write a function to do this once, include it as part of the session, and then reuse it (with approval) in lots of places. Of course, the UX for this is all TBD, but the point is that it’s a very structured, guided activity, and there’s lots of opportunities to reuse work. How many times have you worked through a rather mechanical refactoring, doing essentially the same thing over and over, with no real opportunities for reuse due to limitations of applying the refactoring via a process of text munging?</li>
<li>The sessions will themselves be represented as Unison terms, and will have a unique hash. You can start 5 sessions, bookmark them, work on them concurrently, abandon any that grow too big in scope, and so on. Metrics like “remaining transitive updates” and so forth can be computed automatically.</li>
<li>In this world, refactoring becomes something fun, reliable, and automated wherever possible.</li>
</ul>
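<p>Even the trivial type-preserving case has a concrete shape: rewrite references to the old hash into the new one, which gives each dependent new content (and hence a new hash), and then apply the same step to <em>its</em> dependents, and so on outward. Sketched against a toy term type like the one described above (illustrative only, not Unison’s implementation):</p>

```haskell
newtype Hash = Hash String deriving (Eq, Show)

-- A toy term language with references-by-hash.
data Term = Num Int | Ref Hash | App Term Term
  deriving (Eq, Show)

-- One propagation step: point every reference at the new hash.
-- The rewritten dependent has new content, hence a new hash, so the
-- same step is then applied transitively to its own dependents.
replaceRef :: Hash -> Hash -> Term -> Term
replaceRef oldH newH = go
  where
    go (Ref h) | h == oldH = Ref newH
    go (App f x)           = App (go f) (go x)
    go t                   = t
```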
<p>Such exciting possibilities! I look forward to exploring them further, hopefully with help from some of you! When I get through this latest refactoring, I’ll feel like the code is in pretty decent shape and will be releasing it publicly. I look forward to developing Unison more out in the open, in collaboration with other folks who are as inspired by this project as I am.</p>
Thu, 23 Apr 2015 00:00:00 +0000
http://pchiusano.github.io/2015-04-23/unison-update7.html
Unison update 6: refactoring, technical debt, and motivation

<p>No new screencasts to show this week. I’m in the middle of doing some much-needed refactoring. What happened? As of <a href="/2015-03-17/unison-update5.html">the last update</a>, I had a decent <em>expression</em> editor. The missing final piece was adding a declaration layer to the editor, allowing a Unison panel to be edited much like a module in a regular programming language.</p>
<p>Unfortunately, when I started working on this, I found I just couldn’t take another step. All the crappy Elm code I’d written out of a singleminded desire to maintain <em>velocity, velocity, velocity,</em> had finally caught up to me. I’d been ignoring what turned out to be some bad architecture choices I’d made early on and the code was getting uglier and uglier with each feature I added (never a good sign). The thought of adding yet more ugly code to the pile seemed pointless and unmotivating. It was time to pay down the technical debt.</p>
<p>So I’m currently in the process of rewriting parts of the editor. I’ve discovered a nicer way of organizing my Elm code that avoids some of the problems with <a href="https://github.com/evancz/elm-architecture-tutorial">the Elm architecture</a>. I’ll write that up in detail in a separate post.</p>
<p>There’s also one other thing I have planned which is to change the language representation to use <a href="http://semantic-domain.blogspot.co.uk/2015/03/abstract-binding-trees.html">abstract binding trees</a> (ABTs). At the moment, the Unison language has only one binding form, lambda, and I am using manual (untyped) De Bruijn indices. Besides being error prone (I have to remember to weaken variables in all the right places), this doesn’t scale very well to adding other forms of binding like pattern matching and let bindings. ABTs scale well to arbitrary binding forms, and for my use case they are a better fit than a <a href="https://hackage.haskell.org/package/bound">bound</a>-like approach. I’ll also try to do a writeup of this at some point.</p>
<p>Until then, there probably won’t be much (visible) progress, but I hope to have some more interesting updates in a couple weeks.</p>
<h3 id="a-idtechnical-debt-when-does-it-make-sense-to-pay-down-technical-debt"><a id="technical-debt"></a> When does it make sense to pay down technical debt?</h3>
<p>This all got me thinking a bit more about the concept of technical debt. Given the choice of paying back technical debt or adding more features and functionality, which makes the most sense? That depends. In paying back debt, you (definitely) lose some velocity in the short term in exchange for an (uncertain, but presumed) longer-term increase in velocity when the code is in better shape. In effect, you are making an investment in future productivity. Whether this investment is rational very much depends on the <a href="http://en.wikipedia.org/wiki/Time_value_of_money">time value of the new functionality</a>, one’s risk tolerance, and on the expected likelihood that the refactoring will indeed increase future productivity. Best to illustrate with some examples:</p>
<ul>
<li>A startup about to run out of money looking to raise another round of funding very soon may rationally decide to accumulate more technical debt in order to get some features out the door in the window before potential funders decide whether to invest. For them at this time, there is a huge cost to any short-term delays in velocity. They can make investments in future productivity <em>after</em> getting funding, and their doors are still open.</li>
<li>An established company may also decide not to invest in paying down technical debt, because of uncertainty around whether or not these investments will in fact lead to future productivity. A programmer tells you (or you tell yourself) that refactoring or rewriting some code will increase future productivity, but there is always uncertainty around this, typically more so than just building on what’s already there. A decisionmaker with less risk tolerance may reasonably choose the surer bet. They are not exactly being irrational in doing so. As another example, consider the choice between receiving $500 right now or a 60% chance of $2000 a year from now. How about a million dollars now vs a 60% chance of 3 million dollars a year from now? Of course, these choices have different <em>expected</em> values, but also different levels of risk. As in <a href="http://en.wikipedia.org/wiki/Modern_portfolio_theory">modern portfolio theory</a>, there is no concept of <em>the optimal portfolio</em>, only the optimal portfolio <em>for a given level of risk</em>.
<ul>
<li>What tends to happen to very conservative decisionmakers is that they wait until the code has gotten so bad that the uncertainty around whether paying down the debt is justified drops below their risk tolerance. But like other forms of debt, technical debt accumulates interest—it is usually less work to pay it down early, and more work the longer you wait. Thus, perpetually delaying leads to accumulating debt and productivity on the codebase crawls to a standstill. Features that ‘seem simple’ now take weeks, months, or years. Eventually it becomes rational to simply declare bankruptcy (rewrite the code). Projects rarely have the courage or capacity to do this, but eventually a competitor will.</li>
</ul>
</li>
<li>A well-run startup or established business is planning to stick around. Investments in future productivity are thus critical to growth and survival, and any short term accumulations of technical debt will generally be paid quickly, lest they accumulate interest and hamper future productivity.</li>
</ul>
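<p>For the bets in that example, the arithmetic is a one-liner. Here’s a throwaway risk-neutral calculation, with a simple one-period discount thrown in for the time value of money (the 5% rate is an assumption for illustration):</p>

```haskell
-- Expected value today of a bet paying off with probability p one
-- year from now, discounted at rate r; compare against cash in hand.
discountedEV :: Double -> Double -> Double -> Double
discountedEV r p payoff = p * payoff / (1 + r)

-- $500 now vs a 60% chance of $2000 in a year, at a 5% discount rate:
-- the bet has the higher expected value (~$1143 vs $500), but the two
-- choices carry very different risk, which is the whole point above.
```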
<p>Of course, the people making decisions about these things aren’t usually so rational about it. But it’s always good to have a decisionmaking framework that lets you explore aspects of a decision in a more methodical way.</p>
<p>… except it’s not that simple. There’s one factor I’ve completely ignored, and that is programmer motivation. Working on code you know is shitty is demoralizing. I would say that different programmers can tolerate different amounts of technical debt before it substantially affects motivation, but it is definitely true that working on high-quality code is exciting and empowering.</p>
<p>Pretending that engineers are emotionless machine parts that operate with the same efficiency no matter where they are plugged in doesn’t change the reality that programmers are human beings. Even if we were purely interested in the productivity of the overall business, decisions <em>must</em> factor in the happiness and motivation of the people doing the work. When you factor this in, the full costs of accumulating technical debt become more apparent:</p>
<ul>
<li>Programmers lose motivation and productivity drops, well beyond the level merely caused by the technical debt itself.</li>
<li>Hiring and turnover become a problem. Programmers start leaving, and perhaps the company even develops a reputation externally for having a bad codebase. The company may need to pay higher salaries to attract and retain candidates of the same quality.</li>
<li>If you’re an open source project, it becomes impossible to recruit volunteers. Who wants to work on a crappy codebase in their free time? I think this is part of the reason why open source projects tend to be higher quality in many ways—unlike a business, an open source project can’t get away with just paying people money to put up with bad code; they actually have to make the codebase something people will enjoy working on.</li>
</ul>
<p>Also see <a href="/2015-03-26/type-errors.html">I hate type errors!</a>, which talks more about programmer motivation.</p>
Fri, 27 Mar 2015 00:00:00 +0000http://pchiusano.github.io/2015-03-27/unison-update6.html
http://pchiusano.github.io/2015-03-27/unison-update6.htmlI hate type errors!<p>I hate type errors!</p>
<p>Even though the Unison editor isn’t yet suitable for real programming, my experience using it has made me even more aware of just how much time programmers waste dealing with type errors:</p>
<ul>
<li>Let’s say 95% of errors are trivial to fix. You probably don’t even bother reading the message. A quick look at the line number and ten seconds of thinking and you know the fix. Flow is maintained. Your focus stays on what matters—making the actual edits to the code. You’ve made an edit, and the compiler has helpfully pointed out a missed or incorrect step, which you immediately rectify. This is why people like type systems.</li>
<li>In 5% of the cases, the error message(s) are misleading and/or point to the wrong location(s). You take a look, scratch your head. 30 seconds go by; it becomes clear you aren’t going to intuit this one. Now you actually have to gather more information. Okay, maybe you’ll start by reading that error message more carefully. Is the information useful? Maybe. Maybe you try a change, use typed holes to gather info about the types of values in scope, etc. But recognize that you are now in full-on yak-shaving mode. You started off by conceiving of some edits to your code. You then made these edits, but now look—you aren’t productively fixing your mistakes or providing missing steps, you are engaged in <em>a very inefficient search process to narrow down and understand what the problems even are.</em></li>
</ul>
<p>Note that if we deferred our typing errors until runtime as in a dynamic language, we wouldn’t have to spend <em>any</em> time deciphering type errors, and the runtime errors would now be in terms of actual program values that we can inspect, rather than a possibly inaccurate <em>symbolic description</em> of the problem. Of course I feel this is throwing the baby out with the bath water—static types are overall a huge win especially in languages like Haskell with nice type systems. But I point this out because I think it’s important to develop some perspective.</p>
<p>When you start out doing typeful programming, perhaps that 95% is more like 60%. A very large number of error messages you get from the compiler are frustratingly opaque. What the hell is the compiler even talking about? Some people in this situation get frustrated enough that they leave the typeful programming world in favor of some dynamic language. But if you stick with it, over time, with experience, you get better at intuiting what the issues are or avoiding them in the first place, and the 60% creeps up to 90%, 95%, 96%, 97%… But here is the problem: in getting to this point, you have brought the cost of deciphering errors down to acceptable levels but also lost perspective on the cumulative costs. The past costs are forgotten and the present costs are ingrained in your workflow and thinking, a productivity tax you don’t even really notice or think about.</p>
<p><em>Aside:</em> I consider this a problem with our industry and education system. We are raising generations of programmers implicitly taught to accept everything as a <em>given</em>, no matter how arcane or costly. But in software, nothing is a given, every single aspect of the software world is the result of <a href="http://www.gurteen.com/gurteen/gurteen.nsf/id/no-smarter-than-you"><em>choices</em> made by people “no smarter than you”</a>.</p>
<p>But even for people with experience, the costs of deciphering type errors are much higher than that 5% would indicate. The issue is not the <em>direct</em> costs due to time wasted deciphering type errors, it’s the <em>indirect</em> costs. Programming is an activity that demands focus. When focus is maintained, it is possible to accomplish tasks in hours that might otherwise take weeks. Maintaining a state of flow for longer periods of time is an almost trancelike experience, and incredibly empowering. One feels attuned to the tasks at hand, subtasks get pushed and popped from the stack, and there is a feeling of directness, control, and creative energy.</p>
<p>But maintaining this level of focus is rare. We don’t like to talk about it, but if we are being appropriately humble about our limitations, we must recognize that human focus is a very scarce and very fragile resource. Our puny little brains can only allocate so many resources, and we programmers are constantly losing focus. I don’t mean obvious distractions like wasting time checking Twitter or Reddit when you should be doing something. These things are often the symptom, rather than the disease. What I mean is the act of doing things that <em>aren’t the actions most likely to be most productive at furthering progress</em>. Everyone knows what that feels like. Perhaps you are debugging, and rather than methodically narrowing down the problem via a set of well-chosen experiments each giving maximal information, you instead meander around, perhaps adding random print statements or inspecting random values in the debugger. When you should step back and go for a walk perhaps, you instead stare at the screen. You’ve lost focus, but before you realize it and do something productive about it, you’ve lost an hour.</p>
<p>Deciphering type errors leads to exactly this same sort of inefficient, meandering, unfocused mode of work, perhaps on a smaller scale. But the costs are still there, and they are substantial, if you pay attention.</p>
<p>Also see: <a href="/2014-09-30/punchcard-era.html">Why are we still programming like it’s the punchcard era?</a></p>
Thu, 26 Mar 2015 00:00:00 +0000http://pchiusano.github.io/2015-03-26/type-errors.html
http://pchiusano.github.io/2015-03-26/type-errors.htmlUnison update 5&#58; a better spreadsheet<p>Here’s a video of the latest progress. In this video, I create the term <code>view reactive (1 + 1)</code>, which displays in the editor as <code>2</code>. This might not seem like much, but the ability to define reactive views like this is the first step in allowing Unison panels to be used much like a spreadsheet, where the user fills in values and other parts of the panel are updated in response. There’s also a few other things shown which I’ll talk through below:</p>
<iframe src="/resources/unison/unison-update5-movie.html" width="485" height="520" frameborder="0" webkitallowfullscreen="" mozallowfullscreen="" allowfullscreen=""></iframe>
<h3 id="whats-going-on">What’s going on?</h3>
<p>I’ve added a number of builtin functions to the editor. The video makes use of the following functions:</p>
<pre><code class="language-Haskell">data View a -- builtin data type
view : View a -&gt; a -&gt; a
source : View a
reactive : View a
-- of course, there can be many other View values
</code></pre>
<p>The <code>view</code> function evaluates to its second argument at runtime, but its presence can be used to annotate the syntax tree to override how subtrees are rendered. The first argument is of type <code>View a</code> (capital ‘V’), and the second argument is of type <code>a</code>. You’ll notice in the video that after the <code>1 + 1</code> has been filled in for the second argument, the admissible type for the first argument is now <code>View Number</code>.</p>
<p>The editor interprets <code>view source blah</code> as the usual source rendering of <code>blah</code>. It interprets <code>view reactive blah</code> in the same way, but first evaluates <code>blah</code> to normal form. Also shown in the video:</p>
<ul>
<li>We can toggle between the ‘raw’ and interpreted view by typing ‘v’ when a node is highlighted.</li>
<li>Ill-typed terms are still shown in the explorer, along with their types, but are unselectable. You can see this when I tried to write <code>swatch</code>, having type <code>View Color</code>, after the context demanded a <code>View Number</code>.</li>
</ul>
<h3 id="a-better-spreadsheet-and-a-better-application-framework">A better spreadsheet, and a better application framework</h3>
<p>Now I’ll make a controversial claim, which is that the ability to define ‘reactive’ values like this puts us on our way to making the Unison editor a richer, more powerful replacement for spreadsheets. We’re missing a few things, namely a declaration layer, which lets us define and edit more than just a single expression, and also a richer layout library. But with these things in place, this reactivity gives us many of the tools we need to reproduce modern ‘applications’ with a Unison panel. Obviously, at this early stage, the Excel and Google Sheets developers shouldn’t be too worried, but with time, the ability of Unison to replicate the functionality of these applications will grow quite rapidly.</p>
<p><a id="why-spreadsheets"></a>Programmers like to scoff at spreadsheets, but they’re extremely popular among nonprogrammers. Why is that? Here are just a few reasons:</p>
<ul>
<li>Spreadsheets have a largely nonsymbolic representation of programs. The only symbolic representation used is the simple expression language, which people can easily learn, as it consists only of constants and function application. Of course, every spreadsheet program has some ad hoc language for defining new formulas, but that’s outside the usual spreadsheet model and gets used very rarely.</li>
<li>We can think of the spreadsheet as a limited kind of application-building toolkit. It supports a form of interactivity, in that we may select a location for editing, and updates occur in response to these edits (conceptually no different than any other application, be it Photoshop or Twitter). There’s no separate step of ‘running’ the program and we’re encouraged to consider the UI as just a way of viewing a program, rather than a separate artifact <em>produced by</em> a program. See <a href="/2014-11-13/program-as-ui.html">viewing a program as a UI</a>.</li>
<li>The user can easily construct functions that manipulate sequences in an inductive fashion, considering only concrete inputs. For instance, the cell <code>A2</code> takes the value <code>= A1 * A3 + B$7</code>, and the user drags to extend the definition to build a whole sequence. Of course, a programmer has no trouble generalizing this sort of thing and just writing a step function that uses symbolic inputs, but it requires less learning to do things the spreadsheet way.</li>
<li>Lastly, and this might seem dumb, but a 2D grid as a template is actually a surprisingly nice starting point for a lot of computing tasks and layouts. In terms of approachability, it is much easier to modify an existing template than it is to start with a blank canvas and create entirely new content. A blank canvas requires that the user <em>have a vision</em> of what they want to create, which they execute on by understanding the means of abstraction and means of combination. In comparison, the ‘modify a template’ mode of spreadsheets lets the user get away with just reacting to what’s on the screen. Bret Victor talks about some of this in <a href="http://worrydream.com/LearnableProgramming/">Learnable Programming</a>. I won’t argue here that one modality is inherently better than the other, but part of the appeal of spreadsheets is they do support the ‘create by reacting’ modality better than traditional programming.</li>
</ul>
<p>In conjunction with the above points, we also have the network effects of large numbers of people using spreadsheets for similar tasks. A culture develops around the tool, such that spreadsheets become “the way things are done”. This is something that often happens in the evolution of technology. A technology takes hold because it has features that give it a smaller learning curve and wider appeal. Possible alternatives are displaced, but very often, the winning technology has very poor characteristics <em>beyond those that give it an advantage in growing adoption.</em></p>
<p>As an example, in the finance industry where I used to work, spreadsheets get used for <em>everything</em> and often grow horribly complex. The problem is that while spreadsheets are approachable and require very little learning curve, they’re also quite crippled:</p>
<ul>
<li>They are entirely first-order. There is basically no support for abstraction (though all spreadsheet programs have some ad hoc language for defining new formulas), and certainly no ability to define higher-order functions. Without the ability to define new abstractions, it’s impossible to manage complexity. The way people reuse spreadsheets is largely by copying, pasting, and modifying. Naturally, spreadsheet apps have grown some ad hoc ways of linking between spreadsheets, but this is all a very poor substitute for a real programming language.</li>
<li>The forms of interactivity spreadsheets do support are too limited. Why can’t I render a numeric cell as a slider that I can move back and forth to see things update instantly, for instance? And even when some form of interactivity is supported, it’s usually in some ad hoc, unguessable way, rather than something unified and obvious.</li>
<li>The 2D grid which seems like a helpful starting point actually becomes quite annoying for more complex spreadsheets. I’m just declaring a collection of variables, why the heck do I have to worry about where they are positioned? Why the heck am I referring to values by <em>position</em> rather than by some meaningful name?? Also, the use of a grid often leads to lots of futzing around to deal with unwanted interaction between row heights and column widths for logically unrelated parts of the layout.</li>
</ul>
<p>With the Unison editor model, we can cover all the advantages of spreadsheets while providing a much more powerful programming model, and a much richer layout library. I’ll talk more about where this is going in a later post.</p>
<h3 id="technical-note-avoiding-the-need-for-impredicative-instantiation-when-searching-for-functions">Technical note: avoiding the need for impredicative instantiation when searching for functions</h3>
<p>As I mentioned in <a href="/2015-02-23/unison-update3.html">update 3</a>, the type shown in bold when the explorer is opened is the <em>admissible type</em>, which must be a subtype of the type of whatever value the user fills in via the explorer. So of course <code>Number</code> is a subtype of <code>Number</code>, and <code>forall a . a -&gt; [a]</code> is a subtype of <code>forall a . a -&gt; [a]</code>, but also <code>forall a . a</code> is a subtype of <code>Number</code>. When the explorer pops up, and in response to the user typing, we need to find all terms in the store whose types are supertypes of the admissible type, and it’s actually important that type-based search be <em>perfectly accurate</em>. Unlike Hoogle, we can’t get away with returning some ill-typed results that match some processed version of the admissible type (this would allow creation of ill-typed expressions), and we also can’t get away with an overly strict matcher that elides some results that do match (this could result in the user getting stuck, unable to fill in a value that should typecheck).</p>
<p>The problem is that naively doing a subtyping check is going to be overly strict. The admissible type <code>forall a . a</code> (shown in the editor as just <code>a</code>, implicitly quantified) is only a subtype of <code>forall a . a -&gt; a</code> (the type of the identity function) if we allow impredicative instantiation of the type variable <code>a</code>. Yikes! Rather than open that can of worms, I chose to sidestep the problem. Conceptually, the admissible type is quite useful to display to the user as it lets them know what sort of terms are valid replacements for the node they’ve selected. So the explorer can work in two phases—first, we use a processed version of it as a search query, with the constraint that the search algorithm must <em>at least</em> return any terms that could be supertypes of the admissible type, even at some impredicative instantiation. This is very easy.</p>
<p>Once we have this conservative upper bound on the set of possible matches, we then try replacing each result in the expression being edited, ensuring that the result is well-typed. The replacement in context doesn’t require any impredicativity, it’s just normal typechecking. Anything passing this second level of checks is a valid replacement, and can be shown to the user as a completion in the explorer. So long as the first check phase cuts down the search space to a reasonable number of possible matches, these checks can be done very quickly.</p>
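<p>To make the two-phase shape concrete, here’s a toy sketch in Haskell. To be clear, this is my illustration rather than the actual Unison implementation: the <code>Ty</code> representation, <code>mayMatch</code>, and the typechecking oracle are all simplified stand-ins.</p>
<pre><code class="language-Haskell">data Ty = TNum | TBound | TArr Ty Ty | TForall Ty
  deriving (Eq, Show)

-- Phase 1: a cheap, conservative filter. It may admit false positives,
-- but it must never reject a term that could genuinely match, even at
-- some impredicative instantiation. Quantified types act as wildcards.
mayMatch :: Ty -&gt; Ty -&gt; Bool
mayMatch (TForall _) _ = True
mayMatch _ (TForall _) = True
mayMatch (TArr a b) (TArr c d) = mayMatch a c &amp;&amp; mayMatch b d
mayMatch a b = a == b

-- Phase 2: splice each surviving candidate into the expression being
-- edited and run the ordinary typechecker. That check depends on the
-- whole expression context, so it is passed in as an oracle here.
search :: (name -&gt; Bool) -&gt; Ty -&gt; [(name, Ty)] -&gt; [name]
search typechecksInContext admissible store =
  [ n | (n, ty) &lt;- store
      , mayMatch admissible ty        -- conservative narrowing
      , typechecksInContext n ]       -- precise replacement check
</code></pre>
<p>The only real contract is that phase one never drops a genuine match; phase two then restores full precision.</p>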
Tue, 17 Mar 2015 00:00:00 +0000http://pchiusano.github.io/2015-03-17/unison-update5.html
http://pchiusano.github.io/2015-03-17/unison-update5.htmlUnison update 4&#58; more editor interactions<p>Here’s a video of the latest progress. Watch me write the expression <code>1 + 1</code>, then evaluate it!! Further explanation below.</p>
<iframe src="/resources/unison/unison-update4-movie.html" width="420" height="570" frameborder="0" webkitallowfullscreen="" mozallowfullscreen="" allowfullscreen=""></iframe>
<p>There are a number of additional editor interactions shown here:</p>
<ul>
<li>In the explorer, typing a space followed by an operator accepts the current completion and opens a fresh explorer with the operator prefilled. You can see this in the video, where <code>1 +</code> becomes <code>(+) 1 _</code>.</li>
<li>In the explorer, typing two spaces accepts the current completion and moves the explorer to the right. You can see this in the video.</li>
<li>Typing ‘s’ <em>steps</em> (linking + 1 beta reduction) the selected expression.</li>
<li>Typing ‘e’ <em>evaluates</em> the selected expression (to weak head normal form).</li>
<li>Global search is now enabled by default in the explorer. In the video, I look up the <code>identity</code> function.</li>
<li>Notice that the completions are aware of the admissible type. When we go to fill in the second argument to <code>+</code>, we see the fully saturated version of <code>(+)</code>, applied to two arguments, <code>_ + _</code>.</li>
</ul>
<p>Not shown:</p>
<ul>
<li>Typing ‘a’ applies a function to the selected expression, so <code>42</code> becomes <code>_ 42</code>, and the focus moves to the <code>_</code>.</li>
<li>Typing ‘r’ does one eta-reduction, so <code>x -&gt; f x</code> becomes <code>f</code>.</li>
<li>Terms which are ill-typed but which match the search query are displayed in the explorer along with their types (but are unselectable of course).</li>
<li>Builtin functions for concatenating strings.</li>
<li>Stepping works fine under lambdas. For instance, we can simplify the <code>1 + 1</code> in <code>x -&gt; f (1 + 1)</code> to <code>x -&gt; f 2</code>.</li>
</ul>
<p>I think there’s a lot more work that could be done to improve fluidity, but already, it feels pretty nice!</p>
<h4 id="a-idremarks-remarks"><a id="remarks"></a> Remarks</h4>
<p>When the explorer shows terms which are ill-typed but which match the query, this is the closest we get to a ‘compile error’ when writing code in Unison. When your compiler is run in batch mode, the question of what information to display in the compile errors is very important. What if we miss giving some information that’s relevant to fixing the error? Thus compilers like GHC will often dump out a screenful of text for even very simple errors. There are also entire lines of research devoted to the question of how to minimize error messages, make them maximally informative (see type error slicing), and so on.</p>
<p>In Unison, we don’t need to decide what information is relevant in an “error message” because the user can explore their program interactively. You are shown the type of your search result, the admissible type, the current type, and the types of all local variables, which is often enough to make it clear how to modify your code. But if you desire more information, you can navigate around to get the types of any related expressions you care about, and this can be done very quickly. The question of ‘what information to display in the compile error’ becomes irrelevant! There are no compile errors, and the user can extract the information they deem relevant <em>interactively</em>.</p>
<p>Here’s a question: should the name of the identity function be <code>id</code> or <code>identity</code>? Maybe you have some opinions about this. In Unison, we can actually store <em>both</em> names in the metadata associated with the term, and the search will return a term if any of the names match the search query. Regardless of our choice, it’s not an additional <em>typing</em> burden to have a longer name—the programmer can type <code>id</code> followed by two spaces to accept <code>identity</code> (or <code>id</code>, <code>&lt;enter&gt;</code>, etc). Name resolution is always type-directed, and the user specifies the minimum information necessary to disambiguate.</p>
<p>In working on Unison, I’ve found it’s very interesting to think about the distinction between a <em>language feature</em> vs an <em>editor feature</em>. We often conflate the two. The syntax for “imports”, name resolution, etc, are all considered part of the language. In a semantic editor, many of these concerns are better dealt with by the editor, which can incidentally be much more flexible (allowing multiple names for a given term is just one instance of this). Raw text program editing has been holding us back!</p>
<p>That’s all for now, more updates soon!</p>
Wed, 04 Mar 2015 00:00:00 +0000http://pchiusano.github.io/2015-03-04/unison-update4.html
http://pchiusano.github.io/2015-03-04/unison-update4.htmlKeep your data types dumb, layer on properties after the fact<p>When programming in a language with a nice type system, you often have the option of defining ‘smart’ data types which bake some invariant in as an index of the data type. But sometimes, it can be better to keep your data types dumb, and layer on invariants <em>after the fact</em>. The lesson generalizes, but I’ll show an example—well-formed lambda calculus terms:</p>
<pre><code class="language-Haskell">data Slot a = Bound | Free a -- equivalent to `Maybe`

data Expr a -- `a` is the type of free variables in this expression
  = Unit
  | Ap (Expr a) (Expr a)
  | Lam (Expr (Slot a)) -- this is key
  | Var a

var1 :: Expr (Slot a)
var1 = Var Bound

identity :: Expr a
identity = Lam var1
</code></pre>
<p>This is an example of the ‘baked in’ variety of data type. <code>Expr a</code> is polymorphically recursive—when we descend into the <code>Lam</code> constructor, the type of free variables is wrapped in another <code>Slot</code>.</p>
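<p>To see the polymorphic recursion in action, here’s a small self-contained example (repeating the definitions above). Under two nested lambdas the free-variable type gets wrapped in two <code>Slot</code>s, so <code>Free Bound</code> unambiguously names the <em>outer</em> binder:</p>
<pre><code class="language-Haskell">data Slot a = Bound | Free a deriving Show

data Expr a
  = Unit
  | Ap (Expr a) (Expr a)
  | Lam (Expr (Slot a))
  | Var a
  deriving Show

-- \x -&gt; \y -&gt; x: the inner body has type `Expr (Slot (Slot a))`,
-- and `Free Bound` refers to `x`, the outer binder.
constFn :: Expr a
constFn = Lam (Lam (Var (Free Bound)))
</code></pre>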
<p>What’s nice about this approach is that whether an expression has free variables, and what their ‘depth’ is, is now information tracked in the type. Functions like capture-avoiding substitution can be given the very precise signature:</p>
<pre><code class="language-Haskell">bind :: Expr a -&gt; (a -&gt; Expr b) -&gt; Expr b -- Yep, `Expr` is also a monad!
</code></pre>
<p>Short of doing something extremely silly like duplicating a random subterm, it is almost impossible to implement this function incorrectly in a way that typechecks.</p>
<p><strong>Aside:</strong> We can give <code>Monad Expr</code>, with <code>return = Var</code>, and <code>(&gt;&gt;=) = bind</code>! See <a href="https://hackage.haskell.org/package/bound">the bound library</a>.</p>
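<p>For the curious, here’s one way <code>bind</code> can be implemented against the definitions above. This is my sketch, not code from the Unison repository (the <code>Functor</code> instances are derived for brevity, and the bound library generalizes the same idea considerably):</p>
<pre><code class="language-Haskell">{-# LANGUAGE DeriveFunctor #-}

data Slot a = Bound | Free a deriving (Functor, Show)

data Expr a
  = Unit
  | Ap (Expr a) (Expr a)
  | Lam (Expr (Slot a))
  | Var a
  deriving (Functor, Show)

-- Substitute an expression for each free variable, weakening as we
-- pass under a binder so that no variable capture can occur.
bind :: Expr a -&gt; (a -&gt; Expr b) -&gt; Expr b
bind e f = case e of
  Unit     -&gt; Unit
  Var a    -&gt; f a                  -- replace the free variable
  Ap l r   -&gt; Ap (bind l f) (bind r f)
  Lam body -&gt; Lam (bind body f')   -- descend under the binder
  where
    f' Bound    = Var Bound            -- the bound variable stays put
    f' (Free a) = fmap Free (f a)      -- weaken the substituted term
</code></pre>
<p>Note how the <code>Lam</code> case is forced to say what happens to <code>Bound</code>; this is exactly where an incorrect implementation would fail to typecheck.</p>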
<p>Sounds great, right? We get some very strong static guarantees and lots of help from the typechecker in writing our code. We can’t, for instance, accidentally introduce a subterm with an ill-formed free variable reference. </p>
<p>But the situation is more nuanced. Whenever we bake some property into a data type, we’re now obligated to articulate to the compiler precisely how this property is affected <em>by all functions which manipulate the data type</em>. Sometimes these proofs will be easy (for instance, <code>bind</code> is quite natural, the ‘proof’ just falls out of the obvious implementation); in other cases, the proof may require serious contortions or restructuring of our programs. Sometimes, the simplest implementation of a function is one which temporarily breaks an invariant, then restores it, but these implementations can become <em>more painful</em> when the invariant is baked in. Are the benefits worth it? The answer: <em>it depends</em>. How likely are the bugs being prevented by additional typing, what is the expected cost of finding and fixing such bugs, and to what extent is the extra typing information helpful in guiding development?</p>
<p><strong>Aside:</strong> To be clear, I am very glad that people are working on making such proofs more convenient to specify, and I encourage this good work!!</p>
<h4 id="the-best-of-both-worlds">The best of both worlds</h4>
<p>Rather than making an ‘all or nothing’ decision about whether to track typing information, we can sometimes get away with using a dumb underlying representation, with a <em>typed wrapper</em>:</p>
<pre><code class="language-Haskell">data Expr
  = Unit
  | Ap Expr Expr
  | Lam Expr
  | Var Int -- de Bruijn index

newtype Scoped a = Scoped { unscope :: Expr } -- constructor hidden

-- Only a closed expression may be promoted to `Scoped`
scoped :: Expr -&gt; Maybe (Scoped a)
scoped = ...

var1 :: Scoped (Slot a)
var1 = Scoped (Var 1)

app :: Scoped a -&gt; Scoped a -&gt; Scoped a
app (Scoped f) (Scoped x) = Scoped (Ap f x)

lam :: Scoped (Slot a) -&gt; Scoped a
lam (Scoped body) = Scoped (Lam body)

weaken :: Scoped a -&gt; Scoped (Slot a)
weaken (Scoped s) = Scoped (go 1 s)
  where
    go depth e = case e of
      Var v -&gt; if v &gt;= depth then Var (v + 1) else e
      Ap f arg -&gt; Ap (go depth f) (go depth arg)
      Lam body -&gt; Lam (go (depth + 1) body)
      Unit -&gt; Unit
-- etc
</code></pre>
<p>We can use view patterns and pattern synonyms to recover most or all of the syntax from the previous ‘baked in’ data type.</p>
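<p>Here’s a hedged sketch of what that recovery might look like, using a bidirectional pattern synonym (the definitions are repeated, slightly simplified, so the snippet stands alone; names like <code>SAp</code> are mine, purely for illustration):</p>
<pre><code class="language-Haskell">{-# LANGUAGE PatternSynonyms, ViewPatterns #-}

data Expr = Unit | Ap Expr Expr | Lam Expr | Var Int
  deriving (Eq, Show)

newtype Scoped a = Scoped { unscope :: Expr }
  deriving (Eq, Show)

-- `SAp f x` builds a well-scoped application, and also deconstructs
-- one, preserving the phantom scope index `a` in both directions.
pattern SAp :: Scoped a -&gt; Scoped a -&gt; Scoped a
pattern SAp f x &lt;- Scoped (Ap (Scoped -&gt; f) (Scoped -&gt; x)) where
  SAp (Scoped f) (Scoped x) = Scoped (Ap f x)

pattern SUnit :: Scoped a
pattern SUnit = Scoped Unit
</code></pre>
<p>If the <code>Scoped</code> constructor is hidden behind a module boundary and only the pattern synonyms are exported, client code gets the typed syntax while the underlying representation stays dumb.</p>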
<p>The bottom line: we don’t have to choose all or nothing. For usages where we want, and are willing to pay the cost of, static guarantees, we can use the indexed version of our type, <code>Scoped</code>. For places where we decide this isn’t worth it, we can continue to use the raw, dumb data type.</p>
Wed, 25 Feb 2015 00:00:00 +0000http://pchiusano.github.io/2015-02-25/dumb-data.html
http://pchiusano.github.io/2015-02-25/dumb-data.htmlUnison update 3&colon; connecting the editor to the node (continued)<p>I didn’t get a chance to put together a post on Friday, but I made some decent progress. Here’s a short recording of an editing session:</p>
<iframe src="/resources/unison/unison-explorer.html" width="530" height="446" frameborder="0" webkitallowfullscreen="" mozallowfullscreen="" allowfullscreen=""></iframe>
<p>A few notes about what’s happening here:</p>
<ul>
<li>This is the same editor we’ve seen before, but now, a location in the tree can be “opened” for editing. This brings up what I’m calling the “explorer”.</li>
<li>When a location is opened, we get some important information, fetched asynchronously from the Unison backend (you might notice the small blip after the explorer comes up, but before it gets all this information):
<ul>
<li>The <em>admissible</em> type, which appears at the top of the box in bold in the video. More on this below.</li>
<li>The <em>current</em> type, which is simply the type of the node selected. The little grey square is a symbol which represents the current location (we don’t repeat the rendering of it in the explorer since it could be a huge tree), so the annotation <code>□ : Number</code> indicates that the current location has the type <code>Number</code>.</li>
<li>The types of any local variables in scope at the edit location.</li>
<li>Any well-typed <em>completions</em>, which as the video shows can be filtered down by typing in the searchbox. If there are no valid completions that match what the user has typed, the explorer outline turns orange.</li>
</ul>
</li>
<li>I’m not currently pulling any names in, so the names of type variables aren’t very meaningful.</li>
</ul>
<p>Conceptually, the explorer is just providing a very structured way of selecting a replacement value for the current edit location. But it is restricted to choosing values which ensure the resulting expression is still well-typed. This is where the <em>admissible</em> type comes into play. The admissible type must be a subtype of the type of any chosen replacement, so for instance <code>forall a . a -&gt; a</code> is a subtype of <code>Int -&gt; Int</code> (any function expecting an <code>Int -&gt; Int</code> may safely be given a <code>forall a . a -&gt; a</code>). This is not ‘subtyping’ in the OO sense, it’s the subtyping relationship induced by use of quantifiers.</p>
<p>The current type and the admissible type can coincide, though it will always be the case that the admissible type is a subtype of the current type. But, for instance, when replacing the body of the lambda <code>x -&gt; 42</code>, since nothing else constrains what the lambda must return, the admissible type is <code>forall a . a</code> (or unconstrained). If we had a type annotation that was being pushed down into our expression, or if our lambda were named and used elsewhere in a way that constrained its output type, that information would be reflected in the admissible type.</p>
<h4 id="a-idimport-boilerplate-some-notes-on-global-search-and-import-boilerplate"><a id="import-boilerplate"></a> Some notes on global search and import boilerplate</h4>
<p><a href="/2015-02-13/unison-update2.html">Last week</a> I gave a proposed interaction for how to search for terms at more global scope. I suggested that if the explorer is open, adding a <code>?</code> to the front of the search string will widen the search to include global scope. And we could imagine having some other syntax to control the scope of the search: say, restricting it to only terms defined in the current <em>panel</em>, or to some <em>package</em>, where a package is just some metadata which points to a bag of terms.</p>
<p>While it might be nice to have the ability to define the scope more precisely, I actually think it’s a better user experience to just make the default scope include <em>everything</em>, and simply <em>display</em> any relevant metadata the user might need to make their selection (or further refine the search). The reason I prefer this is that it’s less work for the user in the cases where the context is sufficiently discriminatory, which is often the case in typeful programming. That is, if the context is sufficiently type-constrained, we can just give the user all results, from all scopes, and free the user from having to decide what rather arbitrary ‘bin’ to start their search in. And if there are too many results, we can order the results intelligently, indicate there are more results to be had, and put the user in charge of refining their search <em>now that it has proven necessary</em>. Of course, the user can still narrow their search to whatever scope up front, it’s just this isn’t the default.</p>
<p>That’s all for now, more updates later this week.</p>
Mon, 23 Feb 2015 00:00:00 +0000http://pchiusano.github.io/2015-02-23/unison-update3.html
http://pchiusano.github.io/2015-02-23/unison-update3.htmlUnison update 2&colon; connecting the editor to the node<p>As I mentioned in <a href="/2015-01-30/unison-update0.html">week 0</a>, the Unison node is written in Haskell and has an implementation of the Unison language, its typechecker, and any primitive functions. It exposes an API over HTTP. <a href="/2015-02-06/unison-update1.html">Last week</a>, I worked on some of the Unison editor interactions, but I was working somewhat in a vaccum since the editor wasn’t yet connected to the Unison node. I spent this week actually getting the editor and the node talking to each other, and refining the node API a bit in the process.</p>
<p>Sadly, I don’t have a demo to show. At one point, I had a round trip successfully working, with the editor getting live info from the typechecker, but after that I decided to rework some of the node API to better facilitate the interactions I wanted. I’ve just completed that work today but haven’t hooked the changes back up to the editor. I hope to have something more compelling to show next week.</p>
<p>One of the questions I was thinking through on the node API is how to handle search, and at what scope to do this search. In Unison, terms are uniquely identified by a nameless hash (more on that in a later post) and don’t live in packages or modules. Instead, we may search for terms in various ways, including by “name”. A term may have multiple names, and different nodes may refer to the same hash with different names, as names are just metadata stored separately. (A “package” could still be a concept, but again, it’s just metadata that points to a collection of hashes.) This raises interesting questions, though: if the user has clicked on a node to edit it, what completions should they be shown? Here are some possible answers:</p>
<ul>
<li>All possible terms that typecheck and match the search query (could be a huge set)</li>
<li>Only ‘local’ terms that typecheck and match the search query (requires an explicit imports action)</li>
<li>Something in between</li>
</ul>
<p>In raw text editing, it’s necessary to disambiguate references with explicit imports. Some IDEs will automate this for you, but the model is the same; the presentation of the program must disambiguate user intent, leading to 50 lines of imports at the top of every “file”, where it’s often arbitrary how the codebase is split up into files. In a semantic editor, we can dispense with the ceremony of imports and instead let the user disambiguate at the point of reference. Here’s a very simple proposed interaction:</p>
<ul>
<li>Upon opening a node to edit, we search local scope based on the search query.</li>
<li>If the search query starts with <code>"?"</code>, we include global scope as well. If the number of matches is too high, we display only the count.</li>
</ul>
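<p>To make that interaction concrete, here’s a minimal sketch of the scope rule in TypeScript (the editor itself is written in Elm; all the names and the result threshold here are invented for illustration):</p>

```typescript
// Either a concrete list of completions, or — when a global search
// returns too many hits — just the count, so the user can refine.
type SearchResult =
  | { kind: "matches"; terms: string[] }
  | { kind: "count"; total: number };

const MAX_RESULTS = 20; // hypothetical threshold

function complete(query: string, local: string[], global: string[]): SearchResult {
  const globalSearch = query.startsWith("?"); // "?" prefix widens the scope
  const q = globalSearch ? query.slice(1) : query;
  const pool = globalSearch ? local.concat(global) : local;
  const hits = pool.filter(name => name.startsWith(q));
  if (globalSearch && hits.length > MAX_RESULTS)
    return { kind: "count", total: hits.length };
  return { kind: "matches", terms: hits };
}
```

<p>In the real editor the candidate pool would already be filtered down to terms that typecheck at the edit location, so even the global search stays small.</p>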
<p>Notice that unlike using something like Hoogle “manually”, we don’t have to enter the type, since the editor knows the type of the current edit location and can supply that information for us! Thus, we just focus on specifying whatever metadata we’d like to search by.</p>
<p>There are some further refinements to this basic interaction to improve the ability to quickly refine our search. I’ll discuss these in a later post.</p>
<p>Looking forward to next week!</p>
Fri, 13 Feb 2015 00:00:00 +0000http://pchiusano.github.io/2015-02-13/unison-update2.html
http://pchiusano.github.io/2015-02-13/unison-update2.htmlUnison update 1&colon; support for asynchronous server requests and a basic editing layer<p>I spent this week working on a very basic editing layer for the Unison editor. Previously, I’d implemented logic for <a href="/2014-11-13/program-as-ui.html">how to display a Unison panel</a>, including embedded graphical views, but had no way of actually editing these panels. Time to do something about that!</p>
<p><a href="/resources/unison/editor0.html">DEMO</a> (explanation further down)</p>
<p><strong>Aside:</strong> A Unison <em>panel</em> is what I am calling the editable rendering of a Unison term. It may contain both “source” views and embedded graphics, and also <em>reactivity</em>, where parts of the panel are updated instantly as other parts of the panel are edited (I’ll talk more about this in a later post).</p>
<p>My goal here was to get the basics of this editing story worked out, and also do a bit of tinkering to get a sense of how fluid the semantic editing UX could be. I mentioned <a href="/2015-01-30/unison-update0.html">last week</a> that I’d like to preserve the same <em>feeling</em> of directness and control I have when using a raw text editor like Vim, even if the particulars are going to be quite different. And though I wasn’t expecting to get all the way there in one week, I did want to get to a point where I felt comfortable with moving on to other things.</p>
<h3 id="the-problems-with-what-you-type-is-what-you-get-wytiwyg">The problems with ‘What You Type Is What You Get’ (WYTIWYG)</h3>
<p>In a regular text editor or even the fanciest IDE, the editing cursor is positioned directly in the document, and whatever keys you press (sans modifier keys) get interpreted and inserted directly into the document. For instance, if I want to write the expression:</p>
<pre><code>foo (grr "Alice" 42) (prob qux razzle)
</code></pre>
<p>… I just place the insertion cursor wherever I want in the document and start typing away. I call this the ‘What You Type Is What You Get’ model. This model of editing is so ingrained in most people’s minds that it might not be apparent that <em>it is a choice</em>. In the physical world, if we’re writing on a piece of paper with a pen, well then yes, our “edits” (handwriting) must be directly applied to the document. In the software world, it is a <em>choice</em> whether to mimic this experience for an editor.</p>
<p>And there are <a href="/2014-09-30/punchcard-era.html">a number of problems with using the WYTIWYG editing model for programming</a>:</p>
<ul>
<li>WYTIWYG <em>overspecifies</em>. The user is put in charge of things like inserting parentheses and typing out full identifier names, when <em>in context</em> they should only have to disambiguate among <em>valid possibilities</em>. Even in an IDE, in the WYTIWYG model, the user must distinguish between finishing typing an identifier (which might refer to a nonexistent symbol) and <em>accepting the current completion</em>.</li>
<li>Programs are highly structured. Raw text is completely unstructured. Thus, we have an impedance mismatch—our way of specifying programs contains a lot of junk. Most raw text is not well-formed (does not even parse) and certainly is not well-typed (it makes no sense and is not executable).</li>
<li>This mismatch generates some very hairy problems for compiler authors. When the user enters a program that is junk, we need to provide the user with feedback in the form of <em>parse errors</em> and <em>type errors</em>. Generating good error messages is difficult, tedious, a significant amount of work, and worst of all <em>uninteresting</em>. Of course, it can be improved, but is that really the <a href="http://www.imdb.com/title/tt0107290/quotes?item=qt1464414">right problem to solve?</a> If the existing error messages of most compilers are any indication, it’s a problem people aren’t particularly interested in solving well, either, even if it is possible.</li>
<li>The user experience for editing raw text programs is awful. Tiny changes can generate screenfuls of errors, which must be deciphered by the user. Type errors even in nice compilers like GHC are <a href="https://www.youtube.com/watch?v=rdVqQUOvxSU">like little riddles</a>—“X is different than Y. Now where did the typechecker come up with X and Y? Have fun tracking that down!”. I’ve been doing FP for a while now, in Haskell, Scala, and now Elm. About 95% of type errors I understand and fix in under 30 seconds. About 4% take me a minute or two. And for the remaining 1%, I stare at the error for way too long and start losing flow. I think I understand the problem, and make a change. Another screenful of errors. Flow is now officially broken, and demoralization starts setting in. The other day I actually gave up after staring at an error for close to 15 minutes and <a href="https://groups.google.com/forum/#!searchin/elm-discuss/staring$20at$20this$20type$20error/elm-discuss/3qAG7SLOFK0/9Hssqqk9vZoJ"><em>asked on the Elm mailing list if anyone could explain the error I was seeing</em></a>! If we want to teach children and newcomers about programming, can we really expect them to deal with arcane nonsense like this? Of course not!</li>
</ul>
<p>Our industry is full of apologists for the status quo who wear their ability to deal with this nonsense as some kind of badge of honor or proof of their intelligence. “You n00b, you just have to learn to read type errors better, obviously!” Well, there is essential complexity, and there is accidental complexity caused by making silly assumptions. We are better off applying our intelligence and creativity to dealing with problems that are <em>essentially challenging</em>, not problems that just happen to be so.</p>
<h3 id="progress-on-fluid-semantic-editing">Progress on fluid semantic editing</h3>
<p>Let’s go back to our example:</p>
<pre><code>foo (grr "Alice" 42) (prob qux razzle)
</code></pre>
<p>In a semantic editor, here’s a possible keysequence which reproduces this:</p>
<pre><code>f (grr "Alice" 42) (p q r
</code></pre>
<p>Notice that it looks the same, but we need not spell out all our identifiers, and we don’t bother closing the final paren. If <code>foo</code> is the only valid completion, then <code>&lt;space&gt;</code> can accept this completion, add another argument, and begin editing that argument. Unlike raw text editing, where typing <code>f</code> may refer to some as yet undeclared identifier (which would not typecheck!), in a semantic editor the user simply cannot reference a nonexistent identifier, so there is no need to force the user to disambiguate whether they meant to do that or not.</p>
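<p>A rough sketch of that acceptance rule (hypothetical names; in the real editor the candidate set would come from the typechecker):</p>

```typescript
// `valid` stands in for the completions that typecheck at the edit point.
// Returns the completion that <space> would accept, or null if the typed
// prefix is still ambiguous (or matches nothing).
function acceptOnSpace(typed: string, valid: string[]): string | null {
  const matches = valid.filter(name => name.startsWith(typed));
  // The user can never commit a reference to a nonexistent identifier:
  // <space> only accepts when exactly one valid completion remains.
  return matches.length === 1 ? matches[0] : null;
}
```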
<p>In this model, parentheses are more like navigational cues—they tell the editor where to apply edits but aren’t part of the document. They are displayed only if actually necessary, based on precedence information stored as separate metadata.</p>
<p>The other model I’ve experimented with is simply getting rid of parentheses to control editing entirely. Instead, one uses the arrow keys or <code>&lt;hjkl&gt;</code> to navigate, and there are two actions to accept and continue, one indicated by <code>&lt;space&gt;</code> and another indicated by <code>&lt;shift+space&gt;</code>. Say <code>&lt;space&gt;</code> accepts the current completion and advances to the next <em>sibling</em> of the current node, and <code>&lt;shift+space&gt;</code> accepts the current completion and adds a new <em>child</em> to the current node and moves the editing cursor to <em>that</em>.</p>
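<p>In tree terms, the two moves might look like this (a hypothetical sketch, with the cursor represented as a path of child indices from the root):</p>

```typescript
interface TreeNode { name: string; children: TreeNode[] }
type Path = number[]; // cursor: indices from the root down to the current node

// <space>: accept the completion, then advance to the next sibling.
function nextSibling(path: Path): Path {
  if (path.length === 0) return path; // the root has no siblings
  return [...path.slice(0, -1), path[path.length - 1] + 1];
}

// <shift+space>: accept, add a fresh child under the current node,
// and move the cursor to that new child.
function newChild(tree: TreeNode, path: Path): Path {
  let node = tree;
  for (const i of path) node = node.children[i];
  node.children.push({ name: "", children: [] }); // a new hole to fill in
  return [...path, node.children.length - 1];
}
```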
<p>That might not make much sense, so see <a href="/resources/unison/editor0.html">this demo</a>. Here are some notes:</p>
<ul>
<li>The layout is much like we’ve seen before—layout is computed dynamically based on available width, parens are inserted automatically. Try shrinking the browser window until the expression no longer fits.</li>
<li>There’s a selection model which can be manipulated using the mouse and/or arrow keys. Try hovering over any node and then pressing the up arrow.</li>
<li>There’s also now an <em>editing layer</em>. It’s not hooked up to the typechecker (that comes next week), but it gives the flavor of the interaction. Click on an entry in the list of names <code>["Alice", "Bob"...]</code> (or hit ‘enter’ while a name is selected) and start typing in the box. The box turns red if there are no matching completions. Hitting space accepts the current completion if the box is non-red <em>and advances to the next item in the list</em>. Hitting shift+space accepts the completion, adds a new argument, and moves to editing that argument. No mousing around needed, and with a bit more thought we can add support for insertion (via <code>,</code>) and infix operators, all in a fluid fashion.</li>
</ul>
<p>Another bit I found interesting about this model is that the question of whether Unison is case-sensitive ceases to have real meaning. When editing a node of the panel, the editor can do the search and matching however it chooses, and the resulting selection can be rendered however you choose. In this example, I chose to do case-insensitive search in <a href="/resources/unison/editor0.html">the demo</a>. Put another way, case-sensitivity becomes a property of the <em>search functionality</em> rather than the language itself. Different programmers might reasonably configure or toggle between different search modes, even in the middle of editing their programs!</p>
<p>I haven’t gone too far down the path of exploring how to make the semantic editing experience feel completely fluid, and there are a lot of little details, but I’ve gone far enough that I feel relatively confident that it’s possible and am ready to move onto other things for now. I’ll circle back to this a bit later on.</p>
<h3 id="detour-to-allow-asynchronous-loops-in-elm">Detour to allow asynchronous loops in Elm</h3>
<p>I spent Monday and Tuesday on a minor yak shaving detour to allow definition of cycles in Elm’s <a href="http://package.elm-lang.org/packages/elm-lang/core/1.1.0/Signal">FRP library</a>. <a href="/2015-01-30/unison-update0.html">Unison’s architecture</a> means that there is a lot of back and forth between the editor and the Unison node. Many interactions with the UI have the possibility of generating a web request to fetch information from the node. The result of such requests can affect how subsequent UI interactions are interpreted (for instance, the list of valid completions shown when a node is selected for editing affects how mouse positions and clicks are interpreted). Of course, it is sensible to try to arrange things such that the relevant information is prefetched, but the mere possibility of cycles needs to be accounted for.</p>
<p>Unfortunately, <a href="https://groups.google.com/forum/#!searchin/elm-discuss/http/elm-discuss/hMQTNHVoMeE/klNvRcY_oRMJ">this use case cannot currently be expressed in Elm</a> without major contortions that I wasn’t willing to adopt. There is a new feature planned for Elm 0.15 that looked like it would address my use case, but I didn’t want to block until that was ready, and I also didn’t want to bank on it being exactly what I needed. I also didn’t like the idea of being at the mercy of someone else’s timeline. These three months are extremely valuable to me, and I want to make the most of them! After taking a brief look at what would be involved in writing the feature I needed, it seemed straightforward, so I decided to just go for it and told myself I’d bail and find some other path in 2-3 days. It went… surprisingly well and I packaged up the result as <a href="https://github.com/pchiusano/elm-execute">a new library</a>.</p>
<p>With that in hand, I implemented the following generic function:</p>
<pre><code class="language-Elm">asyncUpdate : (Signal req -&gt; Signal (model -&gt; model))
           -&gt; Signal (model -&gt; (Maybe req, model))
           -&gt; req
           -&gt; model
           -&gt; Signal model
asyncUpdate eval actions req0 model0 = ...
</code></pre>
<p>It feels a bit ad hoc, but it wraps up what I suspect is a pretty common pattern: interpreting a signal of actions that update some model, where some actions may “on the side” generate requests which must be run asynchronously while other actions come in. The first argument, <code>eval</code>, is the signal transformer which (may) issue asynchronous requests and use the results to update the model. But the function has no opinion on whether these are actually web requests, or something faked with a time delay and some pure logic locally. The current demo is running using the following function as the <code>eval</code> argument:</p>
<pre><code class="language-Elm">search : Sink Field.Content -&gt; Signal Request -&gt; Signal (Model -&gt; Model)
search searchbox reqs =
  let containsNocase sub overall = String.contains (String.toLower sub) (String.toLower overall)
      possible = ["Alice", "Alicia", "Bob", "Burt", "Carol", "Carolina", "Dave", "Don", "Eve"]
      matches query = List.filter (containsNocase query) possible
      go _ model = -- our logic is pure, ignore the request
        let possible = matches (Explorer.getInputOr Field.noContent model.explorer).string
        in updateExplorerValues searchbox (List.map Terms.str possible) model
  in Time.delay (200 * Time.millisecond) (Signal.map go reqs)
</code></pre>
<p>Not very exciting, but I can easily swap in something that actually contacts the real Unison node over HTTP. I just have to swap out this one function!</p>
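<p>For readers less at home in Elm, the same pattern can be sketched in TypeScript (a rough analogue, not the actual API): each action updates the model and may optionally emit a request; each request is evaluated asynchronously and yields a further model update.</p>

```typescript
// An action may update the model and optionally emit a request.
type Action<M, R> = (m: M) => [R | null, M];
// Evaluating a request asynchronously yields another model update.
type Eval<M, R> = (req: R) => Promise<(m: M) => M>;

async function asyncUpdate<M, R>(
  evalReq: Eval<M, R>,
  actions: Action<M, R>[],
  model0: M
): Promise<M> {
  let model = model0;
  for (const act of actions) {
    const [req, next] = act(model);
    model = next;
    if (req !== null) {
      // In the editor this would be an HTTP call to the Unison node;
      // here it is just whatever `evalReq` does.
      model = (await evalReq(req))(model);
    }
  }
  return model;
}
```

<p>This sequential loop loses the reactive character of the Signal version (in the editor, responses interleave with later actions rather than blocking them), but it shows the shape of the types involved.</p>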
<h3 id="next-week">Next week</h3>
<p>This post is getting pretty long, so I’ll call it quits here. Next week I’ll be hooking this editing layer up to the real Unison node and typechecker. Stay tuned! As always I welcome your comments.</p>
Fri, 06 Feb 2015 00:00:00 +0000http://pchiusano.github.io/2015-02-06/unison-update1.html
http://pchiusano.github.io/2015-02-06/unison-update1.html