Expressions of Change

I have been working on a project called "Expressions of Change"; I thought the audience of Lambda the Ultimate might find it interesting.

The aim of the project is to improve the tools for constructing ever-changing computer programs by putting the changes themselves at the center of the programming experience. That is: reify changes, and use those reified changes as the main interface across the programming toolchain. Because the project rejects the history-less file as a basic building block, a first implementation step is to build a prototype editor that constructs such primitives of change instead.

I can't see a programmer ever inspecting (and certainly not writing) such fragments. So what's the point of focusing on a language for these changes? The codebase / history query API seems to be the interesting problem. How the information is stored can be hidden and optimized for query performance as is usual with databases.

One other note is that although text doesn't store enough program structure to support the kinds of queries we want, neither do S-exprs. For example, S-exprs don't allow you to query the history of a particular binding. No encoding will be "fully structured" (even if you could enforce type correctness, you can't eliminate semantic programming errors), and so rather than trying to find richer and richer encodings, I think we should rather allow queries of annotations on less structured encodings. A text encoding of programs is fine if we have a powerful way to query it.

Also, have you thought about integrating undo/redo functionality into this system? That's another reason I think it would be nice to have all of the deltas from the project beginning at the finest level (text for a non-structural editor). But then I'd want to be able to distribute a much coarser history rather than sharing my every typo with the world. But this corresponds to "big deltas" that you can't break down into a list of incremental changes.

I most certainly did think about undo/redo. I would propose to take a slightly more indirect approach to those ideas though. Note that (in the representation presented) we could take the s-expressions that represent "notes" (i.e. modifications) and modify such notes, tracking the history using yet another score. In an editor that shows a structure on the right hand side panel, and a mechanism of construction on the left (such as the current prototype) such a recursive application of the ideas from the paper on themselves could be visualized by shifting everything one panel to the right.

In that view, "undo" is simply the [repeated] removal of the last modification. Such an approach also opens up the ability for non-standard mechanisms of undo, i.e. undoing something that's not the last modification.

In this approach, redo requires another level of recursive application of modification-tracking, as the undoing of the last removal of a modification corresponds to a redo.
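A minimal sketch of the undo/redo idea described above, under assumptions of my own: the score is a plain list of notes, and a hypothetical "meta-score" records modifications to that score. The names `play_meta`, `undo` and `redo` and the tuple encoding are illustrative only and are not taken from the paper or the prototype.

```python
from typing import List, Optional, Tuple

# A meta-note is a modification of the score itself (hypothetical encoding):
#   ("append", note)       adds a note to the end of the score
#   ("remove_last", None)  removes the last note of the score
MetaNote = Tuple[str, Optional[str]]


def play_meta(meta_score: List[MetaNote]) -> List[str]:
    """Replay the meta-score to obtain the current score."""
    score: List[str] = []
    for kind, payload in meta_score:
        if kind == "append":
            score.append(payload)
        elif kind == "remove_last":
            if score:
                score.pop()
        else:
            raise ValueError(f"unknown meta-note: {kind}")
    return score


def undo(meta_score: List[MetaNote]) -> List[MetaNote]:
    """Undo = record the removal of the last modification."""
    return meta_score + [("remove_last", None)]


def redo(meta_score: List[MetaNote]) -> List[MetaNote]:
    """Redo = take back the most recent removal.

    In a fully recursive setup this step would itself be recorded in a
    third-level score; here it is applied directly to keep the sketch short.
    """
    if not meta_score or meta_score[-1][0] != "remove_last":
        raise ValueError("nothing to redo")
    return meta_score[:-1]


history = [("append", "insert a"), ("append", "insert b")]
after_undo = undo(history)
after_redo = redo(after_undo)
```

Here `play_meta(after_undo)` yields only the first note, and `play_meta(after_redo)` yields both again. Nothing is ever destroyed by `undo`; the removal is just one more recorded modification.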

Again... making the above practical from a UI perspective is somewhat assumed, but will require considerable work in practice.

I also thought about "not sharing every typo with the world", and would solve it along the same lines: by editing the history. One path towards making that practical could be to automatically analyze which modifications constitute typos (e.g. any modification that is later overwritten by another modification is a candidate for cleanup).
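To make the "later overwritten" heuristic concrete, here is a rough sketch. It assumes a deliberately simplified score in which every note is a hypothetical `(key, value)` assignment, so that "overwritten" just means "the same key is assigned again later". Real notes on s-expressions would need index-aware bookkeeping that this sketch ignores.

```python
from typing import Dict, List, Tuple

Note = Tuple[str, str]  # (key, value): a hypothetical, simplified note


def cleanup_candidates(score: List[Note]) -> List[int]:
    """Return positions of notes that a later note on the same key overwrites."""
    last_seen: Dict[str, int] = {}
    candidates: List[int] = []
    for position, (key, _value) in enumerate(score):
        if key in last_seen:
            # The earlier note on this key never survives into the final state.
            candidates.append(last_seen[key])
        last_seen[key] = position
    return candidates


def squash(score: List[Note]) -> List[Note]:
    """Drop all cleanup candidates, keeping the remaining notes in order."""
    dropped = set(cleanup_candidates(score))
    return [note for i, note in enumerate(score) if i not in dropped]


score = [("name", "fo"), ("name", "foo"), ("body", "x"), ("name", "food")]
candidates = cleanup_candidates(score)
coarse = squash(score)
```

In this example the first two notes are flagged, and `coarse` keeps only the final assignment per key. Whether a flagged note really was a typo is still a judgment call, so this is a list of candidates rather than an automatic rewrite.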

Regarding the programmer writing and inspecting such fragments: you are certainly right that the notation in terms of s-expressions is somewhat verbose. The point of focusing on a language for such changes anyway is to get the API right on a conceptual level; how to display and create such elements in practice is the next question. The notation you've quoted is one that works reasonably well on paper; in a more interactive environment we can be smarter about this.

With regards to creating such elements: in practice this means that the editor that's a part of the project creates them directly out of edit commands. That is: as the user interacts with the structure, a log ('score') of edit actions is preserved. With regards to viewing them: I'm working on various alternative renderings of such modifications that are somewhat more user friendly.
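As an illustration of a score being built from edit commands and replayed, here is a small sketch. The note types `Insert`, `Delete` and `Replace` and the function `play` are hypothetical stand-ins, not the notation from the paper, and the sketch only handles a flat list rather than nested s-expressions.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Insert:
    index: int
    value: str


@dataclass(frozen=True)
class Delete:
    index: int


@dataclass(frozen=True)
class Replace:
    index: int
    value: str


Note = Union[Insert, Delete, Replace]


def play(score: List[Note]) -> List[str]:
    """Replay a score from an empty list and return the resulting structure."""
    structure: List[str] = []
    for note in score:
        if isinstance(note, Insert):
            structure.insert(note.index, note.value)
        elif isinstance(note, Delete):
            del structure[note.index]
        elif isinstance(note, Replace):
            structure[note.index] = note.value
        else:
            raise TypeError(f"unknown note: {note!r}")
    return structure


# The editor appends one note per edit action; the structure is derived.
score: List[Note] = [
    Insert(0, "+"),
    Insert(1, "1"),
    Insert(2, "3"),
    Replace(2, "2"),  # the user corrects "3" to "2"
]
result = play(score)
```

The point of the sketch is the direction of derivation: the score is the primary artifact, and the structure `["+", "1", "2"]` is computed from it, rather than the other way around.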

Regarding the second part of your question, you are right that the encoding I have presented is agnostic as to the meaning/semantics of the s-expressions. That is, not even a typical interpretation such as "the first element of a list expression implies either a special form or function application" is assumed in the paper. Having said that, working at the level of modifications to s-expressions most certainly allows for more structured approaches to the implied modifications of semantics (though this is not yet shown in the paper).

I recall once reading a forum discussion between Linus Torvalds and someone else (I thought it was someone involved in another project - perhaps Darcs - but I can't find it) debating how version control should work. The question was whether we should view source control as tracking a sequence of values, for which deltas are just an optimization (Linus' position), or as tracking the deltas themselves.

It seems to me the version tracker has to at least be delta aware if we're going to use it for undo/redo (it would be impractical to perform a diff of large text files in response to each keystroke - we need to preserve the deltas generated by the editor).

I'm skeptical of exposing the set of possible deltas for inspection, as you seem to have done with that language, though. It seems like a situation where we want to have an open set of possible deltas, accessed through an interface that keeps them abstract. There are many possible deltas at different semantic levels that we could be interested in preserving.

In many (most?) places, it seems the most straightforward semantics for version history is as a sequence of values. Writing tools in a functional reactive style responding to code changes seems like a good idea.
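One way to read the "sequence of values" view alongside a change log is that the values can be derived by folding over the changes. The sketch below assumes a hypothetical note format of `(key, value)` assignments on an immutable dictionary; the names `apply_note` and `values` are made up for illustration.

```python
from itertools import accumulate
from typing import Dict, List, Tuple

Note = Tuple[str, int]   # hypothetical change: assign value to key
State = Dict[str, int]


def apply_note(state: State, note: Note) -> State:
    """Return a new state; the previous state is left untouched."""
    key, value = note
    return {**state, key: value}


def values(score: List[Note]) -> List[State]:
    """The history as a sequence of values, starting from the empty state."""
    return list(accumulate(score, apply_note, initial={}))


history = values([("a", 1), ("b", 2), ("a", 3)])
```

A tool written against `history` sees only values, while the editor keeps the finer-grained notes. Note that `accumulate(..., initial=...)` requires Python 3.8 or later.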

You might check out Darcs and its patch calculus if you haven't. It allows you to replay a patch at a different starting location to do something like the selective undo you mentioned.

That is at least one instance where Linus makes the case for a simple model: one of snapshots of the state of your project on every commit, rather than something supposedly more elegant such as tracking file renames. In fact, I agree with him there: if all you know is the state of the project at a certain moment, you shouldn't try to be too clever when merging fails.

The thing I don't generally agree with is the premise: we don't have to accept that all we can do to track change is to observe the project at certain points in time ("commits") and build our VCS on top of those snapshots. That premise is a consequence of version management being a post-hoc construct: we designed our programming environments (based on files) first, and only then came up with version management as a requirement.

As an aside: calling mechanisms of construction "deltas" betrays exactly a snapshot-based paradigm. Deltas are what you get when you take a look at two snapshots and calculate the difference.
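To illustrate what a delta in this snapshot-based sense looks like, here is a small sketch using Python's standard `difflib`. The two snapshots and the helper name `delta` are made up for the example.

```python
from difflib import SequenceMatcher
from typing import List, Tuple

Opcode = Tuple[str, int, int, int, int]


def delta(old_lines: List[str], new_lines: List[str]) -> List[Opcode]:
    """Compute the difference between two snapshots, after the fact."""
    matcher = SequenceMatcher(None, old_lines, new_lines)
    # Keep only the spans that differ between the snapshots.
    return [op for op in matcher.get_opcodes() if op[0] != "equal"]


old = ["(define x 1)", "(print x)", "(exit)"]
new = ["(define y 1)", "(print y)", "(exit)"]
changes = delta(old, new)
```

The result only says that the first two lines were replaced. That the programmer actually performed a single rename of `x` to `y` is not recoverable from the snapshots alone, which is the kind of information a recorded mechanism of construction would retain.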

I agree with you that it's better to track all of the changes with version control, not just commits. There's a performance benefit and we get a saved undo history for free. I'm skeptical about trying to glean deeper understanding of the meaning of changes from this approach, though. Even with the simplest thing we might want to do, symbol renaming, there is ambiguity. Trying to figure out how to change the order of two semantic changes to a code base in the "correct" way seems utterly hopeless in the general case. This tells me that while tracking changes like symbol renames might be useful, any such mechanism is going to be ad hoc in nature and we should try to minimize dependencies on these choices.