Programming as collaborative reference

We argue that programming-language theory should face the pragmatic fact that humans develop software by interacting with computers in real time [hear, hear]. This interaction not only relies on but also bears on the core design and tools of programming languages, so it should not be left to Eclipse plug-ins and StackOverflow. We further illustrate how this interaction could be improved by drawing from existing research on natural-language dialogue, specifically on collaborative reference.

Overloading resolution is one immediate application. Here, collaborative reference can resolve the tension between fancy polymorphism and the need to write long type annotations everywhere. The user will submit a source code fragment with few or no annotations. If the compiler gets stuck -- e.g., because of ambiguous overloading -- it will ask the user for a hint. At the successful conclusion of the dialogue, the compiler saves all relevant context, in the form of inserted type annotations or as a proof tree attached to the source code. When revisiting the code later on, the same or a different programmer may ask the compiler questions about the inferred types and the reasons for chosen overloadings. We argue that such an interactive overloading resolution should be designed already in the type system (e.g., using oracles).
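A minimal sketch of how such an oracle-driven resolution might look. Everything here is invented for illustration (the toy overload table, the `resolve` function, the annotation syntax); the point is only that the compiler asks the oracle when stuck and records the answer as ordinary source text.

```python
# Hypothetical sketch: a resolver consults an "oracle" callback only
# when overloading is ambiguous, and records the answer as a source
# annotation so a later reader can ask why a choice was made.

OVERLOADS = {"+": ["Int.add", "String.concat"]}  # toy overload table

def resolve(op, arg_type, oracle):
    """Pick an overload of `op`; consult the oracle only when stuck."""
    candidates = [c for c in OVERLOADS[op] if c.startswith(arg_type)]
    if len(candidates) == 1:
        return candidates[0], None              # unambiguous: no dialogue
    choice = oracle(op, OVERLOADS[op])          # stuck: ask the user
    return choice, f"{{- chose {choice} for {op} -}}"  # saved annotation
```

Here a scripted callback can stand in for the interactive user; in the envisioned design the oracle would be a first-class part of the type system rather than an IDE afterthought.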

I posted about my approach to this recently. The extended abstract linked above is light on details, but I mostly agree with the sentiment. The term "dialogue" sounds modal, so I don't really like it. What should IMO happen is that as you type, the IDE labels ambiguities. If, after more context is supplied, an ambiguity remains, the programmer should be able to press a key to interactively select a resolution. In my system, that resolution just adds some code to the program text. The annotation is in a syntax that tells the IDE "don't show this", but if you looked at the source in a text editor you'd still see it, so no special file format is required.
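As a rough illustration of such a "don't show this" convention (the `{@ ... @}` tag here is invented, not the actual system's syntax): the resolution lives in the ordinary text, and the IDE's render pass simply strips it.

```python
import re

# Invented tag syntax: a resolution is stored as `{@ ... @}` inside the
# ordinary program text; the IDE strips these spans when rendering, but
# any plain text editor still shows them, so no special file format.

HIDDEN = re.compile(r"\{@.*?@\}")

def add_resolution(src, pos, annotation):
    """Record a disambiguation as ordinary text at position `pos`."""
    return src[:pos] + "{@" + annotation + "@}" + src[pos:]

def ide_view(src):
    """What the editor displays: the same file, hidden tags removed."""
    return HIDDEN.sub("", src)
```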

For me, the guiding principle behind this idea is that all ad hoc selection mechanisms should be resolved at edit time. That way establishing correctness of those complicated mechanisms can be done by inspection ("looks like the editor picked what I meant this time") without worrying about the general correctness of the selection algorithm. And the bit that's left to check with logical reasoning -- the run-time semantics -- is as simple as possible.

I've thought about such ideas before, albeit with my eyes on whiteboard meta-programming -- having the IDE fill in most of the gaps in rough code or descriptions, so that programs and prototypes can be sketched within the span of an office meeting -- rather than code proofs. But I never found a satisfactory way to `save all the relevant context`. For my notion of meta-programming, that context was a collaborative search to eliminate ambiguities and meet constraints.

A big problem I had was further editing and refactoring of the code. It seemed difficult to reuse an old search after editing. Further, it put a lot of burden on the editor -- who might not be the original programmer, or might have forgotten enough to be a stranger to her own code.

This idea of annotating the code with specific references where collaboration and arguments are to occur is enticing. It offers a structure I had not considered, something that might be leveraged to protect decisions (more relevantly, the arguments and justifications for them, and the relationships between them) across edits.

There is a note in the abstract above that annotations resulting from the dialogue would be saved along with the code. So the author should have resolved the ambiguities while writing the code. If the code is only half finished then you face the same problems that you would when finishing off someone else's program: understanding where the holes are and what to fill them with. Edit: hit the wrong reply button, this was supposed to be a response to the post above by migmit.

So, best case scenario is that nothing really changes: instead of error messages we would have "questions".

But non-interactive error messages have the advantage of being asynchronous. You're given several errors; you can choose which errors to start with -- for example, sometimes fixing small errors first makes it easier to fix big ones.

If your programming language gives you a REPL, you can load the modules that do compile successfully and try several alternatives before settling on one of them. You can choose to comment out the whole body of a function so that everything else compiles, and return to that function afterwards. There are tons of possibilities, none of them being just "answer the question".

I think there's another salient point here too: what if an error only occurs because of another ambiguity or error elsewhere? It seems handling errors/ambiguities would have to be ordered somehow, but a natural total order doesn't seem obvious to me. You would also have to be able to backtrack and make an alternate choice in case a previous choice was incorrect, and you only realize that subsequently.
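That backtracking concern can be made concrete with a small depth-first sketch (all names invented): each ambiguity site is a choice point, and a constraint that only fires after a later choice forces the resolver to revisit an earlier one.

```python
def solve(sites, consistent, picked=()):
    """Depth-first assignment of one option per ambiguity site.

    Backtracks whenever a partial assignment violates `consistent`,
    which models an error that only surfaces after a later choice.
    """
    if len(picked) == len(sites):
        return list(picked)
    for option in sites[len(picked)]:
        attempt = picked + (option,)
        if consistent(attempt):
            result = solve(sites, consistent, attempt)
            if result is not None:
                return result
    return None  # no combination of choices satisfies the constraints
```

With two sites and a constraint that only applies once both are chosen, the first branch gets abandoned and the search still recovers a consistent assignment -- but note there is still no obvious *natural* order in which to present these choice points to the user.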

I look forward to reviewing this more in depth as I have time to plow through the paper.

The language described in the abstract is one that does not require a Q&A session. The programmer is free to write all of the annotations required to get the program to compile first time. Skipping some of those annotations to see if the compile is successful is optional. So your first two points don't really match the description. You seem to be drawing a comparison with a different system.

Haskell allows the programmer to skip type annotations. When it gets confused, it presents an error message to try to explain what the problem is. The suggestion in the abstract / talk is that this dialogue can be richer, using some of the stateful constructs that we see in natural language. The current process is two isolated steps:
1. A message attempting to explain all of the information to locate the error.
2. A response in the form of a program that fixes the problem.

The suggestion is that this process could be expanded to allow the compiler and programmer to iteratively converge on a meaning that describes the problem. The rest of your "what would really happen" doesn't have any justification. You seem to be arguing that having a compiler that communicates in this way would be more confusing to the programmer -- could you provide a reason why?
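One way to picture that iterative convergence as a loop rather than two isolated steps -- everything here is a toy, with `"?"` standing in for an unresolved ambiguity:

```python
def compile_with_dialogue(program, answer):
    """Alternate turns until no ambiguity marker remains.

    Each "?" marks an unresolved point; `answer` plays the programmer's
    side of the dialogue, and the transcript is the saved context.
    """
    transcript = []
    while "?" in program:
        question = "which meaning did you intend at the next '?'"
        choice = answer(question)
        transcript.append((question, choice))
        program = program.replace("?", choice, 1)
    return program, transcript
```

The transcript is what the abstract proposes saving alongside the code, so a later reader can replay why each resolution was made.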

I've been assisting (acting as a sounding board, really) a speech researcher with a Siri-like dialogue system lately. Basically, you parse utterances to (a) figure out the type of the command to be executed and (b) fill slots for the command. For any slot that needs to be bound unambiguously, the system can then ask the programmer a clarifying question; it will also use your context (history, location, time of day, ...) to fill in slots if appropriate. As we've seen with Siri, this technology is sort of ready for use by consumers -- not just the speech recognition component, but we (or at least Apple) seem to be breaking through on the dog-brain IQ needed to make a digital assistant useful in real life (also adding some EQ to improve the fun factor).
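The slot-filling loop described above might look roughly like this (the function name and the context scheme are invented): parse bindings first, context second, and a clarification question only as a last resort.

```python
def fill_slots(required_slots, bound, context, ask):
    """Bind every required slot: parse first, context second, then ask."""
    command = {}
    for slot in required_slots:
        # Prefer what the parse bound, then fall back to context
        # (history, location, time of day, ...).
        value = bound.get(slot) or context.get(slot)
        if value is None:
            value = ask(slot)   # clarification question, last resort
        command[slot] = value
    return command
```

For example, if the parse bound `place` and the context supplies `time`, no question is ever asked; the dialogue only appears when both sources come up empty.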

An utterance for one command is very different from writing a program. It is more like driving a command-line shell or REPL. But if we think of these commands as instead mutating a single program at a very high level, perhaps we could make some progress; think "the program is a game," "this stage has a maze," "when pac-man hits a ghost he dies." Hugo Liu and Lieberman's work on Metafor makes some progress in this direction, and is also very interactive (the shell will ask for feedback). Metafor will only generate a skeleton, so it's not that useful yet, but we could imagine a programming language designed to let programs be expressed in this way.

This coding style sounds excellent. Worth noting is that practical IDEs already work this way, even if many language designers might not have considered it.

One question I have is why people feel the annotations should be some sort of side file or hidden annotation. Why not disambiguate the program and then reinsert the unambiguous version in the edit buffer?

Let me give two concrete examples where this appears to work well. First, in Java IDEs, you typically do not write down your import statements explicitly. Instead, you use unqualified names all over the place when you write the initial version of your code. At some point, the IDE does a compile, and it (non-modally!) indicates which parts of the code use an identifier that's not imported. The user can then interactively select which import to use. At that point, the import is inserted into the edit buffer, up at the top of the file. It's out of sight, but it's still ordinary text that will be saved in the file.
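A toy rendition of that Java flow (the candidate index and function names are made up): ambiguous unqualified names go through a chooser, and the result is inserted as ordinary import text at the top of the buffer.

```python
# Toy index of which fully qualified names an unqualified name could mean.
CANDIDATES = {"List": ["java.util.List", "java.awt.List"]}

def add_imports(buffer, used_names, choose):
    """Resolve unqualified names and insert imports as ordinary text."""
    imports = []
    for name in used_names:
        options = CANDIDATES[name]
        # Unique candidate: resolve silently. Otherwise ask the user.
        pick = options[0] if len(options) == 1 else choose(name, options)
        imports.append(f"import {pick};")
    return "\n".join(imports) + "\n\n" + buffer
```

The key property is the one the comment stresses: the resolution is out of sight at the top of the file, but it is still plain text that gets saved with everything else.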

As another concrete example, Smalltalk programmers can go a long way without spelling identifiers all the way out. They just write the first few characters down. Eventually, they do a compile, and the system will ask -- modally, I'm afraid -- which identifier was really meant by each shorthand. After the user makes a selection in each case, the expanded identifier is inserted back into the code.

I don't see why it would be beneficial to hide the disambiguating information, and in fact it looks like a loss of clarity to people who later try to read the code. To draw an analogy, which way would you want spell checkers to work? Do you want them to leave badly spelled words in the text and put the correct word somewhere else, or should they replace the badly spelled word with the correct form?

I can speak only for myself, but I hide information because it can be distracting. It comes up particularly with symbols, like * or () -- does that mean dot product, scalar product or cross product? Grouping or tuple construction? The way my system works, you select which one you want via the IDE and it gets converted into a disambiguating text tag. Once you've disambiguated it with tags, it can be rendered with a specialized glyph (e.g. × or · or ⊗). But sometimes you want to render it with that special glyph and also display a disambiguating tag. So I have a special syntax that inserts a hidden tag for the other case.
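A sketch of that tagging scheme (the `*{tag}` syntax and the glyph table are invented stand-ins for whatever the actual system uses): a tagged operator renders as a dedicated glyph, with the tag itself optionally kept visible.

```python
# Invented glyph table: a disambiguated `*` renders as a dedicated symbol.
GLYPHS = {"dot": "\u00b7", "cross": "\u00d7", "tensor": "\u2297"}  # · × ⊗

def render(src, show_tags=False):
    """Replace each `*{tag}` with its glyph; optionally keep the tag."""
    for tag, glyph in GLYPHS.items():
        suffix = f"{{{tag}}}" if show_tags else ""
        src = src.replace(f"*{{{tag}}}", glyph + suffix)
    return src
```

An untagged `*` passes through untouched, which models the still-ambiguous case the IDE would flag.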