Wednesday, July 31, 2013

The Future Of Programming

Watch it. The conceit is entertaining, from his clothes to the overheads.

However, despite the brilliance of the presentation, I think he might be wrong. And the fact that these promising ideas have had 40 years and still haven't taken off may suggest there are some flaws in the ideas themselves.

Coding > Direct Manipulation

Like most visually-oriented people, Bret gives great importance to pictures. If I remember correctly, something like 33% of the human brain is visual cortex, specialized in handling our particular 2D + depth way of seeing. So it's hardly surprising that we imagine that this kind of data is important, or that we continually look for ways of pressing that part of the brain into service for more abstract data-processing work.

However, most data we want to handle isn't of this convenient 2D or 2.5D form. You can tell this because our text-books are full of different kinds of data-structure, from arrays, lists and queues, to matrices of 2, 3 and higher dimensions, to trees, graphs and relational databases. If most data was 2D, then tables and 2D matrices would be the only data-structures programmers would ever use, and we'd have long swapped our programming languages for spreadsheets.

Higher dimensional and complex data-structures can only be visualized in 2, 2.5 or even 3 dimensions by some kind of projection function. And Bret, to his credit, has invented some ingenious new projections for getting more exotic topologies and dynamics down to 2D. But even so, only a tiny proportion of our actual data-storage requirements are ever likely to be projectable into a visual space.
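To make the projection point concrete, here's a minimal sketch. Everything in it is made up for illustration: the 4D points, and the crude choice of "projection" as simply dropping the last two coordinates. Real projections are cleverer, but any map from more dimensions down to fewer has to fold distinct things on top of each other somewhere.

```python
def project_to_2d(point):
    """A deliberately crude projection: keep x and y, discard the rest."""
    return point[:2]


def collisions(points):
    """Group points by their 2D image and keep only the groups that collide."""
    by_image = {}
    for p in points:
        by_image.setdefault(project_to_2d(p), []).append(p)
    return {image: group for image, group in by_image.items() if len(group) > 1}


# Hypothetical 4D records; the first two differ only in the dimensions we drop.
POINTS_4D = [
    (1, 2, 0, 0),
    (1, 2, 5, 9),
    (3, 4, 7, 7),
]

if __name__ == "__main__":
    for image, group in collisions(POINTS_4D).items():
        print(image, "hides", group)
```

On the screen the first two records are the same dot. The symbolic form still tells them apart.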

Once you accept that, then the call for a shift from coding to direct manipulation of data-structures starts to look a lot shakier. Right now, people are using spreadsheets ... in situations which lend themselves to it. Most of the cases where they're still writing programs are cases where such a projection either doesn't exist or hasn't been discovered (in more than 30 years since the invention of the spreadsheet).

Procedures > Goals / Constraints

It seems like it must be so much easier to simply tell the computer what you want rather than how to do it. But how true is that?

It's certainly shorter. But we have a couple of reasons for thinking that it might not be easier.

1) We've had the languages for 40 years. And anyone who's tried to write Prolog knows that it's bloody difficult to formulate your algorithms in such a form. Now that might be because we just don't train and practice enough. But it might be genuinely difficult.

The theoretical / mathematical end of computer science is always trying to sell higher-level abstractions which tend in the direction of declarative / constraint oriented programming, and relatively few people really get it. So I'm not sure how much this is an oversight by the programmer community vs. a genuine difficulty in the necessary thinking.

2) One thing that is certain : programming is very much about breaking complex tasks down into smaller and simpler tasks. The problem with declarative programming is that it doesn't decompose so easily. It's much harder to find part solutions and compose them when declaring a bunch of constraints.

And if we're faced with a trade-off between the virtue of terseness and the virtue of decomposability, it's quite possible that decomposability trumps terseness.

There may be an interesting line of research here : can we find tools / representations that help in making declarative programs easier to partially specify? Notations that help us "build-up" declarations incrementally?

3) I have a long-standing scepticism from my days working with genetic algorithms that might well generalize to this theme too. With a GA you hope to get a "free lunch". Instead of specifying the design of the solution you want (say in n-bits), you hope you can specify a much shorter fitness function (m-bits) and have the computer find the solution for you.

The problem is that there are many other solutions that the computer can find, that fit the m-bit fitness function but aren't actually (you realize, retrospectively) the n-bit solution that you really want. Slowly you start building up your fitness function, adding more and more constraints to ensure the GA solves it the right way rather than the wrong way. Soon you find the complexity of your fitness function is approaching the complexity of a hand-rolled solution.

Might the same principle hold here? Declarative programming assumes we can abstract away from how the computer does what it does, but quite often we actually DO need to control that. Either for performance, for fine-tuning the user's experience, for robustness etc.
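The fitness-function trap can be sketched in a few lines. This is a toy under loud assumptions: the "design" is a hypothetical 8-bit pattern, the three constraints are invented for the example, and an exhaustive search stands in for the GA (a real GA would sample the space rather than enumerate it). It only shows how many candidates satisfy the spec as constraints get added.

```python
from itertools import product

# Hypothetical design we "really" want the search to find.
INTENDED = (1, 0, 1, 0, 1, 0, 1, 0)


def half_set(bits):
    """First, short attempt at a spec: exactly half the bits are on."""
    return sum(bits) == len(bits) // 2


def alternating(bits):
    """Second constraint: no two neighbouring bits are equal."""
    return all(a != b for a, b in zip(bits, bits[1:]))


def starts_high(bits):
    """Third constraint: the first bit is on."""
    return bits[0] == 1


def survivors(constraints, width=8):
    """Every bit pattern that satisfies all constraints (brute force, not a GA)."""
    return [bits for bits in product((0, 1), repeat=width)
            if all(check(bits) for check in constraints)]


def spec_growth():
    """Count the acceptable 'solutions' after each constraint is added."""
    spec, counts = [], []
    for check in (half_set, alternating, starts_high):
        spec.append(check)
        counts.append(len(survivors(spec)))
    return counts


if __name__ == "__main__":
    print(spec_growth())  # [70, 2, 1]
```

The one-line spec admits 70 patterns, 69 of them wrong. By the time only the intended pattern survives, the spec has three parts and reads a lot like a description of the answer.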

Anyone with any relational database experience will tell you that writing SQL queries is a tiny fraction of the skills needed for professional database development. Everything else is scaling, sharding, data-mining, Big Data, protecting against failure etc. etc. We used to think that such fine-grained control was a temporary embarrassment: OK for systems programmers squeezing the most out of limited memory and processor resources, but once the computers became fast enough we could forget about memory management (give it to the garbage collector) or loop speed (look at that wonderful parallelism). Now we're in the future, we discover that caring about the material resources of computation is always the crucial art. One resource constraint becomes cheap or fast enough to ignore, but your applications almost immediately grow to the size that you hit a different limit and need to start worrying again.

Professional software developers NEVER really manage to ignore the materiality of their computation, and so will never really be able to give up fine-grained control to a purely declarative language.

(SQL is really a great example of this. It's the most successful "tell the computer what you want not how you want it done" language in computing history. And yet there's still a lot of tuning of the materiality required, either by db-admins or, as witnessed more recently, by the NoSQL movement, which returns to simpler hierarchical and key-value stores mainly to regain that control.)
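A small sketch of that tuning, using Python's built-in sqlite3 module. The `users` table, the index name and the query are all hypothetical. The declarative query never changes; what changes is the plan the engine picks once somebody who cares about the "how" adds an index. The exact wording of the plan text varies between SQLite versions, so the code only looks at it loosely.

```python
import sqlite3

# The "what": this text is identical before and after tuning.
QUERY = "SELECT name FROM users WHERE email = ?"


def query_plan(conn):
    """Return SQLite's description of how it intends to run QUERY."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, ("user1@example.com",)).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " ".join(row[-1] for row in rows)


def plans_before_and_after_index():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
    conn.executemany(
        "INSERT INTO users (name, email) VALUES (?, ?)",
        [(f"user{i}", f"user{i}@example.com") for i in range(1000)],
    )
    before = query_plan(conn)  # no index yet: expect a full table scan
    # The "how": a physical-design decision the query itself never mentions.
    conn.execute("CREATE INDEX idx_users_email ON users (email)")
    after = query_plan(conn)   # expect a lookup through idx_users_email
    conn.close()
    return before, after


if __name__ == "__main__":
    before, after = plans_before_and_after_index()
    print("before:", before)
    print("after: ", after)
```

Same declaration, different machine behaviour. Deciding which indexes exist is exactly the kind of control that SQL leaves outside the query.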

I'm as fascinated by visual and gestural ideas for programming as the next geek. But I'm pretty convinced that symbols and language are way, way, way more flexible and powerful representation schemes than diagrams will ever be. Symbols are not limited to two and a half dimensions. Symbols can describe infinite series and trees of infinite depth and breadth. Yadda yadda yadda.

Of course we can do better than the tools we have now. (Our programs could be outlines, wiki-like hypertexts, sometimes spreadsheets, network diagrams etc. Or mixes of all of these, as and when appropriate.) But I think we're highly unlikely to abandon the underlying infrastructure of symbols.

Sequential > Parallel

This one's fascinating in that it's the one that seems most plausible. So it's also disturbing to think that it has a history as old as the other (failed) ideas here. If anything, Victor makes me pessimistic about a parallel future by putting it in the company of these other three ideas.

Of course, I'll reserve full judgement on this. I have my Parallella "supercomputer" on order (courtesy of Kickstarter). I've dabbled a bit in Erlang. I'm intrigued by Occam-π. And I may even have a go at Go.

And, you know what? In the spirit of humility, and not knowing what I'm doing, I'm going to forget everything I just wrote. I'll keep watching Bret's astounding videos, and trying to get my head around Elm-lang's implementation of FRP. And dreaming of ways that programming will be better in the future.