Monday, April 14, 2014

Who is to blame for the Heartbleed bug? Perhaps, it does not matter. Just fix it, and move on. Until the next bug, and the next, and the next.

The Heartbleed bug is different from other Internet scares. It is a vulnerability at the core of the Internet infrastructure, a layer that provides the foundation for secure communication, and it went undetected for years. It should be a wake-up call. Instead, the problem will be patched. Some government and industry flacks will declare the crisis over. We will move on and forget about it.

There is no easy solution. No shortcut. We must redevelop our information infrastructure from the ground up. Everything. Funding and implementing such an ambitious plan may become feasible only after a major disaster strikes that leaves no other alternative. But even if a complete redesign were to become a debatable option, it is not at all clear that we are up to the task.

The Internet is a concurrent and asynchronous system. A concurrent system consists of many independent components like computers and network switches. An asynchronous system operates without a central clock. In synchronous systems, like single processors, a clock provides the heartbeat that tells every component when state changes occur. In asynchronous systems, components are interrupt driven. They react to outside events, messages, and signals as they happen. The thing to know about concurrent asynchronous systems is this: It is impossible to de-bug them. It is impossible to isolate components from one another for testing purposes. The cost of testing quickly becomes prohibitive for each successively smaller marginal reduction in the probability of bugs. Unfortunately, when a system consists of billions of components, even extremely low-probability events are a daily occurrence. These unavoidable fundamental problems are exacerbated by continual system changes in hardware and software and by bad actors seeking to introduce and/or exploit vulnerabilities.

When debugging is not feasible, mathematical rigor is required. Current software-development environments are all about pragmatism, not rigor. Programming infrastructure is built to make programming easy, not rigorous. Most programmers develop their programs in a virtual environment and have no idea how their programs really function. Today's computer-science success stories are high-school geniuses that develop multimillion-dollar apps and college dropouts that start multibillion-dollar businesses. These are built on fast prototypes and viral marketing, not mathematical rigor. Who in their right mind would study computer science from people who made a career writing research proposals that never led to anything worth leaving a paltry academic job for?

Rigor in programming is the domain of Edsger W. Dijkstra, the most (in)famous, admired, and ignored computer-science eccentric. In 1996, he laid out his vision of Very Large Scale Application of Logic as the basis for the next fifty years of computer science. Although the examples are dated, his criticism of the software industry still rings true:

Firstly, simplicity and elegance are unpopular because they require hard work and discipline to achieve and education to be appreciated. Secondly we observe massive investments in efforts that are heading in the opposite direction. I am thinking about so-called design aids such as circuit simulators, protocol verifiers, algorithm animators, graphical aids for the hardware designers, and elaborate systems for version control: by their suggestion of power, they rather invite than discourage complexity. You cannot expect the hordes of people that have devoted a major part of their professional lives to such efforts to react kindly to the suggestion that most of these efforts have been misguided, and we can hardly expect a more sympathetic ear from the granting agencies that have funded these efforts: too many people have been involved and we know from past experience that what has been sufficiently expensive is automatically declared to have been a great success. Thirdly, the vision that automatic computing should not be such a mess is obscured, over and over again, by the advent of a monstrum that is subsequently forced upon the computing community as a de facto standard (COBOL, FORTRAN, ADA, C++, software for desktop publishing, you name it).

The last twenty years were not kind to Dijkstra's vision. The hordes turned into horsemen of the apocalypse that trampled, gored, and burned any vision of rigor in software. For all of us, system crashes, application malfunctions, and software updates are daily occurrences. It is build into our expectation.

In today's computer science, the uncompromising radicals that prioritize rigor do not stand a chance. Today's computer science is the domain of genial consensus builders, merchants of mediocrity that promise everything to everyone. Computer science has become a social construct that evolves according to political rules.

A bottoms-up redesign of our information infrastructure, if it ever becomes debatable, would be defeated before it even began. Those who could accomplish a meaningful redesign would never be given the necessary authority and freedom. Instead, the process would be taken over by political and business forces, resulting into effective status quo.

In 1996, Dijkstra believed this:

In the next fifty years, Mathematics will emerge as The Art and Science of Effective Formal Reasoning, and we shall derive our intellectual excitement from learning How to Let the Symbols Do the Work.

There is no doubt that he would still cling to this goal, but even Dijkstra may have started to doubt his fifty-year timeline.

About Me

Eric is a technology consultant specializing in the strategic application of new technologies in academic computing and library services.

Prior to this, Eric was the Director of Library Information Technology, a Senior Research Associate and Lecturer in Applied Mathematics at the California Institute of Technology. Eric holds a computer science degree from the K. U. Leuven (Belgium) and a Ph. D. in Mathematics from the Courant Institute, New York University, NY. He is the author of papers in scientific computing, library technology, and the graduate textbook "Concurrent Scientific Computing", published by Springer-Verlag.

He chaired the OpenURL standardization committee, which developed the OpenURL ANSI standard. He is on the Board of Directors of the Networked Digital Library of Theses and Dissertations (NDLTD).