Software 2.0 – A Paradigm Shift

For decades, software has largely been developed in an imperative way: programmers write explicit instructions telling computers how to perform tasks, in languages like C++, Python, and Java. Computation thus provides a framework for dealing precisely with notions of ‘how to’. This imperative style, concerned with ‘how to’, contrasts with the declarative descriptions, concerned with ‘what is’, usually employed in mathematics[1].

With the advent of neural-network-based deep learning techniques, computing is moving towards a new declarative paradigm, sometimes dubbed ‘Software 2.0’, with the earlier imperative paradigm retrospectively called Software 1.0[2].

In this new paradigm, we do not spell out the steps of an algorithm. Instead, we specify a goal for the intended behaviour of the desired program (recognising emotions in images, winning a game of Go, identifying spam in e-mails, and so on), write a skeleton neural network architecture, throw all the computational resources at our disposal at the problem, and crank the deep learning machine. Lo and behold, we end up with a model that produces results on future data. Surprisingly, a large portion of real-world problems, such as visual recognition, speech recognition, speech synthesis, machine translation, and games like chess and Go, are amenable to being solved this way. We collect, curate, massage, clean, and label large datasets, train a deep neural network on them to generate a machine learning model, and use that model from then on. No algorithm is written explicitly; the neural network is shown enough data, of enough variety, to come up with a predictive model.
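As a toy stand-in for this pipeline (a real system would use a deep network and a framework; here a single linear ‘neuron’ in plain Python, with made-up data), notice that the rule behind the data, y = 2x + 1, is never written into the program. We only supply examples of the intended behaviour and let gradient descent find the parameters:

```python
# "Training data" encodes the intended behaviour (secretly y = 2x + 1);
# the rule itself never appears in the code below.
data = [(x, 2 * x + 1) for x in range(-5, 6)]

w, b = 0.0, 0.0   # model parameters, initially arbitrary
lr = 0.01         # learning rate

for _ in range(1000):        # crank the machine: gradient descent on squared error
    for x, y in data:
        err = (w * x + b) - y
        w -= lr * err * x    # gradient step for the weight
        b -= lr * err        # gradient step for the bias

print(round(w, 2), round(b, 2))  # converges towards 2.0 and 1.0
```

The same loop, fed different examples, would learn a different function; the ‘program’ lives in the data, not in the code.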

Declarative programming is not entirely new. Relational databases, with SQL queries to retrieve information from them, are an exemplar of declarative computing. The relational database is based on a simple but very elegant mathematical formalism called ‘relational algebra’. Programmers specify via an SQL query what they want from the database, not how to retrieve it. They remain happily oblivious to how the data is organised internally and to the algorithms used to retrieve it. The database engine works hard behind the scenes to organise the data optimally and to retrieve it efficiently when a query arrives.
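A small sketch of this using Python's built-in sqlite3 module (the table and rows here are invented for illustration):

```python
import sqlite3

# A throwaway in-memory database; table and data are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("Asha", "Engineering", 90), ("Ben", "Sales", 60), ("Chen", "Engineering", 85)],
)

# The query states *what* we want (engineers, highest paid first) and says
# nothing about *how* the engine should scan, index, or sort the data.
rows = conn.execute(
    "SELECT name FROM employees WHERE department = ? ORDER BY salary DESC",
    ("Engineering",),
).fetchall()

print([name for (name,) in rows])  # ['Asha', 'Chen']
```

Whether the engine answers this with a full scan or an index lookup is its business, not ours.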

Another area of declarative computing is functional programming, based on a mathematical underpinning called the ‘lambda calculus’, which predates even digital computers. One of the tenets of functional programming is the use of higher-order functions, which allow us to compose programs declaratively without getting bogged down in algorithmic nitty-gritty.

For example, consider the code snippet below, which extracts the names starting with ‘a’ from a list of names, converts them to upper case, and collects the transformed names into a new list.
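An imperative version might look like this in Python (the language and the sample names are my choice for illustration):

```python
# Imperative version: we spell out *how* to build the result, step by step.
names = ["alice", "bob", "anita", "carol", "arjun"]

result = []
for name in names:                    # iterate over the list
    if name.startswith("a"):          # check if the name starts with 'a'
        result.append(name.upper())   # convert to upper case, add to the new list

print(result)  # ['ALICE', 'ANITA', 'ARJUN']
```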

Notice how this code involves iterating over the list, checking whether each name starts with ‘a’, converting each such name to upper case, and appending it to the new list. If the operation were more involved, the code would become tedious and difficult to reason about: we would have to follow the iteration along to understand exactly what happens inside the loop body. The code is neither evident nor intent-revealing.

On the other hand, consider the same operation performed using higher-order functions such as map and filter.
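The declarative version, on the same sample data, could read:

```python
# Declarative version: we state *what* we want by composing filter and map.
names = ["alice", "bob", "anita", "carol", "arjun"]

result = list(map(str.upper, filter(lambda n: n.startswith("a"), names)))

print(result)  # ['ALICE', 'ANITA', 'ARJUN']
```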

Here, if we know the semantics of map and filter, we pretty much know what is being done; we need only look at what operation each of them applies. The code is intent-revealing and concise too. One great advantage of such higher-order, declarative programs is that the compiler, too, can infer the intent easily and apply optimisations and transformations such as parallelisation.

These machine learning models are essentially pure functions, devoid of side effects, and are built on the solid mathematical ideas of backpropagation and stochastic gradient descent. As we have seen, a software paradigm with a solid mathematical underpinning has a good chance of success. The new paradigm is like test-driven development turned on its head: in test-driven development we devise test cases and then write code to satisfy their expectations; in deep-learning-based development we give the machine the test cases (the training data) and it produces, via neural networks, software that satisfies them.

There is, however, one fundamental difference between Software 1.0 code and a machine learning model. Code is deterministic and discrete; the output of a model is probabilistic, uncertain, and continuous. A spam filter does not say whether a mail is spam in discrete boolean terms; it reports its confidence that the mail is spam. By training the filter on a large, carefully curated dataset, however, we can drive its accuracy up substantially.
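A sketch of this difference, with a made-up two-feature linear spam scorer (the features and weights here are invented for illustration, not learned from data):

```python
import math

def spam_confidence(num_suspicious_words: int, sender_known: bool) -> float:
    """Return a confidence in (0, 1) that a mail is spam, not a boolean."""
    # Hypothetical hand-picked weights; a real filter would learn these from data.
    score = 1.2 * num_suspicious_words - 3.0 * (1 if sender_known else 0) - 1.0
    return 1 / (1 + math.exp(-score))   # squash the score into (0, 1)

print(spam_confidence(5, sender_known=False))  # high confidence, near 1.0
print(spam_confidence(0, sender_known=True))   # low confidence, near 0.0
```

Any discrete spam/not-spam decision is then a matter of choosing a threshold on this continuous confidence.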

As Dr. Meijer puts it succinctly, the future of programming seems to lie in combining neural networks with probabilistic programming. Companies like Tesla have already made great strides in this direction.