Battle lines

Now we have two ways of implementing a high level language – we can compile it to machine code or we can run it on an interpreter for which it IS the machine code.

Traditionally the argument for and against the two approaches goes something like this:

A compiler produces “tight efficient code”. This is supposed to mean that because it generates machine code everything happens as fast as possible. Of course this is nonsense because the machine code could make heavy use of run-time subroutines and so start to slide towards the interpreter approach.

A compiler produces “small stand-alone code”. Clearly if it uses a run-time library then it isn’t stand-alone unless the library is included in the compiled code, in which case it isn’t going to be small!

Conversely an interpreter is said to be slow and wasteful of memory space. In fact an interpreter doesn’t have to be slow and a high-level language version of a program can be a lot smaller than a fully compiled machine code version.

So what is the truth?

The fact is that in the past implementations that have described themselves as compilers have been faster than ones that were called interpreters but there has always been a considerable overlap between the two.

Over time the two approaches have tended to become even more blurred and they have borrowed ideas from one another.

For example, the first generation of interpreters usually had excellent debugging facilities. Because the machines they ran on were implemented in software it was an easy task to provide additional facilities that would tell the programmer what was going on.

Interpreters invented ideas such as tracing, i.e. following the execution of the program line by line, and dynamic inspection of variable contents.
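To see why tracing comes almost for free when the machine is software, here is a minimal sketch of a line-by-line interpreter. The toy language (one `name = expression` assignment per line), the `run` function and its `trace` flag are all invented for illustration, and Python's own `eval` stands in for a real expression evaluator.

```python
def run(program, trace=False):
    """Interpret a list of 'name = expression' lines, optionally tracing each one."""
    variables = {}
    log = []
    for line_number, line in enumerate(program, start=1):
        name, expression = (part.strip() for part in line.split("=", 1))
        # Evaluate the right-hand side using the variables defined so far.
        # eval is used only to keep the sketch short; builtins are disabled.
        variables[name] = eval(expression, {"__builtins__": {}}, variables)
        if trace:
            # The "machine" is software, so reporting its state is one extra line.
            log.append(f"line {line_number}: {line}  ->  {dict(variables)}")
    return variables, log


program = ["x = 2", "y = x * 10", "x = x + y"]
variables, log = run(program, trace=True)
for entry in log:
    print(entry)
```

Each trace entry records the line just executed and a snapshot of every variable, which is exactly the kind of dynamic inspection described above.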

As time went on it became clear that these facilities could be built into a compiler as well by augmenting the run time environment to include them. Many compilers will produce a debug version of the code while you are still testing things and a production version when you have finished.

It is true, however, that in the past interpreted languages were more sophisticated and complex than compiled languages. The reason was simply that writing an interpreter seemed to be an easier thing to do than writing a compiler and so the implementation method chosen tended to limit, or expand, the language according to what was perceived as difficult or easy.

If you make a language static and strongly typed then it seems to be easier to implement using a compiler approach. On the other hand if you use an interpreted approach then it’s natural to allow the language to be dynamic and allow self modification.
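As a rough illustration of what "dynamic" means here, the following sketch shows a program that writes new code for itself while it is running. The function name `make_power_function` is made up for the example; `exec` is the standard Python built-in that hands source text to the language system at run time.

```python
def make_power_function(exponent):
    """Build the source text of a new function, then turn it into code at run time."""
    source = f"def power(x):\n    return x ** {exponent}\n"
    namespace = {}
    # The language implementation is still present while the program runs,
    # so the program can ask it to translate text it has just created.
    exec(source, namespace)
    return namespace["power"]


cube = make_power_function(3)
print(cube(2))  # 8
```

This is easy when the language system is still around at run time, as it is with an interpreter. A program compiled ahead of time to stand-alone machine code would have to carry a compiler along with it to do the same thing.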

Today we are in a period where static languages such as Java and C# are giving ground to dynamic languages such as Ruby and even JavaScript.

These differences are not new and in many ways they represent the re-emergence of the compiler v interpreter approach to language design and implementation.

Virtual Machines And Intermediate Languages

There is one last development of the interpreter idea that is worth mentioning because it is important today.

An alternative to implementing a machine that runs the high-level language as its machine code is to compile the high-level language to a lower-level language and then run this using an interpreter or VM.

That is, instead of writing an interpreter to run Java directly, we first compile it to a simpler language called byte code. Notice we do not compile it to machine code, and byte code is still fairly high level compared to machine code. To actually run the Java program we use an interpreter, or virtual machine, for byte code.
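The two-step scheme can be sketched in a few lines. This is not how Java does it: the five-instruction set (`PUSH`, `LOAD`, `ADD`, `SUB`, `MUL`) and the names `compile_expression` and `execute` are invented for the example, and Python's standard `ast` module is borrowed to do the parsing so that only the code generation and the VM need to be shown.

```python
import ast

# Hypothetical instruction set for this sketch: PUSH n, LOAD name, ADD, SUB, MUL.
BINARY_OPS = {ast.Add: "ADD", ast.Sub: "SUB", ast.Mult: "MUL"}


def compile_expression(source):
    """Compile an arithmetic expression to a flat list of stack-machine instructions."""
    code = []

    def emit(node):
        if isinstance(node, ast.BinOp) and type(node.op) in BINARY_OPS:
            emit(node.left)   # code that leaves the left operand on the stack
            emit(node.right)  # code that leaves the right operand on the stack
            code.append((BINARY_OPS[type(node.op)], None))
        elif isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            code.append(("PUSH", node.value))
        elif isinstance(node, ast.Name):
            code.append(("LOAD", node.id))
        else:
            raise SyntaxError(f"unsupported construct: {ast.dump(node)}")

    emit(ast.parse(source, mode="eval").body)
    return code


def execute(code, variables):
    """The virtual machine: a loop that carries out one simple instruction at a time."""
    stack = []
    for opcode, operand in code:
        if opcode == "PUSH":
            stack.append(operand)
        elif opcode == "LOAD":
            stack.append(variables[operand])
        else:
            right = stack.pop()
            left = stack.pop()
            if opcode == "ADD":
                stack.append(left + right)
            elif opcode == "SUB":
                stack.append(left - right)
            elif opcode == "MUL":
                stack.append(left * right)
            else:
                raise ValueError(f"unknown opcode: {opcode}")
    return stack.pop()


byte_code = compile_expression("price * quantity + 5")
print(byte_code)
print(execute(byte_code, {"price": 3, "quantity": 4}))  # 17
```

The compiler never has to know about real hardware, and the VM never has to know about operator precedence or syntax. Each half is doing a simpler job than a full compiler or a full interpreter would.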

This might seem like a very strange idea in that you now have the worst of all possible worlds.

You have to use a compiler to translate the program from one language to another and then you have to use an interpreter to run it.

What could possibly be good about this idea?

The answer is a great deal.

The first advantage is that a compiler from a high-level language to an intermediate-level language is easier to write and can be very efficient.

The second is that an interpreter for an intermediate-level language is easier to write and can also be very efficient.

Looking at things another way we get the best, not the worst, of both approaches!

In addition there is one huge advantage which you might not notice at first. If the interpreter for the intermediate-level language is simple enough then it can be easily implemented on any hardware and this makes programs compiled to the intermediate-level code easily portable between different types of hardware.

If you are really clever then you can even write the compiler in the intermediate-level language, making it portable as well!

A note on terminology: we generally call a VM that works directly with a high-level language an interpreter. Hence Basic was generally executed by an interpreter. However, if the VM runs an intermediate code produced by a compiler we generally call it a VM. Thus Java is executed by a VM and not an interpreter. This is all the difference amounts to.

The intermediate language is also generally called Pseudo Code, or P-Code for short. P-Code compilers and VMs were very popular in the time before the IBM PC came on the scene (UCSD Pascal and Fortran being the best known). Then they more or less vanished, only to return in a big way with Java, but renamed "byte code".

Java’s main claim to fame is that it is the ultimate portable language.

Java VMs exist for most hardware platforms and up to a point you really can compile a Java program and expect it to run on any machine that has a VM. Not only this but the Java compiler and all of the Java system is itself compiled to byte code and so once you have a VM running on new hardware you also have the entire Java system – clever!

.NET languages such as C# and Visual Basic also use an intermediate language and VM approach but due to Microsoft's proprietary approach to computing neither is quite as portable as Java.

This idea is so good that you can expect most language development in the future to be centred on the VM idea. One thing is sure - the future is virtual.