Since Java 1.6, the JVM can run a myriad of programming languages on top of it instead of just Java. I conceptually understand how Java runs on the Java VM, but not how other languages can run on it as well. To me, it all looks like black magic. Do you have any articles to point me to so I can better understand how this all fits together?

The same way your Intel/AMD/Solaris(??) processor can execute "any language" (you don't really run languages, but I'm just going with the flow here) that can be compiled into its native machine code.
– Apoorv Khurasia, Jun 7 '12 at 13:25

The thing is, the JVM doesn't run Java. It runs a distinct (though related, and intentionally easy for Java compilers to create), more low-level language.
– delnan, Jun 7 '12 at 13:30

That is true. But the JVM started running other languages from version 6; you could not (or nobody did) run Python or Groovy on it in version 1.4.2. Why is that so? What has changed?
– Pomario, Jun 7 '12 at 13:30

@delnan Or rather "more low-level model of execution, that the javac program knows how to build out of Java code".
– Apoorv Khurasia, Jun 7 '12 at 13:31

@Pomario Jython has been around for a while now. And this page seems to suggest that Jython scripts could run on 1.4.2.
– Apoorv Khurasia, Jun 7 '12 at 13:35

4 Answers

The key is the native language of the JVM: Java bytecode. Any language can be compiled into bytecode that the JVM understands; all you need is a compiler that emits bytecode. From then on, there is no difference from the JVM's point of view. So much so that you can take a compiled Scala, Clojure, Jython etc. class file and decompile it (using e.g. JAD) into normal-looking Java source code.
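To make "bytecode" a bit more concrete, here is a small sketch. The class name `Adder` is made up for illustration, and the comments show roughly what `javap -c` lists for the method body after compiling with javac. A compiler for another language only has to produce an equivalent class file.

```java
// Hypothetical example class; a compiler for any other language could emit
// an equivalent class file without any Java source being involved.
final class Adder {
    // javac turns this method body into four stack-machine instructions,
    // which `javap -c Adder` lists roughly as:
    //   iload_0   // push the first int argument
    //   iload_1   // push the second int argument
    //   iadd      // pop both, push their sum
    //   ireturn   // return the int on top of the stack
    static int add(int a, int b) {
        return a + b;
    }

    public static void main(String[] args) {
        System.out.println(add(2, 3)); // prints 5
    }
}
```

The JVM only ever sees the four instructions, not the `return a + b;` line they came from.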

You can find more details about this in the following articles / threads:

I am not aware of any fundamental changes in the Java 5 or 6 JVMs which would have made it possible or easier for (code compiled from) other languages to run on it. In my understanding the JVM 1.4 was more or less as capable in that respect as JVM 6 (there may be differences though; I am not a JVM expert). It was just that people started to develop other languages and/or bytecode compilers in the first half of the decade, and the results started to appear (and become more widely known) around 2006, when Java 6 was released.

However, all these JVM versions share a limitation: the JVM is statically typed by nature and, up to release 7, had no direct support for dynamically typed languages. Such languages could still run on it, but their compilers had to work around the static typing. This changed with the introduction of invokedynamic, a new bytecode instruction which lets the target of a method invocation be resolved at run time, relying on dynamic type checking.
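invokedynamic cannot be written directly in Java source, but the `java.lang.invoke` API that arrived alongside it in Java 7 gives a feel for the idea: the target method is looked up by name against the receiver's run-time class instead of being fixed at compile time. A minimal sketch follows; the names `DynamicCallSketch` and `callByName` are made up for illustration, and a real dynamic-language runtime would presumably do this linking once per call site rather than on every call.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.ArrayList;
import java.util.List;

final class DynamicCallSketch {
    // Hypothetical helper: call a public no-argument method, chosen by name,
    // on whatever class the receiver turns out to have at run time.
    static Object callByName(Object receiver, String name, Class<?> returnType) {
        try {
            MethodHandle handle = MethodHandles.publicLookup()
                    .findVirtual(receiver.getClass(), name, MethodType.methodType(returnType));
            return handle.invoke(receiver);
        } catch (Throwable t) {
            throw new RuntimeException("could not link " + name, t);
        }
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>();
        list.add(1);
        list.add(2);
        list.add(3);
        System.out.println(callByName("hello", "length", int.class)); // String.length()
        System.out.println(callByName(list, "size", int.class));      // ArrayList.size()
    }
}
```

The same `callByName` call works for a `String` and an `ArrayList` because nothing about the receiver's type is decided until the call actually happens.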

A virtual machine, like the JVM, is a program that accepts as input (usually from files) a set of simple instructions that are usually easy to convert to real CPU instructions, and then interprets them or compiles and runs them as native CPU instructions (usually using an on-demand, just-in-time (JIT) compiler such as the one in HotSpot).

It's essentially a layer of abstraction. It's usually much easier to port a VM's instruction set implementation to different processor architectures, because the instruction sets share several traits (such as being stack based). It's also much easier to port different programming languages to VM instructions, since they are more oriented toward modern programming languages than primitive CPU instructions are. Many virtual machines, such as the JVM and the CLR (.NET), contain instructions for calling virtual methods and creating object instances.
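A toy illustration of the stack-based style mentioned above. This is not the JVM's instruction set: the opcodes `PUSH`, `ADD` and `MUL` and the class name `TinyStackVm` are invented for the sketch, and it only interprets, with no JIT compilation.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class TinyStackVm {
    // Invented opcodes for this sketch only.
    static final int PUSH = 0, ADD = 1, MUL = 2;

    // Executes a flat int[] program; PUSH is followed by its operand.
    static int run(int[] code) {
        Deque<Integer> stack = new ArrayDeque<>();
        int pc = 0; // index of the next instruction
        while (pc < code.length) {
            int op = code[pc++];
            switch (op) {
                case PUSH:
                    stack.push(code[pc++]);
                    break;
                case ADD:
                    stack.push(stack.pop() + stack.pop());
                    break;
                case MUL:
                    stack.push(stack.pop() * stack.pop());
                    break;
                default:
                    throw new IllegalArgumentException("unknown opcode " + op);
            }
        }
        return stack.pop(); // the result is whatever is left on top
    }

    public static void main(String[] args) {
        // (2 + 3) * 4
        int[] program = {PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL};
        System.out.println(run(program)); // prints 20
    }
}
```

Porting this "VM" to another processor means porting only the `run` loop; every program written for it keeps working unchanged.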

So let's take a language for example. Call it MyLanguage. Since it is a programming language, it ultimately compiles down to the instructions of some CPU architecture.
That means that, given a compatible, flexible virtual machine instruction set, it's also possible to compile MyLanguage down to that VM's instructions.
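As a rough sketch of that idea, assume MyLanguage is nothing more than integer expressions with `+` and `*`, and assume an imaginary stack-based VM that understands `PUSH n`, `ADD` and `MUL`. Both the language subset and the instruction names are invented here. A compiler for it can then be a few dozen lines:

```java
import java.util.ArrayList;
import java.util.List;

final class MyLanguageCompiler {
    private final String src;
    private int pos = 0;
    private final List<String> out = new ArrayList<>();

    private MyLanguageCompiler(String src) {
        this.src = src.replace(" ", ""); // ignore spaces
    }

    // Hypothetical entry point: "2 + 3 * 4" -> instructions for the imaginary VM.
    static List<String> compile(String source) {
        MyLanguageCompiler c = new MyLanguageCompiler(source);
        c.expr();
        if (c.pos != c.src.length()) {
            throw new IllegalArgumentException("unexpected input at " + c.pos);
        }
        return c.out;
    }

    // expr := term ('+' term)*
    private void expr() {
        term();
        while (peek() == '+') {
            pos++;
            term();
            out.add("ADD");
        }
    }

    // term := number ('*' number)*
    private void term() {
        number();
        while (peek() == '*') {
            pos++;
            number();
            out.add("MUL");
        }
    }

    private void number() {
        int start = pos;
        while (Character.isDigit(peek())) pos++;
        if (start == pos) throw new IllegalArgumentException("number expected at " + pos);
        out.add("PUSH " + src.substring(start, pos));
    }

    private char peek() {
        return pos < src.length() ? src.charAt(pos) : '\0';
    }

    public static void main(String[] args) {
        System.out.println(compile("2 + 3 * 4"));
        // prints [PUSH 2, PUSH 3, PUSH 4, MUL, ADD]
    }
}
```

A real compiler targeting the JVM does the same kind of translation, only with a full language on the input side and class files on the output side.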

There is always the question of efficiency, since you might need to hack some workarounds into the VM instructions that you wouldn't need natively, but it's still possible.

A JVM is a Turing-complete computing machine (except for limited memory), and any Turing-complete machine (physical or virtual) can execute any programming language (except for memory, performance and physical IO limitations).

Compilers and interpreters, themselves, can run on Turing machines (maybe slowly). Perhaps some pre-compilation/translation steps can improve the performance of running some given program in some given language?
– hotpaw2, Jun 7 '12 at 16:38

My point was that your statement "any Turing-complete machine (physical or virtual) can execute any programming language" literally means that the x86 CPU of my laptop can directly execute this nice Java source file I am working on right now. Or machine code for PowerPC processors. Without compilers - CPUs don't contain compilers right? :-)
– Péter Török, Jun 7 '12 at 16:50

@PéterTörök I see your point. He didn't elaborate on VMs like we did. But I think his answer still briefly answers the OP's question. The JVM can "run" other programming languages because it can "run" any programming language, because it is Turing complete. Not elaborate maybe, but still a concise and valid point. :)
– Yam Marcovic, Jun 7 '12 at 17:38

For a moment, just think of the JVM as a processor with its own instruction set, like the x86. That processor can execute, say, C code that has been compiled into its machine language. Applying the same analogy to the JVM, other languages can be executed on the JVM just like on other processors, provided they are compiled down to the machine instructions of the JVM. The JVM can then run these instructions for language X.