Wednesday, June 11, 2014

One of the first questions a recent graduate who knows C or C++ and has just started learning Java asks is whether Java is a compiled language or an interpreted one. In academic courses or during college, students learn many languages, e.g. VB, C and C++, and happily categorize each as either compiled or interpreted, but with Java it's tricky. It's not obvious whether Java is compiled or interpreted, because compiling a source file does not generate machine code, and the source file is not interpreted line by line either. To answer this question you first need to know that Java is a platform-independent language, which means you can run a Java program on any platform (hardware plus operating system) that has a Java virtual machine, without any modification. Knowing how Java achieves platform independence is the key to answering this question.

If anyone asks this question during an interview, your answer should be both, i.e. Java is both a compiled and an interpreted programming language. Java code is written in .java files (also known as source files), which are compiled by javac, the Java compiler, into class files. Unlike a C or C++ compiler, the Java compiler doesn't generate native code. These class files contain bytecode, which is different from machine or native code. The Java virtual machine, or JVM, interprets the bytecode during execution of a Java program. So you can see it's both a compiled and an interpreted language, but this answer is incomplete until you mention the JIT (Just-in-Time) compiler, which does another round of compilation to produce native code that can be executed directly by the corresponding platform. We will learn how the JIT works in the next section.
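To see that javac produces bytecode rather than native code, here is a minimal sketch that compiles a one-line class through the JDK's `javax.tools` compiler API and reads the first four bytes of the resulting class file. The class name `ClassFileDemo`, the helper `compileAndReadMagic` and the sample `Order` source are made up for illustration, and the sketch assumes it runs on a full JDK, where a system compiler is available.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

// Hypothetical demo class, not part of any library.
final class ClassFileDemo {

    /** Compiles a tiny source file and returns the first four bytes of its class file as hex. */
    static String compileAndReadMagic() {
        try {
            // The temporary directory is left in place so the files can be inspected afterwards.
            Path dir = Files.createTempDirectory("classfile-demo");
            Path source = dir.resolve("Order.java");
            String code = "public class Order { int quantity = 1; }";
            Files.write(source, code.getBytes(StandardCharsets.UTF_8));

            // Same compiler that the javac command uses; null on a runtime without a compiler.
            JavaCompiler javac = ToolProvider.getSystemJavaCompiler();
            if (javac == null) {
                throw new IllegalStateException("No system compiler found; run this on a JDK");
            }
            int exitCode = javac.run(null, null, null, "-d", dir.toString(), source.toString());
            if (exitCode != 0) {
                throw new IllegalStateException("javac failed with exit code " + exitCode);
            }

            // Every class file starts with the same four-byte magic number.
            byte[] classFile = Files.readAllBytes(dir.resolve("Order.class"));
            return String.format("%02X%02X%02X%02X",
                    classFile[0], classFile[1], classFile[2], classFile[3]);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // prints "Order.class starts with 0xCAFEBABE"
        System.out.println("Order.class starts with 0x" + compileAndReadMagic());
    }
}
```

The magic number is the same on every platform, which is the point: the class file is a platform-neutral format for the JVM, not an executable for a particular CPU. Running `javap -c Order` on the generated class file shows the bytecode instructions themselves.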

Java is both a Compiled and an Interpreted Language, but how?

The above answer is absolutely correct, but it's not complete. What javac (the Java compiler, which ships with the JDK) does is a pseudo compilation: it doesn't convert Java source code into native code that can be executed directly by the CPU. The real compilation into native code is done by another program called the Just-in-Time compiler, also known as the JIT. This is actually an optimization built into the JVM by Java platform engineers. While the JVM interprets Java bytecode, it also gathers useful statistics, such as which parts of the code are hot, i.e. run very frequently. Once the JVM has enough data to make such a decision, the JIT can compile that part of the code, e.g. a method or a block, into native code. This native code is then executed directly by the machine, without being interpreted by the JVM. The JIT provides a significant performance boost to Java applications, and this is one more reason why Java is also used to write high-performance applications like electronic trading systems, algorithmic gateways etc., alongside native languages like C and C++.
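As a rough illustration of "hot" code, the sketch below calls one small method many times and then asks the running JVM about its JIT compiler through the standard `java.lang.management` API. The class and method names (`HotLoopDemo`, `sumOfSquares`, `runHotLoop`) and the loop counts are arbitrary choices for this example. Whether and when the method actually gets compiled depends on the JVM you run it on.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical demo class; names and loop counts are illustrative only.
final class HotLoopDemo {

    /** A small method that becomes "hot" when it is called many times. */
    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += (long) i * i;
        }
        return total;
    }

    /** Calls the method repeatedly so the JVM has a reason to compile it. */
    static long runHotLoop(int calls) {
        long checksum = 0;
        for (int c = 0; c < calls; c++) {
            checksum += sumOfSquares(1_000);
        }
        return checksum;
    }

    public static void main(String[] args) {
        System.out.println("checksum = " + runHotLoop(20_000));

        // May be null if this JVM has no compilation system at all.
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        if (jit == null) {
            System.out.println("This JVM reports no JIT compiler");
            return;
        }
        System.out.println("JIT compiler: " + jit.getName());
        if (jit.isCompilationTimeMonitoringSupported()) {
            System.out.println("Approximate time spent compiling (ms): "
                    + jit.getTotalCompilationTime());
        }
    }
}
```

On a HotSpot JVM you can run this with `java -XX:+PrintCompilation HotLoopDemo` to watch methods being compiled, or with `java -Xint HotLoopDemo` to force pure interpretation and compare the running time.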

In short, Java is both a compiled and an interpreted language. It is compiled by javac and the JIT, and interpreted by the Java virtual machine. Here is the sequence of things that normally happens from writing to executing a Java program:

1) The programmer writes source code and stores it in a .java file. Always remember that the name of your Java source file must match the public class declared inside that file; for example, if there is a public class called Order inside the file, then its name must be Order.java.

2) The Java compiler, javac, compiles the source file into class files, which contain bytecode.

3) The JVM executes these class files and gathers statistics about the execution run. These statistics are used to determine hot spots, i.e. the parts of your code where the program spends most of its execution time.

4) After a certain threshold, when the JVM has enough data to make a decision, the JIT compiles frequently used bytecode into native code, which is then executed directly by the platform. This provides a performance boost to the Java application.
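The last two steps can be pictured with a toy model. This is only a sketch of the idea of counting calls and switching paths after a threshold; it is not how any real JVM is implemented, and the class `ToyJit`, its `execute` method and the threshold of 3 are all invented for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model only: a real JVM's profiling and compilation are far more involved.
final class ToyJit {

    // Hypothetical threshold, chosen small so the switch is easy to see.
    static final int THRESHOLD = 3;

    private final Map<String, Integer> callCounts = new HashMap<>();
    private final Set<String> compiledMethods = new HashSet<>();

    /** Pretends to run a method and reports which path was used for this call. */
    String execute(String methodName) {
        if (compiledMethods.contains(methodName)) {
            return "native";                       // already compiled: skip the interpreter
        }
        int calls = callCounts.merge(methodName, 1, Integer::sum);
        if (calls >= THRESHOLD) {
            compiledMethods.add(methodName);       // hot enough: "compile" and cache it
        }
        return "interpreted";                      // this call still ran in the interpreter
    }

    public static void main(String[] args) {
        ToyJit jvm = new ToyJit();
        // Calls 1 to 3 print "interpreted"; calls 4 and 5 print "native".
        for (int i = 1; i <= 5; i++) {
            System.out.println("call " + i + ": " + jvm.execute("Order.total"));
        }
    }
}
```

A method that is called only once or twice never reaches the threshold and stays on the interpreted path, which mirrors the point that only frequently used code is worth compiling.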

That's all folks. You can now say that Java is both a compiled and an interpreted language. Some people even call Java a dynamically compiled language, referring to the compilation of bytecode into native code at runtime. While answering this question during an interview, always remember to mention JIT compilation, as converting Java source code to class files is not really compilation in the C and C++ sense, where the compiler produces native code.

7 comments:

Hi Javin, I needed a clarification regarding the JIT compiler. The JIT compiler converts the bytecode of the frequently used parts of the code into native code, which does not need to be interpreted by the JVM? Is this part cached or something for future executions? What about the remaining part of the code? Is that simply converted by the JVM from bytecode to machine-level instructions? I am a little unclear about what you mean by native code v/s machine-level instructions to signify the importance of the JIT compiler. I suppose they are the same. Could you please elaborate on this?

This article is really examining the wrong question. The only reasonable question is "Is Java an interpreted language?" This is asking whether Java programs are compiled to machine code or to an intermediate form that must be interpreted by a program (called an interpreter) running on the machine.

Virtually all programming languages are compiled. Directly interpreting programming language source code would be so inefficient as to be useless. The only valid question is what the language is compiled to.

IMO there are two ways in which the JVM optimizes bytecode. One is using a JIT compiler, in which it compiles every method it encounters for the first time and then uses that compiled code for execution, but this takes up a lot of memory. Another technique it uses is called 'adaptive optimization', in which it identifies the code that is executed frequently and optimizes that bytecode. This saves a lot of time.

I completely agree with Paul Topping and I have one query based on following statements:

>>> JVM interprets byte codes during execution of Java program.
>>> JIT (Just in time compiler) which does another round of compilation to produce native code.

Are you suggesting that during execution there is part of byte-code that gets interpreted by JVM while for some part of byte-code JVM decides to use JIT to convert it to native code and run it directly on machine?

If that's true then that means that for a given java application a part of code might be getting interpreted while part of it might be directly getting executed on machine. Is my understanding correct here?

@Sachin Tiwari, the above article is correct. When the JVM executes bytecode, it uses a component called the "profiler". The role of the profiler is to identify blocks of code that are executed frequently. To identify such blocks, the profiler uses a counter. When this counter reaches a threshold value for a piece of code, the JIT compiler comes into action, compiles that piece of code into native machine code and caches it for future use.