Understanding the Flex Compiler

STOP! Before you continue, please make sure that you have downloaded the compiler source code from the Adobe open source site. It also helps a lot to have Eclipse set up so that you can browse the source files easily.

As I mentioned in my previous post, the Flex compiler supports multiple programming languages. These languages are compiled by a family of language-specific compilers. In Eclipse, open the Open Type dialog and type “Compiler” (see below); you should see the compiler classes for the supported languages.

Obviously, one cannot assume that every programming language can be compiled in one step. For example, an MXML component may use components written in AS3. Before compiling that MXML component down to bytecode, the compiler must locate the AS3 components it depends on and verify that the methods the MXML component invokes are used correctly.

It is also reasonable to assume that different languages may not be compiled in the same number of steps. For example, MXML needs twice as many steps as AS3.

In the Flex compiler, all the language-specific compilers implement the same compiler contract, which is the flex2.compiler.Compiler interface (see below).

As you can see, apart from the isSupported() and getSupportedMimeTypes() methods, which tell the top-level Flex compiler which languages a Compiler instance supports, compilation is a 9-step process.
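To make the shape of that contract a bit more concrete, here is a rough sketch in Java. Only the method names come from the description above. The names CompilerContract, SourceUnit and RecordingCompiler, and all parameter and return types, are placeholders I made up for illustration; they are not the real flex2.compiler signatures, so please check the actual interface in the source.

```java
import java.util.ArrayList;
import java.util.List;

// Placeholder for whatever is handed from phase to phase (hypothetical type).
class SourceUnit {
    final String name;
    final String mimeType;

    SourceUnit(String name, String mimeType) {
        this.name = name;
        this.mimeType = mimeType;
    }
}

// Sketch of the contract: method names follow the post, signatures are simplified guesses.
interface CompilerContract {
    // Used by the coordinator to find out what this compiler can handle.
    boolean isSupported(String mimeType);
    String[] getSupportedMimeTypes();

    // The nine compilation steps.
    void preprocess(SourceUnit unit);
    void parse1(SourceUnit unit);
    void parse2(SourceUnit unit);
    void analyze1(SourceUnit unit);
    void analyze2(SourceUnit unit);
    void analyze3(SourceUnit unit);
    void analyze4(SourceUnit unit);
    void generate(SourceUnit unit);
    void postprocess(SourceUnit unit);
}

// Toy implementation that does no real work and only records which step ran.
class RecordingCompiler implements CompilerContract {
    final List<String> log = new ArrayList<>();
    private final String mimeType;

    RecordingCompiler(String mimeType) {
        this.mimeType = mimeType;
    }

    public boolean isSupported(String mimeType) { return this.mimeType.equals(mimeType); }
    public String[] getSupportedMimeTypes() { return new String[] { mimeType }; }

    public void preprocess(SourceUnit unit) { log.add("preprocess"); }
    public void parse1(SourceUnit unit) { log.add("parse1"); }
    public void parse2(SourceUnit unit) { log.add("parse2"); }
    public void analyze1(SourceUnit unit) { log.add("analyze1"); }
    public void analyze2(SourceUnit unit) { log.add("analyze2"); }
    public void analyze3(SourceUnit unit) { log.add("analyze3"); }
    public void analyze4(SourceUnit unit) { log.add("analyze4"); }
    public void generate(SourceUnit unit) { log.add("generate"); }
    public void postprocess(SourceUnit unit) { log.add("postprocess"); }
}
```

A language-specific compiler would put its real work inside each of these methods; the recording version is only there so the call order can be inspected.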

The top-level Flex compiler acts as a coordinator. It is responsible for invoking the Compiler instances. The Compiler instances can expect the call sequence to look something like this:

preprocess (once)

parse1 (once)

parse2 (once)

analyze1 (once)

analyze2 (once)

analyze3 (once)

analyze4 (once)

generate (once)

postprocess (multiple times until the Compiler instance requests a stop)
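The call sequence above, for a single source file, can be sketched roughly like this. Phase, PhaseRunner and PhaseDriver are hypothetical names, and I am assuming that postprocess is called at least once and that the Compiler instance signals "call me again" through a boolean return value; the real mechanism may well differ.

```java
import java.util.ArrayList;
import java.util.List;

// The nine steps, in the order the coordinator is expected to call them.
enum Phase {
    PREPROCESS, PARSE1, PARSE2, ANALYZE1, ANALYZE2, ANALYZE3, ANALYZE4, GENERATE, POSTPROCESS
}

// Hypothetical callback standing in for a Compiler instance.
// The return value only matters for POSTPROCESS: true means "call me again".
interface PhaseRunner {
    boolean run(Phase phase);
}

class PhaseDriver {
    // Runs every step once, in order, then repeats POSTPROCESS until the runner asks to stop.
    static List<Phase> drive(PhaseRunner runner) {
        List<Phase> calls = new ArrayList<>();
        for (Phase phase : Phase.values()) {
            if (phase == Phase.POSTPROCESS) {
                break; // handled separately below
            }
            runner.run(phase);
            calls.add(phase);
        }
        boolean again;
        do {
            again = runner.run(Phase.POSTPROCESS);
            calls.add(Phase.POSTPROCESS);
        } while (again);
        return calls;
    }
}

class PhaseDriverDemo {
    public static void main(String[] args) {
        int[] extraPasses = {2}; // pretend postprocess wants two more passes after the first
        List<Phase> calls = PhaseDriver.drive(
            phase -> phase == Phase.POSTPROCESS && extraPasses[0]-- > 0);
        System.out.println(calls);
    }
}
```

In the real compiler this loop is interleaved across many files, which is where the two batch algorithms described further down come in.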

When the top-level Flex compiler is not calling these methods, it does a number of other things, including (but not limited to):

picks the appropriate Compiler instance based on the source file type;

learns about unresolved types from the Compiler instances and searches for them in the source-path and library-path;

passes type information to the Compiler instances;

decides which Compiler instance should proceed next, based on the states of the Compiler instances and the overall resource allocation situation.
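Two of these coordinator duties, picking a Compiler by source file type and looking up unresolved types, can be sketched as below. CompilerPicker and TypeLocator are hypothetical names, the MIME strings in the usage note are only illustrative, and the "source-path first, then library-path" lookup order is my assumption rather than something stated above.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

class CompilerPicker {
    // Registered compilers in registration order: compiler name -> MIME types it supports.
    private final Map<String, List<String>> compilers = new LinkedHashMap<>();

    void register(String compilerName, String... mimeTypes) {
        compilers.put(compilerName, Arrays.asList(mimeTypes));
    }

    // Returns the first registered compiler that claims to support the given MIME type.
    Optional<String> pick(String mimeType) {
        return compilers.entrySet().stream()
            .filter(entry -> entry.getValue().contains(mimeType))
            .map(Map.Entry::getKey)
            .findFirst();
    }
}

class TypeLocator {
    // Reports where an unresolved type was found, or empty if it is in neither path.
    // Assumption: the source-path is consulted before the library-path.
    static Optional<String> locate(String typeName, Set<String> sourcePath, Set<String> libraryPath) {
        if (sourcePath.contains(typeName)) {
            return Optional.of("source-path");
        }
        if (libraryPath.contains(typeName)) {
            return Optional.of("library-path");
        }
        return Optional.empty();
    }
}
```

For example, after picker.register("mxml", "text/mxml"), a call to picker.pick("text/mxml") yields the "mxml" entry, while an unregistered type yields an empty Optional, which the coordinator would presumably report as an unsupported file.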

In order for the top-level Flex compiler to run the show, it requires the Compiler instances to cooperate. Basically, it requires them to produce certain types of information at certain points. For example:

A syntax tree must be available at the end of parse2().

analyze1() must identify the superclass name.

analyze2() must identify the remaining dependencies.

analyze4() must make the fully-resolved type info available.
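One way to picture these obligations is as a small state object that the coordinator inspects after each step. This is purely an illustration: UnitState, its fields and missingAfter() are names I invented, and the real compiler tracks this information in its own data structures.

```java
import java.util.List;

// Hypothetical per-file state that a Compiler instance fills in as it goes.
class UnitState {
    Object syntaxTree;          // expected by the end of parse2()
    String superclassName;      // expected by the end of analyze1()
    List<String> dependencies;  // expected by the end of analyze2()
    Object typeInfo;            // expected by the end of analyze4()

    // Returns null when the step kept its promise, otherwise a description of what is missing.
    String missingAfter(String step) {
        switch (step) {
            case "parse2":
                return syntaxTree == null ? "syntax tree" : null;
            case "analyze1":
                return superclassName == null ? "superclass name" : null;
            case "analyze2":
                return dependencies == null ? "remaining dependencies" : null;
            case "analyze4":
                return typeInfo == null ? "fully-resolved type info" : null;
            default:
                return null; // no requirement is modeled for the other steps
        }
    }
}
```

A coordinator built this way could call missingAfter("parse2") right after parse2() returns and treat a non-null answer as a broken contract.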

The top-level Flex compiler continues to run the compilation process until:

It no longer needs to look for new dependencies.

The Compiler instances report errors.

As I mentioned above, the top-level Flex compiler invokes those 9 Compiler methods. Although the call sequence is well-defined, the top-level Flex compiler still has many different ways to get the compilation done. In fact, there are two compiler algorithms in the Flex compiler. One (flex2.compiler.API.batch1()) is structured and the other (flex2.compiler.API.batch2()) is opportunistic.

API.batch1() is a conservative and more structured algorithm. The main characteristic of this algorithm is that it makes sure that the compilation of all the source files reaches the same state before it proceeds to the next state, e.g. analyze1() gets called for all the files before analyze2().

API.batch2() is a more opportunistic algorithm. The goal of this algorithm is to minimize memory usage. Unlike API.batch1(), source files with fewer dependencies can reach generate() well before source files with more dependencies reach analyze3(). The idea is that as soon as a source file has been compiled down to bytecode, the compiler resources allocated to that file (e.g. memory for the syntax tree) can be freed immediately.
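The difference between the two strategies can be sketched with a toy scheduler. This is not the code in flex2.compiler.API: BatchSketch and its methods are hypothetical, the opportunistic version is heavily simplified (it just runs files with fewer dependencies to completion first and ignores any waiting on other files), and I assume a syntax tree is created in parse2 and released at generate under both strategies.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class BatchSketch {
    static final String[] STEPS = {
        "preprocess", "parse1", "parse2", "analyze1", "analyze2",
        "analyze3", "analyze4", "generate", "postprocess"
    };

    // Hypothetical source file: a name plus the number of dependencies it has.
    static class SourceFile {
        final String name;
        final int dependencyCount;

        SourceFile(String name, int dependencyCount) {
            this.name = name;
            this.dependencyCount = dependencyCount;
        }
    }

    // Structured: every file finishes a step before any file starts the next one.
    static List<String> lockstep(List<SourceFile> files) {
        List<String> calls = new ArrayList<>();
        for (String step : STEPS) {
            for (SourceFile file : files) {
                calls.add(file.name + ":" + step);
            }
        }
        return calls;
    }

    // Opportunistic (simplified): files with fewer dependencies run all their steps first.
    static List<String> opportunistic(List<SourceFile> files) {
        List<SourceFile> ordered = new ArrayList<>(files);
        ordered.sort(Comparator.comparingInt((SourceFile file) -> file.dependencyCount));
        List<String> calls = new ArrayList<>();
        for (SourceFile file : ordered) {
            for (String step : STEPS) {
                calls.add(file.name + ":" + step);
            }
        }
        return calls;
    }

    // Counts how many syntax trees are alive at once, assuming a tree is
    // created in parse2 and released at generate.
    static int peakLiveTrees(List<String> calls) {
        int live = 0;
        int peak = 0;
        for (String call : calls) {
            if (call.endsWith(":parse2")) {
                live++;
            }
            if (call.endsWith(":generate")) {
                live--;
            }
            peak = Math.max(peak, live);
        }
        return peak;
    }
}
```

Feeding both call lists to peakLiveTrees() shows why the opportunistic order is attractive for memory: in lockstep every file's tree is alive at the same time, while in this simplified opportunistic order only one is.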

Now, you pretty much know the basics of the Flex compiler infrastructure. Let’s recap:

The top-level Flex compiler uses either the batch1() or the batch2() compiler algorithm to compile.

The compiler algorithms use two different strategies to invoke those 9 methods in the Compiler instances.

The participating Compiler instances must cooperate by providing certain information to the top-level Flex compiler at the end of each Compiler method.

The top-level Flex compiler infrastructure is responsible for the dirty work, such as source-path/library-path searching, hooking up loggers for error reporting, etc. The Compiler implementations do not have to worry about that.

One thing that’s worth noting is that the command-line tools (e.g. mxmlc, compc, asdoc) and the Flex Compiler API all use the same above-mentioned infrastructure and algorithms.

I could say a little more about this topic, but I will stop here and let you read some code. I will continue in the next post, but if you want me to talk about other topics, please drop me a line. Thanks.