PaddlePaddle Fluid: Towards a Compiled Programming Language

As described in fluid.md, when a Fluid application program
runs, it generates a ProgramDesc protobuf message as an intermediate
representation of itself. The C++ class Executor acts as an
interpreter that runs this protobuf message. This article describes
the Fluid compiler.

ProgramDesc

Before we go deeper into the idea of a compiled language, let us take a
look at a simple example Fluid application.
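To keep the discussion concrete, the following is a minimal sketch of how such a program could be modelled, assuming the application reads a minibatch X, creates a weight tensor W, and computes Y = mult(X, W). The class names OpDesc, BlockDesc, and ProgramDesc below are plain Python stand-ins chosen for illustration; they are not the real protobuf schema, and the operator names are assumptions based on the function names mentioned later in this article.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class OpDesc:
    """Illustrative stand-in for one operator entry (not the real protobuf)."""
    type: str
    inputs: List[str] = field(default_factory=list)
    outputs: List[str] = field(default_factory=list)


@dataclass
class BlockDesc:
    """A block holds the variable names it declares and its ordered operators."""
    vars: List[str] = field(default_factory=list)
    ops: List[OpDesc] = field(default_factory=list)


@dataclass
class ProgramDesc:
    """A program is a list of blocks; block 0 is the top-level block."""
    blocks: List[BlockDesc] = field(default_factory=list)


def example_program() -> ProgramDesc:
    """Assumed example: read X, create W, then compute Y = mult(X, W)."""
    block = BlockDesc(
        vars=["X", "W", "Y"],
        ops=[
            OpDesc("read", outputs=["X"]),
            OpDesc("create_tensor", outputs=["W"]),
            OpDesc("mult", inputs=["X", "W"], outputs=["Y"]),
        ],
    )
    return ProgramDesc(blocks=[block])
```

An interpreter in the spirit of Executor would walk `example_program().blocks[0].ops` in order and run each operator; a transpiler would instead rewrite that list.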

Transpilers

We can write a transpiler program that takes a ProgramDesc, e.g.,
the above one, and outputs another ProgramDesc. Let us take some
examples:

Memory optimization transpiler: We can write a transpiler that
inserts some FreeMemoryOps into the above example ProgramDesc so as
to free memory early, before the end of an iteration, and thus keep
the memory footprint small.
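A minimal sketch of this idea, assuming each operator is represented as a plain dict with "type", "inputs", and "outputs" keys (a hypothetical stand-in for the entries of a ProgramDesc block): the pass records the last operator that touches each variable and inserts a "free_memory" op, standing in for FreeMemoryOp, right after it. The `keep` parameter is also an assumption; it lists variables, such as parameters or final results, that must outlive the iteration.

```python
from typing import Any, Dict, List

# Hypothetical stand-in for an operator entry in a ProgramDesc block.
Op = Dict[str, Any]


def insert_free_ops(ops: List[Op], keep: List[str]) -> List[Op]:
    """Return a new op list with a free_memory op after each variable's last use.

    Variables named in `keep` are never freed.
    """
    # Index of the last operator that reads or writes each variable.
    last_use: Dict[str, int] = {}
    for i, op in enumerate(ops):
        for name in list(op["inputs"]) + list(op["outputs"]):
            last_use[name] = i

    result: List[Op] = []
    for i, op in enumerate(ops):
        result.append(op)
        # Free every variable whose last use is this operator.
        for name in sorted(n for n, last in last_use.items() if last == i):
            if name not in keep:
                result.append(
                    {"type": "free_memory", "inputs": [name], "outputs": []}
                )
    return result
```

For a program with ops read, create_tensor, mult, and a following relu that consumes Y, calling `insert_free_ops(ops, keep=["W", "Z"])` frees X right after mult and Y right after relu, rather than at the end of the iteration.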

Distributed training transpiler: We can write a transpiler that
converts a ProgramDesc into its distributed version, consisting of two
ProgramDescs: one to be run by the trainer processes and the
other by the parameter server.
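The split can be sketched very roughly as follows, under strong simplifying assumptions: operators are plain dicts (a hypothetical stand-in for ProgramDesc entries), parameter updates are done by ops of type "sgd" whose outputs are the updated parameters, and the op types "send" and "recv" are placeholders for whatever communication operators a real transpiler would insert. A real distributed transpiler has to handle much more, such as parameter sharding and multiple servers.

```python
from typing import Any, Dict, List, Sequence, Tuple

# Hypothetical stand-in for an operator entry in a ProgramDesc block.
Op = Dict[str, Any]


def split_program(
    ops: List[Op], optimizer_types: Sequence[str] = ("sgd",)
) -> Tuple[List[Op], List[Op]]:
    """Split one op list into (trainer_ops, pserver_ops).

    Optimizer ops move to the parameter server. In their place, the trainer
    sends the gradients and receives the updated parameters.
    """
    trainer_ops: List[Op] = []
    pserver_ops: List[Op] = []
    for op in ops:
        if op["type"] not in optimizer_types:
            trainer_ops.append(op)
            continue
        # Assumption: the optimizer's outputs are the parameters it updates,
        # and its remaining inputs are the gradients computed by the trainer.
        params = list(op["outputs"])
        grads = [name for name in op["inputs"] if name not in params]
        trainer_ops.append({"type": "send", "inputs": grads, "outputs": []})
        trainer_ops.append({"type": "recv", "inputs": [], "outputs": params})
        pserver_ops.append(op)
    return trainer_ops, pserver_ops
```

The two returned lists correspond to the two ProgramDescs described above: one for the trainer processes and one for the parameter server.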

In the rest of this article, we talk about a special kind of
transpiler, the native code generator, which takes a ProgramDesc and
generates a .cu (or .cc) file that can be built by C++
compilers (gcc, nvcc, icc) into binaries.
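To illustrate what "generating a source file" might mean in the simplest case, here is a sketch of a generator that walks a list of operators and emits the text of a C++ main function with one call per operator. The dict-based operator representation is a hypothetical stand-in for ProgramDesc, the `fluid_cuda_` prefix follows the function names used in this article, and the sketch assumes every operator has at most one output and takes only variables as arguments.

```python
from typing import Any, Dict, List

# Hypothetical stand-in for an operator entry in a ProgramDesc block.
Op = Dict[str, Any]


def generate_main(ops: List[Op], prefix: str = "fluid_cuda_") -> str:
    """Emit C++ source text for a main function, one call per operator."""
    lines = ["int main() {"]
    for op in ops:
        args = ", ".join(op["inputs"])
        call = f"{prefix}{op['type']}({args})"
        if op["outputs"]:
            # Assumption: a single output per operator keeps the sketch short.
            lines.append(f"  auto {op['outputs'][0]} = {call};")
        else:
            lines.append(f"  {call};")
    lines.append("  return 0;")
    lines.append("}")
    return "\n".join(lines) + "\n"
```

Writing the returned string to a .cu or .cc file, together with definitions of the called functions, would give the C++ compiler something to build. The sketch only covers the call sequence, not those definitions.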

Native Code Generator

For the above example, the native code generator transpiler, say, the
CUDA code generator, should generate a main function:

and the definitions of the functions fluid_cuda_read,
fluid_cuda_create_tensor, and fluid_cuda_mult. Note that each of
these functions could simply define a C++ instance of an operator and
run it. For example