An Introduction to Metaprogramming

A metaprogram is a program that generates other programs or program
parts. Hence, metaprogramming means writing metaprograms. Many
useful metaprograms are available for Linux; the most common ones
include compilers (GCC or a FORTRAN 77 compiler), interpreters (Perl
or Ruby), parser generators (Bison), assemblers (AS or NASM) and
preprocessors (CPP or M4). Typically, you use a metaprogram to
eliminate or reduce a tedious or error-prone programming task. So, for
example, instead of writing a machine code program by hand, you would
use a high-level language, such as C, and then let the C compiler
translate it into the equivalent low-level machine instructions.

At first, metaprogramming may seem to be an advanced topic, suitable
only for programming language gurus, but it's not really that
difficult once you know how to use the right tools.

Source Code Generation

In order to present a very simple example of metaprogramming, let's
assume the following totally fictional situation.

Erika is a very smart first-year undergraduate computer science
student. She already knows several programming languages, including C
and Ruby. During her introductory programming class, Professor Gomez,
the course instructor, caught her chatting on her laptop computer. As
punishment, he demanded Erika write a C program that printed the
following 1,000 lines of text:

1. I must not chat in class.
2. I must not chat in class.
...
999. I must not chat in class.
1000. I must not chat in class.

An additional imposed restriction was that the program could not use
any kind of loop or goto instruction. It should contain only one big
main function with 1,000 printf instructions—something like this:

#include <stdio.h>

int main(void) {
    printf("1. I must not chat in class.\n");
    printf("2. I must not chat in class.\n");
    /* 996 printf instructions omitted. */
    printf("999. I must not chat in class.\n");
    printf("1000. I must not chat in class.\n");
    return 0;
}

Professor Gomez wasn't too naive, so he expected Erika to write the
printf instruction once, copy it to the clipboard, paste it 999 times,
and manually change the numbers. He expected that even this amount of
irksome, repetitive work would be enough to teach her a lesson. But
Erika immediately saw an easy way out: metaprogramming. Instead of
writing this program by hand, why not write another program that writes
it automatically for her? So, she wrote the following Ruby script:
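
A minimal sketch of what such a script might look like is shown here.
It is an illustration rather than Erika's exact code: the helper name
punishment_source is a hypothetical choice, while the output file name
punishment.c and the sentence text come from the story.

```ruby
# Build the C source text for a program containing `count` printf calls.
# (punishment_source is a hypothetical helper name used for illustration.)
def punishment_source(count)
  lines = ['#include <stdio.h>', 'int main(void) {']
  1.upto(count) do |i|
    # \" emits a double quote and \\n emits a literal backslash-n,
    # so the generated C string ends with a newline escape.
    lines << "    printf(\"#{i}. I must not chat in class.\\n\");"
  end
  lines << '    return 0;'
  lines << '}'
  lines.join("\n") + "\n"
end

# Write the generated program to disk.
File.write('punishment.c', punishment_source(1000))
```

The loop lives in the Ruby script, not in the generated C file, so the
resulting program still satisfies the no-loop restriction.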

This code creates a file called punishment.c with the expected
1,000+ lines of C source code.

Although this example might seem a bit contrived, it illustrates how
easy it is to write a program that produces the source of another
program. This technique can be used in more realistic settings. Let's
say that you have a C program that needs to include a PNG image, but
for some reason, the deployment platform can accept only one file, the
executable itself. Thus, the data that makes up the PNG file has to be
integrated within the program code. To achieve this, we can read the
PNG file beforehand and generate the C source text for an array
declaration, initialized with the corresponding data as literal
values. This Ruby script does exactly that:
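
One way to write such a script is sketched below. It is a hedged
illustration, not necessarily the original listing: the helper name
c_array_source is hypothetical, and the check for the input file is an
added convenience so the sketch does nothing when ljlogo.png is absent.

```ruby
# Render `bytes` (an array of integers 0..255) as a C array declaration
# named `name`, with eight two-digit hexadecimal values per line.
# (c_array_source is a hypothetical helper name used for illustration.)
def c_array_source(name, bytes)
  rows = bytes.each_slice(8).map do |row|
    '    ' + row.map { |b| format('0x%02X', b) }.join(', ')
  end
  # Joining with ",\n" puts a comma after every element except the last.
  "unsigned char #{name}[] = {\n" + rows.join(",\n") + "\n};\n"
end

# File and variable names follow the article's description.
input_name = 'ljlogo.png'
output_name = 'ljlogo.h'

if File.exist?(input_name)
  # binread returns the raw bytes; unpack('C*') converts each one
  # to an unsigned 8-bit integer.
  bytes = File.binread(input_name).unpack('C*')
  File.write(output_name, c_array_source('ljlogo', bytes))
end
```

Keeping the formatting in a separate function makes it easy to check
the output on a short byte array before running it on a real image.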

This script reads the file called ljlogo.png and creates a new
output file called ljlogo.h. First, it writes the declaration of the
variable ljlogo as an array of unsigned characters. Next, it reads
the whole input file at once and unpacks every input character as
an unsigned byte. Then, it writes each input byte as a two-digit
hexadecimal number, in groups of eight elements per line. As you would
expect, every element is followed by a comma, except the last
one. Finally, the script writes the closing brace and semicolon. Here
is a possible output file sample:

The following C program demonstrates how you could use the generated
code as an ordinary C header file. It's important to note that the PNG
file data will be stored in memory when the program itself is loaded:

You also can have a program that both generates source code and executes it
on the spot. Some languages have a facility called eval,
which allows you to translate and execute a piece of source
code contained within a string of characters at runtime. This feature is usually
available in interpreted languages, such as Lisp, Perl, Ruby, Python and
JavaScript. In this Ruby code:
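
As a minimal sketch of the idea (the helper name eval_with_x and the
expression string are illustrative assumptions), Ruby's built-in eval
can run code held in a string like this:

```ruby
# Evaluate `source` as Ruby code with a local variable x in scope.
# (eval_with_x is a hypothetical helper name used for illustration.)
def eval_with_x(x, source)
  # Kernel#eval parses the string and runs it in the current scope,
  # so the evaluated code can read the local variable x.
  eval(source)
end

# The expression is ordinary string data until eval executes it.
puts eval_with_x(3, 'x + 1')
```

Because eval runs whatever the string contains, it should only be given
text that the program itself has built or otherwise trusts.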