Having armed ourselves to the teeth with information, and having hand-built a few extensions, we are now ready to exploit SWIG and XS to their hilts. In this section, we'll first look at the type of code produced by XS. As it happens, SWIG produces almost identical code, so the explanation should suffice for both tools. Then we will write typemaps and snippets of code to help XS deal with C structures, to wrap C structures with Perl objects, and, finally, to interface with C++ objects. Most of this discussion is relevant to SWIG also, which is why we need study only one SWIG example. That said, take note that the specific XS typemap examples described in the following pages are solved simply and elegantly using SWIG, without the need for user-defined typemaps.

To understand XS typemaps, and the effect of keywords such as CODE and PPCODE, it pays to take a good look at the glue code generated by xsubpp. Consider the following XS declaration of a module, Test, containing a function that takes two arguments and returns an integer:

MODULE = Test PACKAGE = Test

int
func_2_args(a, b)
    int   a
    char* b

xsubpp translates it to the following (comments in italic have been added):

This is practically identical to the code we studied in the section "The Called Side: Hand-Coding an XSUB." Notice how the arguments on the stack are translated into the two arguments a and b. The XS function then calls the real C function, func_2_args, gets its return value, and packages the result back to the argument stack.
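The three steps just described (unpack the arguments, call the real function, put the result back) can be modeled in plain C. The following is only a sketch of that flow, not xsubpp output and not the Perl API: the Scalar struct, the stack array, glue_func_2_args, and the body of func_2_args are all hypothetical stand-ins invented for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the library function being wrapped. */
int func_2_args(int a, char *b)
{
    return a + (int)strlen(b);
}

/* Toy model of a Perl scalar: one integer slot, one string slot. */
typedef struct {
    long  iv;
    char *pv;
} Scalar;

/* Toy model of the glue: returns the number of results left on the
 * stack, or -1 if it was called with the wrong number of arguments. */
int glue_func_2_args(Scalar *stack, int items)
{
    if (items != 2)
        return -1;

    int   a = (int)stack[0].iv;        /* input conversion for "int a"   */
    char *b = stack[1].pv;             /* input conversion for "char* b" */

    int RETVAL = func_2_args(a, b);    /* call the real C function       */

    stack[0].iv = RETVAL;              /* output conversion: the result  */
    stack[0].pv = NULL;                /* replaces the first stack slot  */
    return 1;
}
```

The real glue performs the same conversions using the typemap entries for int and char*, which is why the typemaps discussed next matter so much.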

Let us now add some of the more common XS keywords to see how they are accommodated by xsubpp. The XS snippet

As you can see, the code supplied in PREINIT goes right after the typemaps to ensure that all declarations are complete before the main code starts. The location is important for traditional C compilers, but would not be an issue for ANSI C or C++ compilers, which allow variable declarations anywhere in a block. The INIT section is inserted before the automatically generated call to the function or, in this case, before the CODE section starts. The CODE directive allows us the flexibility of inserting any piece of code; without it, xsubpp would have simply inserted a call to func_with_keywords(a,b), as we saw in the prior example.

The CODE keyword behaves like a typical C call: you can modify input parameters, and you can return at most one parameter. To deal with a variable number of input arguments or output results, you need the PPCODE keyword. To illustrate the implementation of PPCODE, consider a C function, permute, that takes a string, computes all its permutations, and returns a dynamically allocated array of strings (a null-terminated char**). Let's say that we want to access it in Perl as follows:

@list = permute($str);

We use PPCODE here because the function expects to return a variable number of scalars. The following snippet of code shows the XS file:

The PPCODE directive differs from CODE in one small but significant way: it adjusts the stack pointer SP to point to the bottom of the Perl stack frame for this function call (that is, to ST(0)), to enable us to use the XPUSHs macro to extend and push any number of arguments (recall our discussion in the section "Ensuring that the stack is big enough"). We'll shortly see why we cannot do this using typemaps.
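To make the shape of the C side concrete, here is one possible implementation of a permute function with the interface described above. It is a sketch only: the text specifies just the signature and the null-terminated result, so the algorithm, the helper names (permute_rec, count_results, free_results), and the ownership convention (the caller frees every string and the array) are assumptions.

```c
#include <stdlib.h>
#include <string.h>

/* Recursively generate permutations of buf[k..n-1] by swapping. */
static void permute_rec(char *buf, size_t k, size_t n,
                        char **out, size_t *count)
{
    if (k == n) {
        char *copy = malloc(n + 1);
        if (copy == NULL)
            abort();                     /* keep the sketch simple */
        memcpy(copy, buf, n + 1);        /* includes the trailing NUL */
        out[(*count)++] = copy;
        return;
    }
    for (size_t i = k; i < n; i++) {
        char t = buf[k]; buf[k] = buf[i]; buf[i] = t;
        permute_rec(buf, k + 1, n, out, count);
        t = buf[k]; buf[k] = buf[i]; buf[i] = t;   /* undo the swap */
    }
}

/* Returns a NULL-terminated array of n! strings, or NULL on failure. */
char **permute(const char *str)
{
    size_t n = strlen(str), total = 1, count = 0;
    for (size_t i = 2; i <= n; i++)
        total *= i;

    char **out = malloc((total + 1) * sizeof *out);
    char  *buf = malloc(n + 1);
    if (out == NULL || buf == NULL) {
        free(out);
        free(buf);
        return NULL;
    }
    memcpy(buf, str, n + 1);
    permute_rec(buf, 0, n, out, &count);
    out[count] = NULL;                   /* the terminator the text mentions */
    free(buf);
    return out;
}

/* Walk the list until the NULL terminator. A PPCODE body would run the
 * same loop and push one scalar per element instead of counting. */
size_t count_results(char **list)
{
    size_t n = 0;
    while (list[n] != NULL)
        n++;
    return n;
}

void free_results(char **list)
{
    for (size_t i = 0; list[i] != NULL; i++)
        free(list[i]);
    free(list);
}
```

Because the number of results is known only at run time, the wrapper has to loop over the array and grow the Perl stack as it goes, which is exactly the freedom PPCODE provides.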

A typemap is a snippet of code that translates a scalar value on the argument stack to a corresponding C scalar entity (int, double, pointer), or vice versa. A typemap applies only to one direction. It is important to stress here that both the input and the output for a typemap are scalars in their respective domains. You cannot have a typemap take a scalar value and return a C structure, for example; you can, however, have it return a pointer to the structure. This is the reason why the permute example in the preceding section cannot use a typemap. We could write a typemap to convert a char** to a reference to an array and then leave it to the script writer to dereference it. In SWIG, which doesn't support a PPCODE equivalent, this is the only option.

Another constraint of typemaps is that they convert one argument at a time, with blinkers on: you cannot make a decision based on multiple input arguments ("if argument 1 is `foo', then increase argument 2 by 10"), as we mentioned in Chapter 18, Extending Perl: A First Course. XS offers the CODE and PPCODE directives to help you out in this situation, while SWIG doesn't. But recall from the section "Degrees of Freedom" that the two SWIG restrictions mentioned are easily and efficiently taken care of in script space.

While xsubpp is capable of supplying translations for ordinary C arguments, we have to write custom typemaps for all user-defined types. Assume that we have a C library with the following two functions:

As you can see, we need two typemaps: an output typemap for converting a Car* to $car and an input typemap for the reverse direction. We start off by editing a typemap file called typemap,[11] which contains three sections: TYPEMAP, INPUT, and OUTPUT, as follows:

[11] We choose this particular name because the h2xs-generated makefile recognizes it and feeds it to xsubpp. It also allows for multiple typemap files to be picked up from different directories.

The TYPEMAP section creates an easy-to-use alias (CAR_OBJ, in this case) for your potentially complex C type (Car *). The INPUT and OUTPUT sections in the typemap file can now refer to this alias and contain code to transform an object of the corresponding type to a Perl value, or vice versa. When a typemap is used for a particular problem, the marker $arg is replaced by the appropriate scalar on the argument stack, and $var is replaced by the corresponding C variable name. In this example, the output typemap stuffs a Car* into the integer slot of the scalar (recall the discussion in the section "SVs and object pointers").
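The idea of stuffing a pointer into an integer slot and recovering it later can be shown in plain C, without any Perl headers. This is a sketch under stated assumptions: the Car struct, the bodies of new_car and drive, delete_car, and the one-field Scalar model are hypothetical, and car_to_scalar and scalar_to_car merely play the roles of the OUTPUT and INPUT typemap entries.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical library type and functions. */
typedef struct { int speed; } Car;

Car *new_car(void)      { return calloc(1, sizeof(Car)); }
void drive(Car *c)      { c->speed += 10; }
void delete_car(Car *c) { free(c); }

/* Toy model of a scalar with only an integer slot. intptr_t is used
 * because it is guaranteed to be wide enough to hold a pointer. */
typedef struct { intptr_t iv; } Scalar;

/* OUTPUT direction: C variable ($var) into the scalar ($arg). */
void car_to_scalar(Scalar *arg, Car *var)
{
    arg->iv = (intptr_t)var;
}

/* INPUT direction: scalar ($arg) back into a C variable ($var). */
Car *scalar_to_car(const Scalar *arg)
{
    return (Car *)arg->iv;
}
```

The round trip is lossless only if the integer slot is at least as wide as a pointer; a 32-bit integer type would truncate the address on a platform with 64-bit pointers.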

The advantage of the TYPEMAP section's aliases is that multiple types can be mapped to the same alias. That is, a Car* and a Plane* can both be aliased to VEHICLE, and because the INPUT and OUTPUT sections use only the alias, both types end up sharing the same translation code. The Perl distribution comes with a typemap file that supplies all the basic typemaps (see lib/ExtUtils/typemap), and you can freely use one of the aliases defined in that file. For example, you can use the alias T_PTR (instead of CAR_OBJ) and thereby use the corresponding INPUT and OUTPUT sections for that alias. In other words, our typemap file need simply say:

TYPEMAP
Car * T_PTR

It so happens that the INPUT and OUTPUT sections for T_PTR look identical to those shown above for CAR_OBJ.

Let us say we want to give the script writer the ability to write something like the following, without changing the C library in any way:

$car = Car::new_car(); # As before
$car->drive();

In other words, the OUTPUT section of our typemap needs to convert a Car* (returned by new_car) to a blessed scalar reference, as discussed in the section "SVs and object pointers." The INPUT section contains the inverse transformation:

sv_setref_iv gives an integer to a freshly allocated SV and converts the first argument into a reference, points it to the new scalar, and blesses it in the appropriate module (refer to Table 20.1). In this example, we cast the pointer to an I32, and make the function think we are supplying an integer.

The typemap in the preceding example is restricted to objects of type Car only. We can use the TYPEMAP section's aliasing capability to generalize this typemap and accommodate any object pointer. Consider the following typemap, with changes highlighted:

All we have done is generalize the alias, the cast, and the class name. $type is the type of the current C object (the left-hand side of the alias in the TYPEMAP section), so in this case it is Car*. Because we want to make the class name generic, we adopt the strategy used in Chapter 7, Object-Oriented Programming, and ask the script user to use the arrow notation:

$c = Car->new_car();

This invocation supplies the name of the module as the first parameter, which we capture in the CLASS argument in the XS file:

Car *
new_car (CLASS)
    char *CLASS

The only thing remaining is that we would like the user to say Car->new instead of Car->new_car. Just because C doesn't have polymorphism doesn't mean the script user has to suffer. The CODE keyword achieves this simply:

Having generalized this alias, we can apply the ANY_OBJECT alias to other objects too, as long as they also follow the convention of declaring and initializing a CLASS variable in any method that returns a pointer to the type declared in the TYPEMAP section. In the preceding example, the initialization happened automatically because Perl supplies the name of the class as the first argument.

Unlike the previous example, xsubpp automatically supplies the CLASS variable. You still need the typemaps, however, to convert Car* to an equivalent Perl object reference. The drive interface declaration is translated as follows:

We have conveniently ignored the issue of memory management so far. In the preceding sections, the new function allocates an object that is subsequently stuffed into a scalar value by the typemapping code. When the scalar goes out of scope or is assigned something else, Perl ignores this pointer if the scalar has not been blessed, which is not surprising, considering that it has been led to believe that the scalar contains just an integer value. This is most definitely a memory leak. But if the scalar is blessed, Perl calls its DESTROY routine when the scalar is cleared. If this routine is written in XS, as shown below, it gives us the opportunity to delete allocated memory:
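The difference between the two cases can be modeled in a few lines of plain C. This is a toy model rather than Perl's implementation: the Handle struct stands in for a scalar, a non-NULL destroy callback stands in for "blessed into a class that has a DESTROY method", and Car, new_car, delete_car, and the live-object counter are hypothetical names introduced for illustration.

```c
#include <stdlib.h>

typedef struct { int speed; } Car;       /* hypothetical library type */

static int live_cars = 0;                /* number of Cars not yet freed */

Car *new_car(void)
{
    Car *c = calloc(1, sizeof *c);
    if (c != NULL)
        live_cars++;
    return c;
}

void delete_car(Car *c)
{
    if (c != NULL) {
        free(c);
        live_cars--;
    }
}

int live_car_count(void) { return live_cars; }

/* Toy scalar: holds a pointer and, if "blessed", a destructor. */
typedef struct {
    void *ptr;
    void (*destroy)(void *);             /* NULL models an unblessed scalar */
    int   refcount;
} Handle;

/* Plays the role of DESTROY for the Car class. */
void destroy_car(void *p) { delete_car(p); }

/* Called when a reference to the scalar goes away. */
void handle_release(Handle *h)
{
    if (--h->refcount == 0) {
        if (h->destroy != NULL)
            h->destroy(h->ptr);          /* blessed: the object is freed */
        h->ptr = NULL;                   /* unblessed: the pointer is simply
                                            forgotten, and the Car leaks  */
    }
}
```

Releasing a handle that has a destructor brings the live count back down; releasing one without a destructor leaves the Car allocated with nothing pointing to it, which is the leak described above.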

The Perl library provides a set of functions and macros to replace the conventional dynamic memory management routines (listed on the left-hand side of the table):

Instead of:    Use:
malloc         New
free           Safefree
realloc        Renew
calloc         Newz
memcpy         Copy
memmove        Move
memzero        Zero

The Perl replacements use the version of malloc provided by Perl (by default), and optionally collect statistics on memory usage. It is recommended that you use these routines instead of the conventional memory management routines.
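Of the copying macros, Copy corresponds to memcpy and Move to memmove, so the choice between them turns on whether the source and destination regions can overlap. The sketch below illustrates that distinction with the standard C functions only; shift_right and duplicate are hypothetical helper names, and no Perl headers are involved.

```c
#include <stddef.h>
#include <string.h>

/* Shift the first n bytes of buf right by `by` positions. Source and
 * destination overlap, so memmove (the Move case) is required.
 * The caller must ensure buf has room for n + by bytes. */
void shift_right(char *buf, size_t n, size_t by)
{
    memmove(buf + by, buf, n);
}

/* Copy n bytes into a separate buffer. The regions cannot overlap,
 * so memcpy (the Copy case) is sufficient. */
void duplicate(char *dst, const char *src, size_t n)
{
    memcpy(dst, src, n);
}
```

Note that the Perl macros take the source first and the destination second, followed by an element count and a type, which is the reverse of the memcpy and memmove argument order.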

SWIG produces practically the same code as xsubpp. Consequently, you can expect its typemaps to be very similar (if not identical) to those of XS. Consider the permute function discussed earlier. We want a char** converted to a list, but since typemaps allow their input and output to be scalars only, the following typemap translates it to a list reference:

SWIG typemaps are specific to a language, hence the perl5 argument. out refers to function return parameters, and this typemap applies to all functions with a char** return value. $source and $target are variables of the appropriate types: for an in typemap, $source is a Perl type, and $target is the data type expected by the corresponding function parameter. Note that unlike XS's $arg and $var, SWIG's $source and $target switch meanings depending on the direction of the typemap.

If you don't want this typemap applied to all functions returning char**, you can name exactly which parameter or function you want it applied to, like this:

%typemap(perl5,out) char ** permute {
    ...
}

Please refer to the SWIG documentation for a number of other typemap-related features.