Zend API: Hacking the Core of PHP

Introduction

Those who know don't talk.

Those who talk don't know.

Sometimes, PHP "as is" simply isn't enough. Although these cases are rare
for the average user, professional applications will soon push PHP to the edge
of its capabilities, in terms of either speed or functionality. New
functionality cannot always be implemented natively, because of language
restrictions and the inconvenience of carrying around a huge
library of default code appended to every single script, so another method
is needed to overcome these shortcomings in PHP.

As soon as this point is reached, it's time to touch the heart of PHP
and take a look at its core, the C code that makes PHP go.

Warning

This information is currently rather outdated;
parts of it only cover early stages of the ZendEngine 1.0 API
as it was used in early versions of PHP 4.

More recent information may be found in the various README files that
come with the PHP source and the Internals
section on the Zend website.

Overview

"Extending PHP" is easier said than done. PHP has evolved to a
full-fledged tool consisting of a few megabytes of source code,
and to hack a system like this quite a few things have to be
learned and considered. When structuring this chapter, we finally
decided on the "learn by doing" approach. This is not the most
scientific and professional approach, but the method that's the
most fun and gives the best end results. In the following
sections, you'll learn quickly how to get the most basic
extensions to work almost instantly. After that, you'll learn
about Zend's advanced API functionality. The alternative would
have been to try to impart the functionality, design, tips,
tricks, etc. as a whole, all at once, thus giving a complete look
at the big picture before doing anything practical. Although this
is the "better" method, as no dirty hacks have to be made, it can
be very frustrating as well as energy- and time-consuming, which
is why we've decided on the direct approach.

Note that even though this chapter tries to impart as much
knowledge as possible about the inner workings of PHP, it's
impossible to really give a complete guide to extending PHP that
works 100% of the time in all cases. PHP is such a huge and
complex package that its inner workings can only be understood if
you make yourself familiar with it by practicing, so we encourage
you to work with the source.

What Is Zend? and What Is PHP?

The name Zend refers to the language engine,
PHP's core. The term PHP refers to the
complete system as it appears from the outside. This might sound
a bit confusing at first, but it's not that complicated (see
below). To implement a Web script interpreter, you need
three parts:

1. The interpreter part analyzes the input
code, translates it, and executes it.

2. The functionality part implements the
functionality of the language (its functions, etc.).

3. The interface part talks to the Web
server, etc.

Zend takes part 1 completely and a bit of part 2; PHP takes parts
2 and 3. Together they form the complete PHP package. Zend itself
really forms only the language core, implementing PHP at its very
basics with some predefined functions. PHP contains all the
modules that actually create the language's outstanding
capabilities.

The following sections discuss where PHP can be extended and how
it's done.

Extension Possibilities

As shown above, PHP can be extended primarily at
three points: external modules, built-in modules, and the Zend
engine. The following sections discuss these options.

External Modules

External modules can be loaded at script runtime using the
function dl(). This function loads a shared
object from disk and makes its functionality available to the
script to which it's being bound. After the script is terminated,
the external module is discarded from memory. This method has both
advantages and disadvantages, as described in the following table:

Advantages

External modules don't require recompiling of PHP.

The size of PHP remains small by "outsourcing" certain
functionality.

Disadvantages

The shared objects need to be loaded every time a script is
being executed (every hit), which is very slow.

External additional files clutter up the disk.

Every script that wants to use an external module's
functionality has to specifically include a call to
dl(), or the extension
directive in php.ini needs to be modified
(which is not always a suitable solution).

To sum up, external modules are great for
third-party products, small additions to PHP that are rarely used,
or just for testing purposes. To develop additional functionality
quickly, external modules provide the best results. For frequent
usage, larger implementations, and complex code, the disadvantages
outweigh the advantages.

Third parties might consider using the
extension directive in php.ini
to load additional external modules into PHP. These external
modules are completely detached from the main package, which is a
very handy feature in commercial environments. Commercial
distributors can simply ship disks or archives containing only
their additional modules, without the need to create fixed and
solid PHP binaries that don't allow other modules to be bound to
them.

Built-in Modules

Built-in modules are compiled directly into PHP and carried around
with every PHP process; their functionality is instantly available
to every script that's being run. Like external modules, built-in
modules have advantages and disadvantages, as described in the
following table:

Advantages

No need to load the module specifically; the functionality is
instantly available.

No external files clutter up the disk; everything resides in
the PHP binary.

Disadvantages

Changes to built-in modules require recompiling of PHP.

The PHP binary grows and consumes more memory.

Built-in modules are best when you have a solid
library of functions that remains relatively unchanged, requires
better than poor-to-average performance, or is used frequently by
many scripts on your site. The need to recompile PHP is quickly
compensated by the benefit in speed and ease of use. However,
built-in modules are not ideal when rapid development of small
additions is required.

The Zend Engine

Of course, extensions can also be implemented directly in the Zend
engine. This strategy is good if you need a change in the language
behavior or require special functions to be built directly into
the language core. In general, however, modifications to the Zend
engine should be avoided. Changes here result in incompatibilities
with the rest of the world, and hardly anyone will ever adapt to
specially patched Zend engines. Modifications can't be detached
from the main PHP sources and are overridden with the next update
using the "official" source repositories. Therefore, this method
is generally considered bad practice and, due to its rarity, is
not covered in this book.

Source Layout

Note:

Prior to working through the rest of this chapter, you should retrieve
clean, unmodified source trees of your favorite Web server. We're working with
Apache (available at http://httpd.apache.org/)
and, of course, with PHP (available at http://www.php.net/ - does
it need to be said?).

Make sure that you can compile a working PHP environment by
yourself! We won't go into this issue here, however, as you should
already have this most basic ability when studying this chapter.

Before we start discussing code issues, you should familiarize
yourself with the source tree to be able to quickly navigate
through PHP's files. This is a must-have ability to implement and
debug extensions.

php-src/ext

Repository for dynamic and built-in modules; by default, these
are the "official" PHP modules that have been integrated into
the main source tree. From PHP 4.0, it's possible to compile
these standard extensions as dynamic loadable modules (at
least, those that support it).

php-src/main

This directory contains the main PHP macros and definitions. (important)

Discussing all the files included in the PHP package is beyond the
scope of this chapter. However, you should take a close look at the
following files:

php-src/main/php.h, located in the main PHP directory.
This file contains most of PHP's macro and API definitions.

php-src/Zend/zend.h, located in the main Zend directory.
This file contains most of Zend's macros and definitions.

php-src/Zend/zend_API.h, also located in the Zend
directory, which defines Zend's API.

You should also follow some sub-inclusions from
these files; for example, the ones relating to the Zend executor,
the PHP initialization file support, and such. After reading these
files, take the time to navigate around the package a little to see
the interdependencies of all files and modules - how they relate to
each other and especially how they make use of each other. This
also helps you to adapt to the coding style in which PHP is
authored. To extend PHP, you should quickly adapt to this style.

Extension Conventions

Zend is built using certain conventions; to avoid breaking its
standards, you should follow the rules described in the following
sections.

Macros

For almost every important task, Zend ships predefined macros that
are extremely handy. The tables and figures in the following
sections describe most of the basic functions, structures, and
macros. The macro definitions can be found mainly in
zend.h and zend_API.h.
We suggest that you take a close look at these files after having
studied this chapter. (Although you can go ahead and read them
now, not everything will make sense to you yet.)

Memory Management

Resource management is a crucial issue, especially in server
software. One of the most valuable resources is memory, and memory
management should be handled with extreme care. Memory management
has been partially abstracted in Zend, and you should stick to
this abstraction for obvious reasons: Due to the abstraction, Zend
gets full control over all memory allocations. Zend is able to
determine whether a block is in use, automatically freeing unused
blocks and blocks with lost references, and thus prevent memory
leaks. The functions to be used are described in the following
table:

Function

Description

emalloc()

Serves as replacement for
malloc().

efree()

Serves as replacement for
free().

estrdup()

Serves as replacement for
strdup().

estrndup()

Serves as replacement for
strndup(). Faster than
estrdup() and binary-safe. This is the
recommended function to use if you know the string length
prior to duplicating it.

ecalloc()

Serves as replacement for
calloc().

erealloc()

Serves as replacement for
realloc().
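To see why knowing the length in advance matters, here is a minimal sketch of a length-based duplication routine in plain C. It uses malloc() instead of Zend's allocator so that it compiles without the PHP headers, and the name dup_with_length() is a hypothetical stand-in, not part of the Zend API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a length-aware duplicate such as estrndup().
 * It copies exactly `len` bytes, so embedded NUL bytes survive and the
 * source never has to be scanned for a terminator.
 * Returns NULL if the allocation fails. */
char *dup_with_length(const char *src, size_t len)
{
    char *copy;

    assert(src != NULL);
    copy = malloc(len + 1);      /* one extra byte for the terminator */
    if (copy == NULL) {
        return NULL;
    }
    memcpy(copy, src, len);      /* binary-safe: does not stop at '\0' */
    copy[len] = '\0';
    return copy;
}
```

A strdup()-style routine has to call strlen() first and would stop at the first NUL byte; the length-based variant avoids both.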

emalloc(),
estrdup(), estrndup(),
ecalloc(), and erealloc()
allocate internal memory; efree() frees these
previously allocated blocks. Memory handled by the
e*() functions is considered local to the
current process and is discarded as soon as the script executed by
this process is terminated.
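The idea of memory that is "local to the current process and discarded when the script terminates" can be sketched with a small tracking allocator. This is only an illustration under our own assumptions, not Zend's implementation: req_alloc(), req_free() and req_shutdown() are hypothetical names, and the sketch is built on plain malloc() and free().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Header placed in front of every block. The union keeps the user data
 * that follows it suitably aligned. */
typedef union req_header {
    struct {
        union req_header *prev;
        union req_header *next;
    } link;
    long double force_align;
} req_header;

static req_header *req_head = NULL;   /* blocks owned by the current request */

/* Allocates a block and links it into the per-request list. */
void *req_alloc(size_t size)
{
    req_header *h;

    assert(size <= (size_t)-1 - sizeof(req_header));
    h = malloc(sizeof(req_header) + size);
    if (h == NULL) {
        return NULL;
    }
    h->link.prev = NULL;
    h->link.next = req_head;
    if (req_head != NULL) {
        req_head->link.prev = h;
    }
    req_head = h;
    return h + 1;                      /* user data starts after the header */
}

/* Frees one block explicitly and unlinks it from the list. */
void req_free(void *ptr)
{
    req_header *h;

    if (ptr == NULL) {
        return;
    }
    h = (req_header *)ptr - 1;
    if (h->link.prev != NULL) {
        h->link.prev->link.next = h->link.next;
    } else {
        req_head = h->link.next;
    }
    if (h->link.next != NULL) {
        h->link.next->link.prev = h->link.prev;
    }
    free(h);
}

/* End of request: releases everything that was not freed explicitly and
 * reports how many blocks had been left behind. */
size_t req_shutdown(void)
{
    size_t leaked = 0;

    while (req_head != NULL) {
        req_header *next = req_head->link.next;
        free(req_head);
        req_head = next;
        leaked++;
    }
    return leaked;
}
```

Because every allocation passes through one place, the allocator can always tell which blocks are still outstanding, which is the kind of control the abstraction is meant to give the engine.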

Warning

To allocate resident memory that survives termination of
the current script, you can use malloc() and
free(). This should only be done with extreme
care, however, and only in conjunction with demands of the Zend
API; otherwise, you risk memory leaks.

Zend also features a thread-safe resource manager to
provide better native support for multithreaded Web servers. This
requires you to allocate local structures for all of your global
variables to allow concurrent threads to be run. Because the
thread-safe mode of Zend was not finished back when this was written,
it is not yet extensively covered here.

Directory and File Functions

The following directory and file functions should be used in Zend
modules. They behave exactly like their C counterparts, but
provide virtual working directory support on the thread level.

String Handling

Strings are handled a bit differently by the Zend engine
than other values such as integers, Booleans, etc., which don't require
additional memory allocation for storing their values. If you want to
return a string from a function, introduce a new string variable to the symbol
table, or do something similar, you have to make sure that the memory the
string will be occupying has previously been allocated, using the
aforementioned e*() functions for allocation. (This might
not make much sense to you yet; just keep it somewhere in your head for now - we'll get
back to it shortly.)

Complex Types

Complex types such as arrays and objects require
different treatment. Zend features a single API for these types - they're
stored using hash tables.

Note:

To reduce complexity in the following source examples, we're only
working with simple types such as integers at first. A discussion about
creating more advanced types follows later in this chapter.

PHP's Automatic Build System

PHP 4 features an automatic build system that's very flexible.
All modules reside in a subdirectory of the
ext directory. In addition to its own sources,
each module includes a config.m4 file for extension configuration
(for more about M4, see http://www.gnu.org/software/m4/).

All these stub files are generated automatically, along with
.cvsignore, by a little shell script named
ext_skel that resides in the
ext directory. As argument it takes the name
of the module that you want to create. The shell script then
creates a directory of the same name, along with the appropriate
stub files.

Running ext_skel with the module name creates the
aforementioned files. To include the new module in the automatic
configuration and build process, you have to run
buildconf, which regenerates the
configure script by searching through the
ext directory and including all found
config.m4 files.

If you're unfamiliar with M4 files (now is certainly a good
time to get familiar), this might be a bit confusing at first; but
it's actually quite easy.

Note: Everything prefixed with
dnl is treated as a comment and is not
parsed.

The config.m4 file is responsible for
parsing the command-line options passed to
configure at configuration time. This means
that it has to check for required external files and do similar
configuration and setup tasks.

The default file creates two configuration directives in the
configure script:
--with-my_module and
--enable-my_module. Use the first option when
referring to external files (such as the
--with-apache directive that refers to the
Apache directory). Use the second option when the user simply has
to decide whether to enable your extension. Regardless of which
option you use, you should comment out the other, unnecessary one;
that is, if you're using --enable-my_module, you
should remove support for --with-my_module, and
vice versa.

By default, the config.m4 file created by
ext_skel accepts both directives and
automatically enables your extension. Enabling the extension is
done by using the PHP_EXTENSION macro. To include your module in
the PHP binary only when the user asks for it (by explicitly specifying
--enable-my_module or
--with-my_module), change the test for
$PHP_MY_MODULE to == "yes".

This would require you to use
--enable-my_module each time when reconfiguring
and recompiling PHP.

Note: Be sure to run
buildconf every time you change
config.m4!

We'll go into more details on the M4 macros available to your
configuration scripts later in this chapter. For now, we'll simply
use the default files.

Creating Extensions

We'll start with the creation of a very simple extension, which
basically does nothing more than implement a function that returns the
integer it receives as parameter. The example "A simple extension" shows the source.

This code contains a complete PHP module. We'll explain the source
code in detail shortly, but first we'd like to discuss the build
process. (This will allow the impatient to experiment before we
dive into API discussions.)

Note:

The example source makes use of some features introduced with the Zend version
used in PHP 4.1.0 and above; it won't compile with older PHP 4.0.x versions.

Compiling Modules

There are basically two ways to compile modules:

Use the provided "make" mechanism in the
ext directory, which also allows building
of dynamic loadable modules.

Compile the sources manually.

The first method should definitely be favored,
since, as of PHP 4.0, this has been standardized into a
sophisticated build process. The fact that it is so sophisticated
is also its drawback, unfortunately - it's hard to understand at
first. We'll provide a more detailed introduction to this later in
the chapter, but first let's work with the default files.

The second method is good for those who (for some reason) don't
have the full PHP source tree available, don't have access to all
files, or just like to juggle with their keyboard. These cases
should be extremely rare, but for the sake of completeness we'll
also describe this method.

Compiling Using Make

To compile the sample sources using the standard mechanism, copy
all their subdirectories to the ext
directory of your PHP source tree. Then run
buildconf, which will create an updated
configure script containing appropriate
options for the new extension. By default, all the sample sources
are disabled, so you don't have to fear breaking your build
process.

After you run buildconf, configure
--help shows the following additional modules:

Compiling Manually

The command to compile the module simply instructs the compiler
to generate position-independent code (-fpic shouldn't be
omitted) and additionally defines the constant
COMPILE_DL_FIRST_MODULE to
tell the module code that it's compiled as a dynamically loadable module (the
test module above checks for this; we'll discuss it shortly). After these
options, it specifies a number of standard include paths that should be used
as the minimal set to compile the source files.

Note: All include paths in the example are
relative to the directory ext. If you're
compiling from another directory, change the pathnames
accordingly. Required items are the PHP directory, the
Zend directory, and (if necessary), the
directory in which your module resides.

The link command is also a plain vanilla command instructing linkage as a dynamic module.

You can include optimization options in the compilation
command, although these have been omitted in this example (but some are included in the makefile
template described in an earlier section).

Note: Compiling and linking manually as a
static module into the PHP binary involves very long instructions
and thus is not discussed here. (It's not very efficient to type
all those commands.)

Using Extensions

Depending on the build process you selected, you should end up either
with a new PHP binary to be linked into your Web server (or run as CGI), or
with an .so (shared object) file. If you compiled the
example file first_module.c as a shared object, your result file
should be first_module.so. To use it, you first have to copy
it to a place from which it's accessible to PHP. For a simple test procedure,
you can copy it to your htdocs directory and try it with
the source in the example "A test file for first_module.so".
If you compiled it into the PHP binary,
omit the call to dl(), as the module's
functionality is instantly available to your scripts.

Warning

For security reasons, you should not put your
dynamic modules into publicly accessible directories. Even though it can be
done and it simplifies testing, you should put them into a separate directory
in production environments.

Example #3 A test file for first_module.so.

<?php

// remove next comment if necessary
// dl("first_module.so");

$param = 2;
$return = first_module($param);

print("We sent '$param' and got '$return'");

?>

Calling this PHP file should output the following:

We sent '2' and got '2'

If required, the dynamic loadable module is loaded by calling the
dl() function. This function looks for the
specified shared object, loads it, and makes its functions
available to PHP. The module exports the function
first_module(), which accepts a single
parameter, converts it to an integer, and returns the result of the
conversion.

If you've gotten this far, congratulations! You just built your
first extension to PHP.

Troubleshooting

Actually, not much troubleshooting can be done when compiling
static or dynamic modules. The only problem that could arise is
that the compiler will complain about missing definitions or
something similar. In this case, make sure that all header files
are available and that you specified their path correctly in the
compilation command. To be sure that everything is located
correctly, extract a clean PHP source tree and use the automatic
build in the ext directory with the fresh
files; this will guarantee a safe compilation environment. If this
fails, try manual compilation.

PHP might also complain about missing functions in your module.
(This shouldn't happen with the sample sources if you didn't modify
them.) If the names of external functions you're trying to access
from your module are misspelled, they'll remain as "unlinked
symbols" in the symbol table. During dynamic loading and linkage by
PHP, they won't resolve because of the typing errors - there are no
corresponding symbols in the main binary. Look for incorrect
declarations in your module file or incorrectly written external
references. Note that this problem is specific to dynamic loadable
modules; it doesn't occur with static modules. Errors in static
modules show up at compile time.

Source Discussion

Now that you've got a safe build environment and you're able to include
the modules into PHP files, it's time to discuss how everything works.

Header File Inclusions

The only header file you really have to include for your modules is
php.h, located in the PHP directory. This file makes all
macros and API definitions required to build new modules available to your
code.

Tip: It's good practice to create a separate
header file for your module that contains module-specific
definitions. This header file should contain all the forward
declarations for exported functions and include
php.h. If you created your module using
ext_skel, you already have such a header file
prepared.

Declaring Exported Functions

To declare functions that are to be exported (i.e., made available to PHP
as new native functions), Zend provides a set of macros. A sample declaration
looks like this:

ZEND_FUNCTION ( my_function );

ZEND_FUNCTION declares a new C function that complies
with Zend's internal API. This means that the function is of
type void and
accepts INTERNAL_FUNCTION_PARAMETERS (another macro) as
parameters. Additionally, it prefixes the function name with
zif_. The immediately expanded version of the above
declaration would look like this:

void zif_my_function ( INTERNAL_FUNCTION_PARAMETERS );

Since the interpreter and executor core have been separated from
the main PHP package, a second API defining macros and function
sets has evolved: the Zend API. As the Zend API now handles quite
a few of the responsibilities that previously belonged to PHP, a
lot of PHP functions have been reduced to macros aliasing to calls
into the Zend API. The recommended practice is to use the Zend API
wherever possible, as the old API is only preserved for
compatibility reasons. For example, the types zval
and pval are identical. zval is
Zend's definition; pval is PHP's definition
(actually, pval is an alias for zval
now). As the macro INTERNAL_FUNCTION_PARAMETERS
is a Zend macro, the above declaration contains
zval. When writing code, you should always use
zval to conform to the new Zend API.

INTERNAL_FUNCTION_PARAMETERS supplies the following parameters:

ht

The number of arguments passed to the Zend function.
You should not touch this directly, but instead use ZEND_NUM_ARGS() to obtain the
value.

return_value

This variable is used to pass any return values of
your function back to PHP. Access to this variable is best done using the
predefined macros. For a description of these see below.

this_ptr

Using this variable, you can gain access to the object
in which your function is contained, if it's used within an object. Use
the function getThis() to obtain this pointer.

return_value_used

This flag indicates whether an eventual return value
from this function will actually be used by the calling script.
0 indicates that the return value is not used;
1 indicates that the caller expects a return value.
Evaluation of this flag can be done to verify correct usage of the function as
well as speed optimizations in case returning a value requires expensive
operations (for an example, see how array.c makes use of
this).

executor_globals

This variable points to global settings of the Zend
engine. You'll find this useful when creating new variables, for example
(more about this later). The executor globals can also be introduced to your
function by using the macro TSRMLS_FETCH().

Declaration of the Zend Function Block

Now that you have declared the functions to be exported, you also
have to introduce them to Zend. Introducing the list of functions is done by
using an array of zend_function_entry. This array consecutively
contains all functions that are to be made available externally, with the function's name
as it should appear in PHP and its name as defined in the C source.
Internally, zend_function_entry is defined as shown in
the example "Internal declaration of zend_function_entry".

You can see that the last entry in the list always has to be
{NULL, NULL, NULL}.
This marker has to be set for Zend to know when the end of the
list of exported functions is reached.

Note:

You cannot use the predefined macros for the
end marker, as these would try to refer to a function named "NULL"!
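Why the end marker is needed becomes clear from how such a list is walked. The sketch below uses a simplified, hypothetical demo_function_entry structure and lookup routine (not Zend's own types) to show a table that is terminated by an all-NULL entry.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void (*demo_handler)(long *return_value);

/* Simplified stand-in for a function entry: script-visible name, C handler,
 * and an argument-type slot that is left NULL, as in the text. */
typedef struct demo_function_entry {
    const char   *fname;
    demo_handler  handler;
    const char   *arg_types;
} demo_function_entry;

void zif_first_module(long *return_value)  { *return_value = 1; }
void zif_second_module(long *return_value) { *return_value = 2; }

demo_function_entry demo_functions[] = {
    { "first_module",  zif_first_module,  NULL },
    { "second_module", zif_second_module, NULL },
    { NULL, NULL, NULL }   /* end marker: must be the last entry */
};

/* Walks the table until the end marker and returns the matching handler,
 * or NULL if no function of that name was exported. */
demo_handler demo_lookup(const demo_function_entry *table, const char *name)
{
    const demo_function_entry *e;

    assert(table != NULL && name != NULL);
    for (e = table; e->fname != NULL; e++) {
        if (strcmp(e->fname, name) == 0) {
            return e->handler;
        }
    }
    return NULL;
}
```

The array carries no length, so without the final entry the loop would run past the end of the table.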

The macro ZEND_FE (short for 'Zend Function
Entry') simply expands to a structure entry in
zend_function_entry. Note that these macros
introduce a special naming scheme to your functions - your C
functions will be prefixed with zif_, meaning
that ZEND_FE(first_module) will refer to a C
function zif_first_module(). If you want to mix
macro usage with hand-coded entries (not a good practice), keep
this in mind.

Tip: Compilation errors that refer to functions
named zif_*() relate to functions defined
with ZEND_FE.

ZEND_FE(name, arg_types)

Defines a function entry of the name name in
zend_function_entry. Requires a corresponding C
function. arg_types needs to be set to NULL.
This macro uses automatic C function name generation by prefixing the PHP
function name with zif_.
For example, ZEND_FE(first_module, NULL) introduces a
function first_module() to PHP and links it to the C
function zif_first_module(). Use in conjunction
with ZEND_FUNCTION.

ZEND_NAMED_FE(php_name, name, arg_types)

Defines a function that will be available to PHP by the
name php_name and links it to the corresponding C
function name. arg_types needs to be set
to NULL. Use this function if you don't want the automatic
name prefixing introduced by ZEND_FE. Use in conjunction
with ZEND_NAMED_FUNCTION.

ZEND_FALIAS(name, alias, arg_types)

Defines an alias named alias for
name. arg_types needs to be set
to NULL. Doesn't require a corresponding C
function; refers to the alias target instead.

PHP_FE(name, arg_types)

Old PHP API equivalent of ZEND_FE.

PHP_NAMED_FE(runtime_name, name, arg_types)

Old PHP API equivalent of ZEND_NAMED_FE.

Note: You can't use
ZEND_FE in conjunction with
PHP_FUNCTION, or PHP_FE in
conjunction with ZEND_FUNCTION. However, it's
perfectly legal to mix ZEND_FE and
ZEND_FUNCTION with PHP_FE
and PHP_FUNCTION when staying with the same
macro set for each function to be declared. But mixing is
not recommended; instead, you're advised to
use the ZEND_* macros only.

Declaration of the Zend Module Block

This block is stored in the structure
zend_module_entry and contains all necessary
information to describe the contents of this module to Zend. You can
see the internal definition of this module in
the example "Internal declaration of zend_module_entry".

size, zend_api, zend_debug and zts

Usually filled with
STANDARD_MODULE_HEADER, which fills these
four members with the size of the whole zend_module_entry, the
ZEND_MODULE_API_NO, whether it is a debug
build or normal build (ZEND_DEBUG) and whether
ZTS is enabled (USING_ZTS).

name

Contains the module name (for example, "File
functions", "Socket functions",
"Crypt", etc.). This name will show up in
phpinfo(), in the section "Additional
Modules."

functions

Points to the Zend function block, discussed in the preceding
section.

module_startup_func

This function is called once upon module initialization and can
be used to do one-time initialization steps (such as initial
memory allocation, etc.). To indicate a failure during
initialization, return FAILURE; otherwise,
SUCCESS. To mark this field as unused, use
NULL. To declare a function, use the macro
ZEND_MINIT.

module_shutdown_func

This function is called once upon module shutdown and can be
used to do one-time deinitialization steps (such as memory
deallocation). This is the counterpart to
module_startup_func(). To indicate a failure
during deinitialization, return FAILURE;
otherwise, SUCCESS. To mark this field as
unused, use NULL. To declare a function, use
the macro ZEND_MSHUTDOWN.

request_startup_func

This function is called once upon every page request and can be
used to do one-time initialization steps that are required to
process a request. To indicate a failure here, return
FAILURE; otherwise,
SUCCESS. Note: As
dynamic loadable modules are loaded only on page requests, the
request startup function is called right after the module
startup function (both initialization events happen at the same
time). To mark this field as unused, use
NULL. To declare a function, use the macro
ZEND_RINIT.

request_shutdown_func

This function is called once after every page request and works
as counterpart to request_startup_func(). To
indicate a failure here, return FAILURE;
otherwise, SUCCESS.
Note: As dynamic loadable modules are
loaded only on page requests, the request shutdown function is
immediately followed by a call to the module shutdown handler
(both deinitialization events happen at the same time). To mark
this field as unused, use NULL. To declare a
function, use the macro ZEND_RSHUTDOWN.

info_func

When phpinfo() is called in a script, Zend
cycles through all loaded modules and calls this function.
Every module then has the chance to print its own "footprint"
into the output page. Generally this is used to dump
environmental or statistical information. To mark this field as
unused, use NULL. To declare a function, use
the macro ZEND_MINFO.

version

The version of the module. You can use
NO_VERSION_YET if you don't want to give the
module a version number yet, but we really recommend that you
add a version string here. Such a version string can look like
this (in chronological order): "2.5-dev",
"2.5RC1", "2.5" or
"2.5pl3".

Remaining structure elements

These are used internally and can be prefilled by using the
macro STANDARD_MODULE_PROPERTIES_EX. You
should not assign any values to them. Use
STANDARD_MODULE_PROPERTIES_EX only if you
use global startup and shutdown functions; otherwise, use
STANDARD_MODULE_PROPERTIES directly.

Our example module uses basically the easiest and most minimal set of
values you could ever use. The module name is set to First
Module, then the function list is referenced, after which
all startup and shutdown functions are marked as being unused.

For reference purposes, you can find a list of the macros involved
in declaring startup and shutdown functions in the table
"Macros to Declare Startup and Shutdown Functions". These are
not used in our basic example yet, but will be demonstrated later
on. You should make use of these macros to declare your startup and
shutdown functions, as these require special arguments to be passed
(INIT_FUNC_ARGS and
SHUTDOWN_FUNC_ARGS), which are automatically
included into the function declaration when using the predefined
macros. If you declare your functions manually and the PHP
developers decide that a change in the argument list is necessary,
you'll have to change your module sources to remain compatible.

Macros to Declare Startup and Shutdown Functions

Macro

Description

ZEND_MINIT(module)

Declares a function for module startup. The generated name will
be zend_minit_<module> (for example,
zend_minit_first_module). Use in
conjunction with ZEND_MINIT_FUNCTION.

ZEND_MSHUTDOWN(module)

Declares a function for module shutdown. The generated name
will be zend_mshutdown_<module> (for
example, zend_mshutdown_first_module). Use
in conjunction with ZEND_MSHUTDOWN_FUNCTION.

ZEND_RINIT(module)

Declares a function for request startup. The generated name
will be zend_rinit_<module> (for
example, zend_rinit_first_module). Use in
conjunction with ZEND_RINIT_FUNCTION.

ZEND_RSHUTDOWN(module)

Declares a function for request shutdown. The generated name
will be zend_rshutdown_<module> (for
example, zend_rshutdown_first_module). Use
in conjunction with ZEND_RSHUTDOWN_FUNCTION.

ZEND_MINFO(module)

Declares a function for printing module information, used when
phpinfo() is called. The generated name will
be zend_info_<module> (for example,
zend_info_first_module). Use in conjunction
with ZEND_MINFO_FUNCTION.
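The naming scheme in this table can be illustrated with the C preprocessor's token-pasting operator. The following is only a stand-alone sketch: it does not include any Zend headers, and DEMO_MINIT, DEMO_MINIT_FUNCTION, the demo_minit_ prefix, and the int return type are hypothetical stand-ins chosen for illustration, not the real Zend definitions.

```c
#include <stdio.h>

/* Hypothetical stand-ins for the Zend macros: the first builds a function
   name from the module name, the second declares a function of that name. */
#define DEMO_MINIT(module)          demo_minit_##module
#define DEMO_MINIT_FUNCTION(module) int DEMO_MINIT(module)(void)

/* Expands to: int demo_minit_first_module(void) */
DEMO_MINIT_FUNCTION(first_module)
{
    puts("first_module: startup");
    return 0; /* 0 stands in for SUCCESS in this sketch */
}

/* A table slot can name the same function without spelling it out. */
typedef int (*demo_startup_fn)(void);
static const demo_startup_fn demo_startup_slot = DEMO_MINIT(first_module);
```

Because both the declaration and the table slot go through the same macro, a change to the naming scheme or to the argument list only has to be made in one place.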

Creation of get_module()

This function is specific to dynamically loadable modules. Take a
look at the creation via the ZEND_GET_MODULE
macro first:

#if COMPILE_DL_FIRSTMOD
ZEND_GET_MODULE(firstmod)
#endif

The function implementation is surrounded by a conditional
compilation statement. This is needed since the function
get_module() is only required if your module is
built as a dynamic extension. By specifying a definition of
COMPILE_DL_FIRSTMOD in the compiler command
(see above for a discussion of the compilation instructions
required to build a dynamic extension), you can instruct your
module whether you intend to build it as a dynamic extension or as
a built-in module. If you want a built-in module, the
implementation of get_module() is simply left
out.

get_module() is called by Zend at load time
of the module. You can think of it as being invoked by the
dl() call in your script. Its purpose is to pass the
module information block back to Zend in order to inform the engine about the
module contents.
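To make the idea concrete, here is a minimal stand-alone model of a conditionally compiled accessor that hands an information block back to its caller. Everything in it is an assumption made for illustration: demo_module_entry is a heavily reduced, hypothetical structure, the version string is invented, and DEMO_COMPILE_DL_FIRSTMOD merely plays the role of the real compile-time definition.

```c
/* Hypothetical, heavily reduced module information block. */
typedef struct {
    const char *name;
    const char *version;
} demo_module_entry;

static demo_module_entry firstmod_demo_entry = { "First Module", "0.1-dev" };

/* Set to 0 to model a built-in module, where the accessor is left out. */
#define DEMO_COMPILE_DL_FIRSTMOD 1

#if DEMO_COMPILE_DL_FIRSTMOD
/* The loader calls this once to obtain the information block. */
demo_module_entry *demo_get_module(void)
{
    return &firstmod_demo_entry;
}
#endif
```

With the definition set to 0, demo_get_module() does not exist at all, which is the situation a loader would have to report as an error.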

If you don't implement a get_module() function in
your dynamically loadable module, Zend will compliment you with an error message
when trying to access it.

Implementation of All Exported Functions

Implementing the exported functions is the final step. The
example function in first_module looks like this:

The function declaration is done
using ZEND_FUNCTION, which corresponds
to ZEND_FE in the function entry table (discussed
earlier).

After the declaration, code for checking and retrieving the function's
arguments, argument conversion, and return value generation follows (more on
this later).

Summary

That's it, basically - there's nothing more to implementing PHP modules.
Built-in modules are structured similarly to dynamic modules, so, equipped
with the information presented in the previous sections, you'll be able to
fight the odds when encountering PHP module source files.

Now, in the following sections, read on about how to make use of PHP's
internals to build powerful extensions.

Accepting Arguments

One of the most important issues for language extensions is
accepting and dealing with data passed via arguments. Most
extensions are built to deal with specific input data (or require
parameters to perform their specific actions), and function
arguments are the only real way to exchange data between the PHP
level and the C level. Of course, there's also the possibility of
exchanging data using predefined global values (which is also
discussed later), but this should be avoided at all costs, as it's
extremely bad practice.

PHP doesn't make use of any formal function declarations; this is
why call syntax is always completely dynamic and never checked for
errors. Checking for correct call syntax is left to the user code.
For example, it's possible to call a function using only one
argument at one time and four arguments the next time - both
invocations are syntactically absolutely correct.

Determining the Number of Arguments

Since PHP doesn't have formal function definitions with support
for call syntax checking, and since PHP features variable
arguments, sometimes you need to find out with how many arguments
your function has been called. You can use the
ZEND_NUM_ARGS macro in this case. In previous
versions of PHP, this macro retrieved the number of arguments with
which the function has been called based on the function's hash
table entry, ht, which is passed in the
INTERNAL_FUNCTION_PARAMETERS list. As
ht itself now contains the number of arguments that
have been passed to the function, ZEND_NUM_ARGS
has been stripped down to a dummy macro (see its definition in
zend_API.h). But it's still good practice to
use it, to remain compatible with future changes in the call
interface. Note: The old PHP equivalent of
this macro is ARG_COUNT.

The following code checks for the correct number of arguments:

if(ZEND_NUM_ARGS() != 2) WRONG_PARAM_COUNT;

If the function is not called with two
arguments, it exits with an error message. The code snippet above
makes use of the tool macro WRONG_PARAM_COUNT,
which can be used to generate a standard error message like:

"Warning: Wrong parameter count for firstmodule() in /home/www/htdocs/firstmod.php on line 5"

This macro prints a default error message and then returns to the caller.
Its definition can also be found in zend_API.h and looks
like this:

As you can see, it calls an internal function
named wrong_param_count() that's responsible for printing
the warning. For details on generating customized error
messages, see the later section "Printing Information."
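The pattern of "print a warning, then leave the function" can be sketched without any Zend headers. In this stand-alone model, demo_wrong_param_count(), DEMO_WRONG_PARAM_COUNT, and demo_add() are hypothetical names, and the num_args parameter plays the role of ZEND_NUM_ARGS(); the real macro and its helper may differ in detail.

```c
#include <stdio.h>

static int demo_warnings = 0; /* counts emitted warnings, for illustration */

static void demo_wrong_param_count(const char *function_name)
{
    demo_warnings++;
    fprintf(stderr, "Warning: Wrong parameter count for %s()\n", function_name);
}

/* Hypothetical stand-in: report the problem, then leave the calling function.
   Usable only inside functions that return void. */
#define DEMO_WRONG_PARAM_COUNT(name) \
    do { demo_wrong_param_count(name); return; } while (0)

/* num_args plays the role of ZEND_NUM_ARGS(); *result keeps its old value
   when the check fails. */
void demo_add(int num_args, const long *args, long *result)
{
    if (num_args != 2) DEMO_WRONG_PARAM_COUNT("demo_add");
    *result = args[0] + args[1];
}
```

Since the macro contains a return statement, no code after the check runs when the argument count is wrong.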

Retrieving Arguments

Note:
New parameter parsing API

This chapter documents the new Zend parameter parsing API
introduced by Andrei Zmievski. It was introduced in the
development stage between PHP 4.0.6 and 4.1.0.

Parsing parameters is a very common operation and it may get a bit
tedious. It would also be nice to have standardized error checking
and error messages. Since PHP 4.1.0, there is a way to do just
that by using the new parameter parsing API. It greatly simplifies
the process of receiving parameters, but it has the drawback
that it can't be used for functions that expect a variable number of
parameters. Since the vast majority of functions do not fall
into that category, this parsing API is recommended as the new
standard way.

The first argument to zend_parse_parameters() is the number
of actual parameters passed to your function, so
ZEND_NUM_ARGS() can be used for that. The
second parameter should always be the TSRMLS_CC
macro. The third argument is a string that specifies the number
and types of arguments your function is expecting, similar to how
a printf format string specifies the number and format of the output
values it should operate on. Finally, the remaining arguments
are pointers to variables that receive the values from the
parameters.

zend_parse_parameters() also performs type
conversions whenever possible, so that you always receive the data
in the format you asked for. Any type of scalar can be converted
to another one, but conversions between complex types (arrays,
objects, and resources) and scalar types are not allowed.

If the parameters could be obtained successfully and there were no
errors during type conversion, the function will return
SUCCESS, otherwise it will return
FAILURE. The function will output informative
error messages, if the number of received parameters does not
match the requested number, or if type conversion could not be
performed.

Of course each error message is accompanied by the filename and
line number on which it occurs.

Here is the full list of type specifiers:

l - long

d - double

s - string (with possible null bytes) and its length

b - boolean

r - resource, stored in zval*

a - array, stored in zval*

o - object (of any class), stored in zval*

O - object (of class specified by class entry), stored in zval*

z - the actual zval*

The following characters also have a meaning in the specifier
string:

| - indicates that the remaining
parameters are optional. The storage variables
corresponding to these parameters should be initialized to
default values by the extension, since they will not be
touched by the parsing function if the parameters are not
passed.

/ - the parsing function will
call SEPARATE_ZVAL_IF_NOT_REF() on
the parameter it follows, to provide a copy of the
parameter, unless it's a reference.

! - the parameter it follows can
be of specified type or NULL (only
applies to a, o, O, r, and z). If the user
passes a NULL value, the storage pointer will be
set to NULL.
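To get a feeling for how such a specifier string is read, consider the following stand-alone sketch. It is not how the parsing function is implemented; demo_count_params() is a hypothetical helper that only counts how many script-level parameters a specifier string describes, using nothing but the characters listed above.

```c
#include <string.h>

/* Counts the parameters described by a type specifier string.
   Returns 0 and fills the counters on success, -1 on an unknown character.
   Only the letters and modifiers listed above are recognized. */
int demo_count_params(const char *spec, int *min_args, int *max_args)
{
    int seen_optional = 0;
    *min_args = 0;
    *max_args = 0;

    for (; *spec != '\0'; spec++) {
        if (*spec == '|') {
            seen_optional = 1;          /* everything after this is optional */
        } else if (*spec == '/' || *spec == '!') {
            continue;                   /* modifiers do not consume a parameter */
        } else if (strchr("ldsbraoOz", *spec) != NULL) {
            (*max_args)++;
            if (!seen_optional) {
                (*min_args)++;
            }
        } else {
            return -1;
        }
    }
    return 0;
}
```

For the string "s|lb", this helper reports one required and three accepted parameters. Keep in mind that the count refers to parameters as the script passes them; an s specifier still needs two storage variables on the C side, one for the string and one for its length.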

The best way to illustrate the usage of this function is through
examples:

Note that in the last example we pass 3 for the number of received
parameters, instead of ZEND_NUM_ARGS(). This
lets us receive the minimum number of parameters if our
function expects a variable number of them. Of course, if you want
to operate on the rest of the parameters, you will have to use
zend_get_parameters_array_ex() to obtain
them.

The parsing function has an extended version that allows for an
additional flags argument that controls its actions.

The only flag you can pass currently is ZEND_PARSE_PARAMS_QUIET,
which instructs the function to not output any error messages
during its operation. This is useful for functions that expect
several sets of completely different arguments, but you will have
to output your own error messages.

For example, here is how you would get either a set of three longs
or a string:

With all the ways of receiving function parameters described above,
you should have a good handle on this process. For even more
examples, look through the source code for extensions that are
shipped with PHP - they illustrate every conceivable situation.

Old way of retrieving arguments (deprecated)

Note:
Deprecated parameter parsing API

This API is deprecated and superseded by the new Zend
parameter parsing API.

After having checked the number of arguments, you need to get access
to the arguments themselves. This is done with the help of
zend_get_parameters_ex():

All arguments are stored in a zval container,
which needs to be pointed to twice. The snippet above
tries to retrieve one argument and make it available to us via the
parameter pointer.

zend_get_parameters_ex() accepts at least two
arguments. The first argument is the number of arguments to
retrieve (which should match the number of arguments with which
the function has been called; this is why it's important to check
for correct call syntax). The second argument (and all following
arguments) are pointers to pointers to pointers to
zvals. (Confusing, isn't it?) All these pointers
are required because Zend works internally with
**zval; to adjust a local **zval in
our function, zend_get_parameters_ex() requires
a pointer to it.

The return value of zend_get_parameters_ex()
can either be SUCCESS or
FAILURE, indicating (unsurprisingly) success or
failure of the argument processing. A failure is most likely
related to an incorrect number of arguments being specified, in
which case you should exit with
WRONG_PARAM_COUNT.

zend_get_parameters_ex() only checks whether
you're trying to retrieve too many parameters. If the function is
called with five arguments, but you're only retrieving three of
them with zend_get_parameters_ex(), you won't
get an error but will get the first three parameters instead.
Subsequent calls of zend_get_parameters_ex()
won't retrieve the remaining arguments, but will get the same
arguments again.
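The three levels of indirection and the "too many parameters" check are easier to follow in a small model. The sketch below is stand-alone and purely illustrative: demo_zval is a reduced, hypothetical container, the passed array stands in for the arguments of the current call, and demo_get_parameters() is not the real API function.

```c
#include <stdarg.h>

typedef struct { long lval; } demo_zval; /* hypothetical, reduced container */

#define DEMO_SUCCESS 0
#define DEMO_FAILURE (-1)

/* passed/passed_count model the arguments of the current call. Each variadic
   argument is a demo_zval ***: the address of a local demo_zval ** that is
   made to point at the matching slot. */
int demo_get_parameters(demo_zval **passed, int passed_count, int wanted, ...)
{
    va_list targets;
    int i;

    if (wanted > passed_count) {
        return DEMO_FAILURE;            /* asking for more than was passed */
    }

    va_start(targets, wanted);
    for (i = 0; i < wanted; i++) {
        demo_zval ***target = va_arg(targets, demo_zval ***);
        *target = &passed[i];           /* always starts at the first slot */
    }
    va_end(targets);
    return DEMO_SUCCESS;
}
```

Asking for fewer parameters than were passed succeeds and always yields the leading ones, so a second call starts over at the first slot instead of continuing where the previous call stopped.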

Dealing with a Variable Number of Arguments/Optional Parameters

If your function is meant to accept a variable number of
arguments, the snippets just described are sometimes suboptimal
solutions. You have to create a line calling
zend_get_parameters_ex() for every possible
number of arguments, which is often unsatisfying.

For this case, you can use the
function zend_get_parameters_array_ex(), which accepts the
number of parameters to retrieve and an array in which to store them:

First, the number of arguments is checked to make sure that it's in the accepted range. After that,
zend_get_parameters_array_ex() is used to
fill parameter_array with valid pointers to the argument
values.

fsockopen() accepts two, three, four, or five
parameters. After the obligatory variable declarations, the
function checks for the correct range of arguments. Then it uses a
fall-through mechanism in a switch() statement
to deal with all arguments. The switch()
statement starts with the maximum number of arguments being passed
(five). After that, it automatically processes the case of four
arguments being passed, then three, by omitting the otherwise
obligatory break keyword in all stages. After
having processed the last case, it exits the
switch() statement and does the minimal
argument processing needed if the function is invoked with only
two arguments.

This multiple-stage type of processing, similar to a stairway, allows
convenient processing of a variable number of arguments.
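The stairway idea can be reduced to a few lines of plain C. This is a sketch under stated assumptions rather than the fsockopen() source: demo_open_request, demo_collect_args(), the string-based argument array, and the default timeout of 60 are all hypothetical and exist only to show the fall-through.

```c
#include <stdlib.h>

typedef struct {
    const char *host;
    long port;
    int wants_errno;    /* third argument supplied */
    int wants_errstr;   /* fourth argument supplied */
    double timeout;     /* fifth argument, or the default */
} demo_open_request;

/* args holds the raw arguments as strings; argc is their count (2 to 5).
   Returns 0 on success, -1 if the count is out of range. */
int demo_collect_args(int argc, const char **args, demo_open_request *req)
{
    if (argc < 2 || argc > 5) {
        return -1;
    }

    req->wants_errno = 0;
    req->wants_errstr = 0;
    req->timeout = 60.0;                /* assumed default for this sketch */

    switch (argc) {
        case 5:
            req->timeout = strtod(args[4], NULL);
            /* fall through */
        case 4:
            req->wants_errstr = 1;
            /* fall through */
        case 3:
            req->wants_errno = 1;
            break;
    }

    /* Minimal processing shared by every valid call. */
    req->host = args[0];
    req->port = strtol(args[1], NULL, 10);
    return 0;
}
```

Each case handles exactly one optional argument and then drops into the next lower case, so the code for an argument is written only once no matter how many arguments are passed.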

Accessing Arguments

To access arguments, it's necessary for each argument to have a
clearly defined type. Again, PHP's extremely dynamic nature
introduces some quirks. Because PHP never does any kind of type
checking, it's possible for a caller to pass any kind of data to
your functions, whether you want it or not. If you expect an
integer, for example, the caller might pass an array, and vice
versa - PHP simply won't notice.

To work around this, you have to use a set of API functions to
force a type conversion on every argument that's being passed (see
Argument Conversion Functions).

Note: All conversion functions expect a
**zval as parameter.

Argument Conversion Functions

Function

Description

convert_to_boolean_ex(value)

Forces conversion to a Boolean type. Boolean values remain
untouched. Longs, doubles, and strings containing
0 as well as NULL values will result in
Boolean 0 (FALSE). Arrays and objects are
converted based on the number of entries or properties,
respectively, that they have. Empty arrays and objects are
converted to FALSE; otherwise, to TRUE. All other values
result in a Boolean 1 (TRUE).

convert_to_long_ex(value)

Forces conversion to a long, the default integer type. NULL
values, Booleans, resources, and of course longs remain
untouched. Doubles are truncated. Strings containing an
integer are converted to their corresponding numeric
representation, otherwise resulting in 0.
Arrays and objects are converted to 0 if
empty, 1 otherwise.

convert_to_double_ex(value)

Forces conversion to a double, the default floating-point
type. NULL values, Booleans, resources, longs, and of course
doubles remain untouched. Strings containing a number are
converted to their corresponding numeric representation,
otherwise resulting in 0.0. Arrays and
objects are converted to 0.0 if empty,
1.0 otherwise.

convert_to_string_ex(value)

Forces conversion to a string. Strings remain untouched. NULL
values are converted to an empty string. Booleans containing
TRUE are converted to "1", otherwise
resulting in an empty string. Longs and doubles are converted
to their corresponding string representation. Arrays are
converted to the string "Array" and
objects to the string "Object".

convert_to_array_ex(value)

Forces conversion to an array. Arrays remain untouched.
Objects are converted to an array by assigning all their
properties to the array table. All property names are used as
keys, property contents as values. NULL values are converted
to an empty array. All other values are converted to an array
that contains the specific source value in the element with
the key 0.

convert_to_object_ex(value)

Forces conversion to an object. Objects remain untouched.
NULL values are converted to an empty object. Arrays are
converted to objects by introducing their keys as properties
into the objects and their values as corresponding property
contents in the object. All other types result in an object
with the property scalar, having the
corresponding source value as content.

convert_to_null_ex(value)

Forces the type to become a NULL value, meaning empty.

Note:

You can find a demonstration of the behavior in
cross_conversion.php on the accompanying
CD-ROM.

Using these functions on your arguments will ensure type safety
for all data that's passed to you. If the supplied type doesn't
match the required type, PHP forces dummy contents on the
resulting value (empty strings, arrays, or objects,
0 for numeric values, FALSE
for Booleans) to ensure a defined state.
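What "forcing a defined state" means can be shown with a small tagged union. The sketch is stand-alone and simplified: demo_value, its three payload members, and demo_convert_to_long() are hypothetical, and the rules it applies are a reduced version of the ones in the table above, not the engine's own conversion code.

```c
#include <stdlib.h>

typedef enum { DEMO_NULL, DEMO_LONG, DEMO_DOUBLE, DEMO_STRING } demo_type;

typedef struct {
    demo_type type;
    union {
        long lval;
        double dval;
        const char *str;
    } value;
} demo_value;

/* Forces v to type DEMO_LONG in place. Anything that cannot be read as a
   number ends up as 0, so the caller always sees a defined state. */
void demo_convert_to_long(demo_value *v)
{
    long result = 0;

    switch (v->type) {
        case DEMO_LONG:   result = v->value.lval;                  break;
        case DEMO_DOUBLE: result = (long) v->value.dval;           break; /* truncates */
        case DEMO_STRING: result = strtol(v->value.str, NULL, 10); break;
        case DEMO_NULL:   result = 0;                              break;
    }
    v->type = DEMO_LONG;
    v->value.lval = result;
}
```

After the call, the type tag and the union member agree again, whatever the caller originally passed in.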

Following is a quote from the sample module discussed
previously, which makes use of the conversion functions:

After retrieving the parameter pointer, the parameter value is
converted to a long (an integer), which also forms the return value of
this function. Understanding access to the contents of the value requires a
short discussion of the zval type, whose definition is shown in "PHP/Zend zval type definition".

Actually, pval (defined in php.h) is
only an alias of zval (defined in zend.h),
which in turn refers to _zval_struct. This is a most interesting
structure. _zval_struct is the "master" structure, containing
the value structure, type, and reference information. The substructure
zvalue_value is a union that contains the variable's contents.
Depending on the variable's type, you'll have to access different members of
this union. For a description of both structures, see
Zend zval Structure,
Zend zvalue_value Structure and
Zend Variable Type Constants.

is_ref

0 means that this variable is not a reference; 1 means that this variable is a reference to another variable.

refcount

The number of references that exist for this variable. For
every new reference to the value stored in this variable,
this counter is increased by 1. For every lost reference,
this counter is decreased by 1. When the reference counter
reaches 0, no references exist to this value anymore, which
causes automatic freeing of the value.

Zend zvalue_value Structure

Entry

Description

lval

Use this property if the variable is of the
type IS_LONG,
IS_BOOL, or IS_RESOURCE.

dval

Use this property if the variable is of the
type IS_DOUBLE.

str

This structure can be used to access variables of
the type IS_STRING. The member len contains the
string length; the member val points to the string itself. Zend
uses C strings; thus, the string data ends with a trailing
0x00 that is not counted in len.

ht

This entry points to the variable's hash table entry if the variable is an array.

obj

Use this property if the variable is of the
type IS_OBJECT.

Zend Variable Type Constants

Constant

Description

IS_NULL

Denotes a NULL (empty) value.

IS_LONG

A long (integer) value.

IS_DOUBLE

A double (floating point) value.

IS_STRING

A string.

IS_ARRAY

Denotes an array.

IS_OBJECT

An object.

IS_BOOL

A Boolean value.

IS_RESOURCE

A resource (for a discussion of resources, see the
appropriate section below).

IS_CONSTANT

A constant (defined) value.

To access a long you access zval.value.lval, to
access a double you use zval.value.dval, and so on.
Because all values are stored in a union, trying to access data
with incorrect union members results in meaningless output.
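The relationship between the type field and the union members can be sketched as follows. The structure is a hypothetical, reduced model built from the descriptions in the tables above (demo_zval, demo_zvalue_value, and the DEMO_IS_* constants are not the real definitions), and demo_describe() is an illustrative helper.

```c
#include <stdio.h>

enum { DEMO_IS_LONG, DEMO_IS_DOUBLE, DEMO_IS_STRING };

typedef union {
    long lval;
    double dval;
    struct { char *val; int len; } str;
} demo_zvalue_value;

typedef struct {
    demo_zvalue_value value;
    unsigned char type;
    unsigned char is_ref;
    short refcount;
} demo_zval;

/* Writes a printable form of z into buf, choosing the union member
   from the type tag. Returns whatever snprintf() returns. */
int demo_describe(const demo_zval *z, char *buf, size_t size)
{
    switch (z->type) {
        case DEMO_IS_LONG:
            return snprintf(buf, size, "long(%ld)", z->value.lval);
        case DEMO_IS_DOUBLE:
            return snprintf(buf, size, "double(%.2f)", z->value.dval);
        case DEMO_IS_STRING:
            return snprintf(buf, size, "string(%d) \"%.*s\"",
                            z->value.str.len, z->value.str.len, z->value.str.val);
        default:
            return snprintf(buf, size, "unknown");
    }
}
```

The switch on the type field is the important part: every read of the union goes through the tag first, which is what keeps the access meaningful.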

Accessing arrays and objects is a bit more complicated and
is discussed later.

Dealing with Arguments Passed by Reference

If your function accepts arguments passed by reference that you
intend to modify, you need to take some precautions.

What we haven't said yet is that under the circumstances presented so
far, you don't have write access to any zval containers
designating function parameters that have been passed to you. Of course, you
can change any zval containers that you created within
your function, but you mustn't change any zvals that refer to
Zend-internal data!

We've only discussed the so-called *_ex() API
so far. You may have noticed that the API functions we've used are
called zend_get_parameters_ex() instead of
zend_get_parameters(),
convert_to_long_ex() instead of
convert_to_long(), etc. The
*_ex() functions form the so-called new
"extended" Zend API. They give a minor speed increase over the old
API, but as a tradeoff are only meant for providing read-only
access.

Because Zend works internally with references, different variables
may reference the same value. Write access to a
zval container requires this container to contain
an isolated value, meaning a value that's not referenced by any
other containers. If a zval container were
referenced by other containers and you changed the referenced
zval, you would automatically change the contents
of the other containers referencing this zval
(because they'd simply point to the changed value and thus change
their own value as well).

zend_get_parameters_ex() doesn't care about
this situation, but simply returns a pointer to the desired
zval containers, whether they consist of references
or not. Its corresponding function in the traditional API,
zend_get_parameters(), immediately checks for
referenced values. If it finds a reference, it creates a new,
isolated zval container; copies the referenced data
into this newly allocated space; and then returns a pointer to the
new, isolated value.

This action is called zval separation
(or pval separation). Because the *_ex() API
doesn't perform zval separation, it's considerably faster, while
at the same time disabling write access.

To change parameters, however, write access is required. Zend deals
with this situation in a special way: Whenever a parameter to a function is
passed by reference, it performs automatic zval separation. This means that
whenever you're calling a function like
this in PHP, Zend will automatically ensure
that $parameter is being passed as an isolated value, rendering it
to a write-safe state:

my_function(&$parameter);

But this is not the case with regular parameters!
All other parameters that are not passed by reference are in a read-only
state.

This requires you to make sure that you're really working with a
reference - otherwise you might produce unwanted results. To check for a
parameter being passed by reference, you can use the macro
PZVAL_IS_REF. This macro accepts a zval*
and checks whether it is a reference. Examples are given in
"Testing for referenced parameter passing".

Assuring Write Safety for Other Parameters

You might run into a situation in which you need write access to a
parameter that's retrieved with zend_get_parameters_ex()
but not passed by reference. For this case, you can use the macro
SEPARATE_ZVAL, which does a zval separation on the provided
container. The newly generated zval is detached from internal
data and has only a local scope, meaning that it can be changed or destroyed
without implying global changes in the script context:

SEPARATE_ZVAL uses emalloc()
to allocate the new zval container, which means that even if you
don't deallocate this memory yourself, it will be destroyed automatically upon
script termination. However, doing a lot of calls to this macro
without freeing the resulting containers will clutter up your RAM.
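The effect of a separation can be modeled with a reference counter and a copy. The following stand-alone sketch makes several assumptions: demo_zval is a reduced, hypothetical container, malloc() stands in for emalloc(), and the is_ref test models the "unless it's a reference" behavior described earlier for the / modifier. It makes no claim about what the real macros do internally.

```c
#include <stdlib.h>

typedef struct {
    long lval;
    int refcount;   /* how many holders point at this container */
    int is_ref;     /* 1 if the holders are meant to share changes */
} demo_zval;

/* If *slot is shared and is not a reference, replace it with a private
   copy. Returns 1 if a copy was made, 0 if none was needed, -1 if the
   allocation failed. */
int demo_separate(demo_zval **slot)
{
    demo_zval *shared = *slot;
    demo_zval *copy;

    if (shared->is_ref || shared->refcount <= 1) {
        return 0;                       /* already safe to write */
    }

    copy = malloc(sizeof *copy);
    if (copy == NULL) {
        return -1;
    }
    copy->lval = shared->lval;
    copy->refcount = 1;
    copy->is_ref = 0;

    shared->refcount--;                 /* this holder lets go of the original */
    *slot = copy;
    return 1;
}
```

After a successful separation, writing through *slot changes only the private copy; the other holders of the original container keep seeing the old value. In this model, the copy has to be released with free() once it is no longer needed.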

Note: As you can easily work around the lack
of write access in the "traditional" API (with
zend_get_parameters() and so on), this API
seems to be obsolete, and is not discussed further in this
chapter.

Creating Variables

When exchanging data from your own extensions with PHP scripts, one
of the most important issues is the creation of variables. This
section shows you how to deal with the variable types that PHP
supports.

Overview

To create new variables that can be seen "from the outside" by the
executing script, you need to allocate a new zval
container, fill this container with meaningful values, and then
introduce it to Zend's internal symbol table. This basic process
is common to all variable creations:

zval *new_variable;
/* allocate and initialize new container */
MAKE_STD_ZVAL(new_variable);
/* set type and variable contents here, see the following sections */
/* introduce this variable by the name "new_variable_name" into the symbol table */
ZEND_SET_SYMBOL(EG(active_symbol_table), "new_variable_name", new_variable);
/* the variable is now accessible to the script by using $new_variable_name */

The macro MAKE_STD_ZVAL allocates a new
zval container using ALLOC_ZVAL
and initializes it using INIT_ZVAL. As
implemented in Zend at the time of this writing,
initializing means setting the reference
count to 1 and clearing the
is_ref flag, but this process could be extended
later - this is why it's a good idea to keep using
MAKE_STD_ZVAL instead of only using
ALLOC_ZVAL. If you want to optimize for speed
(and you don't have to explicitly initialize the
zval container here), you can use
ALLOC_ZVAL, but this isn't recommended because
it doesn't ensure data integrity.

ZEND_SET_SYMBOL takes care of introducing the
new variable to Zend's symbol table. This macro checks whether the
value already exists in the symbol table and converts the new
symbol to a reference if so (with automatic deallocation of the
old zval container). This is the preferred method
if speed is not a crucial issue and you'd like to keep memory
usage low.

Note that ZEND_SET_SYMBOL makes use of the Zend
executor globals via the macro EG. By
specifying EG(active_symbol_table), you get access to the
currently active symbol table, dealing with the active, local scope. The local
scope may differ depending on whether the function was invoked from
within a function.

If you need to optimize for speed and don't care about optimal memory
usage, you can omit the check for an existing variable with the same name and instead
force insertion into the symbol table by using
zend_hash_update():

The variables generated with the snippet above will always be of local
scope, so they reside in the context in which the function has been called. To
create new variables in the global scope, use the same method
but refer to another symbol table:

The macro ZEND_SET_SYMBOL is now being
called with a reference to the main, global symbol table by referring to
EG(symbol_table).

Note: The active_symbol_table
variable is a pointer, but symbol_table is not.
This is why you have to use
EG(active_symbol_table) and
&EG(symbol_table) as parameters to
ZEND_SET_SYMBOL - it requires a pointer.
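Why one table is passed as is and the other with an address operator becomes obvious in a small model. Everything below is a hypothetical stand-in written for illustration: demo_table is a tiny fixed-size table instead of a hash table, demo_executor_globals contains only the two members discussed here, and DEMO_EG and demo_set_symbol() are not the real macros.

```c
#include <string.h>

#define DEMO_MAX_SYMBOLS 8

typedef struct {
    const char *names[DEMO_MAX_SYMBOLS];
    long values[DEMO_MAX_SYMBOLS];
    int count;
} demo_table;

/* Hypothetical executor globals: the global table is stored by value,
   the active table is a pointer that is re-aimed on every function call. */
typedef struct {
    demo_table symbol_table;
    demo_table *active_symbol_table;
} demo_executor_globals;

static demo_executor_globals demo_eg;
#define DEMO_EG(member) (demo_eg.member)

/* Inserts or overwrites name in table. Returns 0 on success, -1 if full. */
int demo_set_symbol(demo_table *table, const char *name, long value)
{
    int i;
    for (i = 0; i < table->count; i++) {
        if (strcmp(table->names[i], name) == 0) {
            table->values[i] = value;
            return 0;
        }
    }
    if (table->count == DEMO_MAX_SYMBOLS) {
        return -1;
    }
    table->names[table->count] = name;
    table->values[table->count] = value;
    table->count++;
    return 0;
}
```

A local variable would be created with demo_set_symbol(DEMO_EG(active_symbol_table), ...), a global one with demo_set_symbol(&DEMO_EG(symbol_table), ...). In both cases the function receives a pointer to a table; only the way that pointer is obtained differs.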

Similarly, to get a more efficient version, you can hardcode the
symbol table update:

"Creating variables with different scopes" shows a sample source that
creates two variables - local_variable with a local scope
and global_variable with a global scope (see Figure 9.7).
The full example can be found on the CD-ROM.

Note: You can see that the global variable is actually not accessible from
within the function. This is because it's not imported into the local scope
using global $global_variable; in the PHP source.

Longs (Integers)

Now let's get to the assignment of data to variables, starting with
longs. Longs are PHP's integers and are very simple to store. Looking at
the zval.value container structure discussed earlier in this
chapter, you can see that the long data type is directly contained in the union,
namely in the lval field. The corresponding
type value for longs is IS_LONG
(see "Creation of a long").

Doubles (Floats)

Doubles are PHP's floats and are as easy to assign as longs, because their value
is also contained directly in the union. The member in the
zval.value container is dval;
the corresponding type is IS_DOUBLE.

Strings

Strings need slightly more effort. As mentioned earlier, all strings
that will be associated with Zend's internal data structures need to be
allocated using Zend's own memory-management functions. Referencing of static
strings or strings allocated with standard routines is not allowed. To assign
strings, you have to access the structure str in
the zval.value container. The corresponding type
is IS_STRING:

ZVAL_STRING accepts a third parameter that
indicates whether the supplied string contents should be duplicated (using
estrdup()). Setting this parameter
to 1 causes the string to be
duplicated; 0 simply uses the supplied pointer for the
variable contents. This is most useful if you want to create a new variable
referring to a string that's already allocated in Zend internal memory.

If you want to truncate the string at a certain position or you
already know its length, you can use ZVAL_STRINGL(zval,
string, length, duplicate), which accepts an explicit
string length to be set for the new string. This macro is faster
than ZVAL_STRING and also binary-safe.
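The difference between the two macros, and the meaning of the duplicate flag, can be sketched in plain C. This model is stand-alone and hypothetical: demo_string mirrors only the val and len members described earlier, malloc() stands in for Zend's own allocator, and demo_set_string() and demo_set_stringl() are illustrative helpers rather than the real macros.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    char *val;
    int len;
} demo_string;

/* Stores len bytes from src in out. With duplicate set, a private,
   NUL-terminated copy is made; otherwise out simply adopts src.
   Returns 0 on success, -1 if the copy could not be allocated. */
int demo_set_stringl(demo_string *out, char *src, int len, int duplicate)
{
    if (duplicate) {
        char *copy = malloc((size_t) len + 1);
        if (copy == NULL) {
            return -1;
        }
        memcpy(copy, src, (size_t) len);
        copy[len] = '\0';
        out->val = copy;
    } else {
        out->val = src;
    }
    out->len = len;
    return 0;
}

/* Convenience form: the length is measured with strlen(), so the value
   ends at the first NUL byte. */
int demo_set_string(demo_string *out, char *src, int duplicate)
{
    return demo_set_stringl(out, src, (int) strlen(src), duplicate);
}
```

The explicit-length form skips the strlen() scan and keeps bytes that follow an embedded NUL, which is what "faster" and "binary-safe" refer to in this model.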

To create empty strings, set the string length to 0 and
use empty_string as contents:

Booleans

The corresponding macros for this type
are ZVAL_BOOL (allowing specification of the value) as well
as ZVAL_TRUE and ZVAL_FALSE (which
explicitly set the value to TRUE and FALSE,
respectively).

Arrays

Arrays are stored using Zend's internal hash tables, which can be
accessed using the zend_hash_*() API. For every
array that you want to create, you need a new hash table handle,
which will be stored in the ht member of the
zval.value container.

There's a whole API solely for the creation of arrays, which is extremely
handy. To start a new array, you call
array_init().

Adds a string with the desired
length length to the array. This function is faster and binary-safe. Otherwise, behaves like add_index_string().

add_next_index_zval(zval *array, zval *value)

Adds a zval to the array. Useful for adding other arrays, objects, streams, etc.

All these functions provide a handy abstraction to Zend's internal hash
API. Of course, you can also use the hash functions directly - for example, if
you already have a zval container allocated that you want to
insert into an array. This is done using zend_hash_update()
for associative arrays (see "Adding an element to an associative array") and
zend_hash_index_update() for indexed arrays
(see "Adding an element to an indexed array"):

Note: To return arrays from a function, use array_init() and
all following actions on the predefined variable return_value
(given as argument to your exported function; see the earlier discussion of the call interface). You do not have to use
MAKE_STD_ZVAL on this.

Tip: To avoid having to
write new_array->value.ht every time, you can
use HASH_OF(new_array), which is also recommended for
compatibility and style reasons.

Objects

Since objects can be converted to arrays (and vice versa), you
might have already guessed that they have a lot of similarities to
arrays in PHP. Objects are maintained with the same hash
functions, but there's a different API for creating them.

Adds a string of the specified length to the object. This
function is faster than add_property_string() and also
binary-safe.

add_property_zval(zval *object, char *key, zval *container)

Adds a zval container to the object. This is useful if you
have to add properties which aren't simple types like integers or strings but
arrays or other objects.

Resources

Resources are a special kind of data type in PHP. The term
resources doesn't really refer to any special
kind of data, but to an abstraction method for maintaining any kind
of information. Resources are kept in a special resource list within
Zend. Each entry in the list has a corresponding type definition
that denotes the kind of resource to which it refers. Zend then
internally manages all references to this resource. Access to a
resource is never possible directly - only via a provided API. As soon
as all references to a specific resource are lost, a corresponding
shutdown function is called.

For example, resources are used to store database links and file
descriptors. The de facto standard implementation
can be found in the MySQL module, but other modules such as the Oracle
module also make use of resources.

Note:

In fact, a resource can be a pointer to anything you need to
handle in your functions (e.g. pointer to a structure) and the
user only has to pass a single resource variable to your
function.

To create a new resource, you first need to register a resource
destruction handler for it. Since you can store any kind of data as a
resource, Zend needs to know how to free the resource once it is no longer
needed. You therefore register your own resource destruction handler
with Zend, which calls it whenever your resource can be
freed (whether manually or automatically). Registering your resource
handler with Zend returns the resource
type handle for that resource. This handle is needed
whenever you want to access a resource of this type later, and is
usually stored in a global static variable within your extension.
There is no need to worry about thread safety here because you only
register your resource handler once, during module initialization.

There are two different kinds of resource destruction handlers you can
pass to the registration function: a handler for normal resources and a handler
for persistent resources. Persistent resources are used, for example,
for database connections. When registering a resource type, one of these
handlers must be given. For the other handler, just pass
NULL.

zend_register_list_destructors_ex() accepts the
following parameters:

ld

Normal resource destruction
handler callback

pld

Persistent resource destruction
handler callback

type_name

A string specifying the name of
your resource. It's a good idea to
specify a name that is unique within PHP for the resource type,
so that when the user calls, for example,
var_dump($resource);
the name of the resource is shown as well.

module_number

The module_number
is automatically available in your
PHP_MINIT_FUNCTION
function and therefore you just pass it over.

The return value is a unique integer ID for your
resource type.

The resource destruction handler (for either normal or persistent
resources) has the following prototype:

void resource_destruction_handler(zend_rsrc_list_entry *rsrc TSRMLS_DC);

One important thing to mention: if your resource
is a rather complex structure that also contains pointers to
memory you allocated at runtime, you have to free those pointers
before freeing
the resource itself!
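As a minimal sketch of this rule, the following standalone C code models a resource that owns a second allocation. It does not use the Zend headers: my_resource, my_resource_create(), my_resource_destroy(), and the g_free_calls counter are hypothetical names, and plain malloc()/free() stand in for whatever allocator your extension uses.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical resource: a struct that owns a second heap allocation. */
typedef struct {
    char *name;    /* allocated separately at runtime */
    int   handle;  /* some opaque handle, for illustration only */
} my_resource;

/* Counts calls to free(), purely for demonstration. */
static int g_free_calls = 0;

/* Allocates the structure and its inner string; returns NULL on failure. */
my_resource *my_resource_create(const char *name, int handle)
{
    my_resource *res = malloc(sizeof *res);
    if (res == NULL) {
        return NULL;
    }
    res->name = malloc(strlen(name) + 1);
    if (res->name == NULL) {
        free(res);
        return NULL;
    }
    strcpy(res->name, name);
    res->handle = handle;
    return res;
}

/* Destruction handler: release inner pointers before the structure. */
void my_resource_destroy(void *ptr)
{
    my_resource *res = ptr;
    if (res == NULL) {
        return;
    }
    free(res->name);  /* inner allocation first */
    g_free_calls++;
    free(res);        /* then the resource itself */
    g_free_calls++;
}
```

Freeing in the opposite order would read res->name after res has already been released, which is undefined behavior.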

Now that we have defined what our resource is and written our
resource destruction handler, we can go on with the remaining steps:

create a global variable within the extension holding
the resource ID so it can be accessed from every function
which needs it

To actually register a new resource, you can use either
the zend_register_resource() function or
the ZEND_REGISTER_RESOURCE() macro, both
defined in zend_list.h. Although the arguments for both map
1:1, it's a good idea to always use the macro to remain upward
compatible:

int ZEND_REGISTER_RESOURCE(zval *rsrc_result, void *rsrc_pointer, int rsrc_type);

The returned rsrc_id uniquely identifies the newly
registered resource. You can use the macro
RETURN_RESOURCE to return it to the user:

RETURN_RESOURCE(rsrc_id)

Note:

It is common practice that if you want to return the resource
immediately to the user you specify the return_value
as the zval * container.

Zend now keeps track of all references to this resource. As soon as
all references to the resource are lost, the destructor that you
previously registered for this resource is called. The nice thing
about this setup is that you don't have to worry about memory leaks
introduced by allocations in your module: just register all memory
allocations that your calling script will refer to as resources. As
soon as the script decides it doesn't need them anymore, Zend will
find out and tell you.

Once the user has the resource, at some point it will be passed
back to one of your functions. The value.lval inside
the zval * container contains the key to your
resource and can thus be used to fetch the resource with the following
macro:

ZEND_FETCH_RESOURCE(rsrc, rsrc_type, rsrc_id, default_rsrc_id, resource_type_name, resource_type)

rsrc

This is your pointer, which will
point to your previously registered resource.

rsrc_type

This is the typecast argument for
your pointer, e.g. myresource *.

rsrc_id

This is the address of the
zval *container the user passed to
your function, e.g. &z_resource if
zval *z_resource is given.

default_rsrc_id

This integer specifies the default
resource ID to use if no resource could be fetched,
or -1 if there is no default.

resource_type_name

This is the name of the requested resource.
It's a string and is used when the resource can't be
found or is invalid to form a meaningful error
message.

resource_type

The resource_type
you got back when registering the resource destruction handler.
In our example this was le_myresource.

This macro has no return value.
It exists for the developer's convenience: it takes care
of passing the TSRMLS arguments and also checks whether the resource
could be fetched.
If there was a problem retrieving the resource, it raises a warning
and returns from the current PHP function with NULL.

To force removal of a resource from the list, use the function
zend_list_delete(). You can also force the
reference count to increase if you know that you're creating another
reference for a previously allocated value (for example, if you're
automatically reusing a default database link). For this case, use the
function zend_list_addref(). To search for
previously allocated resource entries, use
zend_list_find(). The complete API can be found
in zend_list.h.
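The bookkeeping described in this section can be pictured with a small standalone model. This is only a sketch under simplifying assumptions, not Zend's implementation: the fixed-size rsrc_list array, the count_dtor() sample destructor, and the functions rsrc_register(), rsrc_addref(), rsrc_delref(), and rsrc_find() are hypothetical stand-ins that loosely mirror registering a resource, zend_list_addref(), zend_list_delete(), and zend_list_find().

```c
#include <stddef.h>

#define MAX_RESOURCES 16

typedef void (*rsrc_dtor_fn)(void *ptr);

/* One slot in the toy resource list; refcount 0 marks a free slot. */
typedef struct {
    void        *ptr;
    rsrc_dtor_fn dtor;
    int          refcount;
} rsrc_entry;

static rsrc_entry rsrc_list[MAX_RESOURCES];
static int destroyed_count = 0;

/* Sample destructor that only counts how often it was called. */
void count_dtor(void *ptr)
{
    (void)ptr;
    destroyed_count++;
}

/* Stores ptr in the first free slot; returns its ID or -1 if full. */
int rsrc_register(void *ptr, rsrc_dtor_fn dtor)
{
    for (int id = 0; id < MAX_RESOURCES; id++) {
        if (rsrc_list[id].refcount == 0) {
            rsrc_list[id].ptr = ptr;
            rsrc_list[id].dtor = dtor;
            rsrc_list[id].refcount = 1;
            return id;
        }
    }
    return -1;
}

/* Returns the live entry for id, or NULL if id is invalid or freed. */
static rsrc_entry *rsrc_lookup(int id)
{
    if (id < 0 || id >= MAX_RESOURCES || rsrc_list[id].refcount == 0) {
        return NULL;
    }
    return &rsrc_list[id];
}

/* Returns the stored pointer, or NULL if the resource is gone. */
void *rsrc_find(int id)
{
    rsrc_entry *e = rsrc_lookup(id);
    return e != NULL ? e->ptr : NULL;
}

/* Adds a reference; returns the new count or -1 on error. */
int rsrc_addref(int id)
{
    rsrc_entry *e = rsrc_lookup(id);
    return e != NULL ? ++e->refcount : -1;
}

/* Drops a reference; runs the destructor when the count reaches 0. */
int rsrc_delref(int id)
{
    rsrc_entry *e = rsrc_lookup(id);
    if (e == NULL) {
        return -1;
    }
    if (--e->refcount == 0) {
        if (e->dtor != NULL) {
            e->dtor(e->ptr);
        }
        e->ptr = NULL;
        e->dtor = NULL;
    }
    return e->refcount;
}
```

The point of the model is that callers only ever hold an integer ID; the list decides when the destructor runs.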

Macros for Automatic Global Variable Creation

In addition to the macros discussed earlier, a few macros allow
easy creation of simple global variables. These are nice to know
in case you want to introduce global flags, for example. This is
somewhat bad practice, but Table Macros for Global Variable Creation
describes macros that do
exactly this task. They don't need any zval
allocation; you simply have to supply a variable name and value.

Macros for Global Variable Creation

Macro

Description

SET_VAR_STRING(name, value)

Creates a new string.

SET_VAR_STRINGL(name, value, length)

Creates a new string of the specified length. This macro
is faster than SET_VAR_STRING and also binary-safe.

SET_VAR_LONG(name, value)

Creates a new long.

SET_VAR_DOUBLE(name, value)

Creates a new double.

Creating Constants

Zend supports the creation of true constants (as opposed to
regular variables). Constants are accessed without the typical
dollar sign ($) prefix and are available in all
scopes. Examples include TRUE and
FALSE, to name just two.

To create your own constants, you can use the macros in
Macros for Creating Constants.
All the macros create a constant with the specified name and value.

You can also specify flags for each constant:

CONST_CS - This constant's name is to be
treated as case sensitive.

CONST_PERSISTENT - This constant is
persistent and won't be "forgotten" when the current process
carrying this constant shuts down.
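To illustrate how a case-sensitivity flag of this kind can influence lookup, here is a small standalone model. It is a sketch only: the toy_constant table, the TOY_CONST_* flag values, and toy_find_constant() are hypothetical and are not how Zend stores or resolves constants.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Illustrative flag bits, not taken from the Zend headers. */
#define TOY_CONST_CS         (1 << 0)  /* name is case sensitive */
#define TOY_CONST_PERSISTENT (1 << 1)  /* carried along, unused by the lookup */

typedef struct {
    const char *name;
    long        value;
    int         flags;
} toy_constant;

/* Returns 1 if both strings match when case is ignored. */
static int equal_ignoring_case(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b)) {
            return 0;
        }
        a++;
        b++;
    }
    return *a == '\0' && *b == '\0';
}

/* Finds a constant by name, honoring each entry's TOY_CONST_CS flag. */
const toy_constant *toy_find_constant(const toy_constant *table,
                                      size_t count, const char *name)
{
    for (size_t i = 0; i < count; i++) {
        int match = (table[i].flags & TOY_CONST_CS)
            ? strcmp(table[i].name, name) == 0
            : equal_ignoring_case(table[i].name, name);
        if (match) {
            return &table[i];
        }
    }
    return NULL;
}
```

With this model, an entry registered with TOY_CONST_CS is found only under its exact spelling, while an entry without the flag is found under any capitalization.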

There are two types of
macros: REGISTER_*_CONSTANT
and REGISTER_MAIN_*_CONSTANT. The first type
creates constants bound to the current module. These constants are
dumped from the symbol table as soon as the module that registered
the constant is unloaded from memory. The second type creates
constants that remain in the symbol table independently of the
module.

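REGISTER_STRINGL_CONSTANT(name, value, length, flags)
REGISTER_MAIN_STRINGL_CONSTANT(name, value, length, flags)

Registers a new constant of type string. The string length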
is explicitly set to length. The specified string must reside
in Zend's internal memory.

Duplicating Variable Contents: The Copy Constructor

Sooner or later, you may need to assign the contents of one
zval container to another. This is easier said than
done, since the zval container doesn't contain only
type information, but also references to places in Zend's internal
data. For example, depending on their size, arrays and objects may
be nested with lots of hash table entries. By assigning one
zval to another, you avoid duplicating the hash
table entries, using only a reference to them (at most).

To copy this complex kind of data, use the copy
constructor. Copy constructors are typically defined in
languages that support operator overloading, with the express
purpose of copying complex types. If you define an object in such a
language, you have the possibility of overloading the "=" operator,
which is usually responsible for assigning the contents of the
rvalue (result of the evaluation of the right side of the operator)
to the lvalue (same for the left side).

Overloading means assigning a different
meaning to this operator, and is usually used to assign a function
call to an operator. Whenever this operator would be used on such
an object in a program, this function would be called with the
lvalue and rvalue as parameters. Equipped with that information, it
can perform the operation it intends the "=" operator to have
(usually an extended form of copying).

This same form of "extended copying" is also necessary for PHP's
zval containers. Again, in the case of an array,
this extended copying would imply re-creation of all hash table
entries relating to this array. For strings, proper memory
allocation would have to be assured, and so on.

Zend ships with such a function,
called zval_copy_ctor() (the previous PHP equivalent
was pval_copy_constructor()).

A most useful demonstration is a function that accepts a complex type as
argument, modifies it, and then returns the argument:

The first part of the function is plain-vanilla argument retrieval.
After the (left out) modifications, however, it gets interesting:
The container of parameter is assigned to the
(predefined) return_value container. Now, in order
to effectively duplicate its contents, the copy constructor is
called. The copy constructor works directly with the supplied
argument, and the standard return values are
FAILURE on failure and
SUCCESS on success.

If you omit the call to the copy constructor in this example, both
parameter and return_value would
point to the same internal data, meaning that
return_value would be an illegal additional
reference to the same data structures. Whenever changes occurred in
the data that parameter points to,
return_value might be affected. Thus, in order to
create separate copies, the copy constructor must be used.
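The difference between plain assignment and assignment followed by a copy constructor can be shown with a standalone model. This is a sketch, not Zend code: toy_value, toy_init(), toy_copy_ctor(), and toy_dtor() are hypothetical, the container holds only a string, and 0 and -1 stand in for SUCCESS and FAILURE.

```c
#include <stdlib.h>
#include <string.h>

/* Toy container that owns a heap-allocated string. */
typedef struct {
    char  *str;
    size_t len;
} toy_value;

/* Initializes v with a private copy of s; returns 0 or -1. */
int toy_init(toy_value *v, const char *s)
{
    v->len = strlen(s);
    v->str = malloc(v->len + 1);
    if (v->str == NULL) {
        return -1;
    }
    memcpy(v->str, s, v->len + 1);
    return 0;
}

/*
 * "Copy constructor": call it on a container that was just assigned
 * with "dst = src". It works directly on its argument and replaces
 * the shared pointer with a private duplicate. Returns 0 or -1.
 */
int toy_copy_ctor(toy_value *v)
{
    char *dup = malloc(v->len + 1);
    if (dup == NULL) {
        return -1;
    }
    memcpy(dup, v->str, v->len);
    dup[v->len] = '\0';
    v->str = dup;
    return 0;
}

/* Destructor: releases the string owned by the container. */
void toy_dtor(toy_value *v)
{
    free(v->str);
    v->str = NULL;
    v->len = 0;
}
```

After "b = a" both containers share one buffer, so destroying both would free it twice; calling toy_copy_ctor(&b) first gives b its own buffer and makes both destructor calls safe.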

The copy constructor's counterpart in the Zend API, the destructor
zval_dtor(), does the opposite of the
constructor.

Returning Values

Returning values from your functions to PHP was described briefly
in an earlier section; this section gives the details. Return
values are passed via the return_value variable,
which is passed to your functions as argument. The
return_value argument consists of a
zval container (see the earlier discussion of the
call interface) that you can freely modify. The container itself is
already allocated, so you don't have to run
MAKE_STD_ZVAL on it. Instead, you can access its
members directly.

RETURN_STRING(string, duplicate)

Returns a string. The duplicate flag indicates
whether the string should be duplicated using
estrdup().

RETURN_STRINGL(string, length, duplicate)

Returns a string of the specified length; otherwise, behaves
like RETURN_STRING. This macro is faster
and binary-safe, however.

RETURN_EMPTY_STRING()

Returns an empty string.

RETURN_FALSE

Returns Boolean false.

RETURN_TRUE

Returns Boolean true.

Predefined Macros for Setting the Return Value of a Function

Macro

Description

RETVAL_RESOURCE(resource)

Sets the return value to the specified
resource.

RETVAL_BOOL(bool)

Sets the return value to the specified
Boolean value.

RETVAL_NULL

Sets the return value to NULL.

RETVAL_LONG(long)

Sets the return value to the specified long.

RETVAL_DOUBLE(double)

Sets the return value to the specified double.

RETVAL_STRING(string, duplicate)

Sets the return value to the specified string and duplicates
it to Zend internal memory if desired (see also
RETURN_STRING).

RETVAL_STRINGL(string, length, duplicate)

Sets the return value to the specified string and forces the
length to become length (see also
RETVAL_STRING). This macro is faster and
binary-safe, and should be used whenever the string length is
known.

RETVAL_EMPTY_STRING

Sets the return value to an empty string.

RETVAL_FALSE

Sets the return value to Boolean false.

RETVAL_TRUE

Sets the return value to Boolean true.
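The RETURN_* macros set the return value and leave the function at once, whereas the RETVAL_* macros only set the value and let execution continue. The following standalone sketch models that difference with hypothetical TOY_* macros and a toy_zval structure; none of these names come from the Zend headers.

```c
/* Toy stand-in for a zval container. */
typedef enum { TOY_NULL, TOY_LONG, TOY_BOOL } toy_type;

typedef struct {
    toy_type type;
    long     lval;
} toy_zval;

/* RETVAL-style macros: only fill in return_value. */
#define TOY_RETVAL_LONG(l) \
    do { return_value->type = TOY_LONG; return_value->lval = (l); } while (0)
#define TOY_RETVAL_FALSE \
    do { return_value->type = TOY_BOOL; return_value->lval = 0; } while (0)

/* RETURN-style macros: fill in return_value, then return. */
#define TOY_RETURN_LONG(l) do { TOY_RETVAL_LONG(l); return; } while (0)
#define TOY_RETURN_FALSE   do { TOY_RETVAL_FALSE; return; } while (0)

/* Counts how often the code after TOY_RETVAL_LONG was reached. */
static int cleanup_ran = 0;

/* Divides a by b; yields Boolean false for a zero divisor. */
void safe_divide(long a, long b, toy_zval *return_value)
{
    if (b == 0) {
        TOY_RETURN_FALSE;        /* sets the value and leaves at once */
    }
    TOY_RETVAL_LONG(a / b);      /* sets the value, execution continues */
    cleanup_ran++;               /* work done after setting the value */
}
```

The RETVAL form is the one to reach for when cleanup still has to happen after the result is known.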

Complex types such as arrays and objects can be returned by using
array_init() and
object_init(), as well as the corresponding hash
functions on return_value. Since these types cannot
be constructed of trivial information, there are no predefined
macros for them.

Printing Information

Often it's necessary to print messages to the output stream from
your module, just as print would be used
within a script. PHP offers functions for most generic tasks, such
as printing warning messages, generating output for
phpinfo(), and so on. The following sections
provide more details. Examples of these functions can be found on
the CD-ROM.

zend_printf()

zend_printf() works like the
standard printf(), except that it prints to Zend's
output stream.

zend_error()

zend_error() can be used to generate error messages.
This function accepts two arguments; the first is the error type (see
zend_errors.h), and the second is the error message.

zend_error(E_WARNING, "This function has been called with empty arguments");

The table Zend's Predefined Error Messages shows a list
of possible values (see below). These
values are also referred to in php.ini. Depending on
which error type you choose, your messages will be logged.

Zend's Predefined Error Messages.

Error

Description

E_ERROR

Signals an error and terminates execution of the script
immediately.

E_WARNING

Signals a generic warning. Execution continues.

E_PARSE

Signals a parser error. Execution continues.

E_NOTICE

Signals a notice. Execution continues. Note that by
default the display of this type of error message is turned off in
php.ini.

E_CORE_ERROR

Internal error by the core; shouldn't be used by
user-written modules.

E_COMPILE_ERROR

Internal error by the compiler; shouldn't be used by
user-written modules.

E_COMPILE_WARNING

Internal warning by the compiler; shouldn't be used by
user-written modules.

After creating a real module, you'll want to show information
about the module in phpinfo() (in addition to the
module name, which appears in the module list by default). PHP allows
you to create your own section in the phpinfo() output with the ZEND_MINFO() function. This function
should be placed in the module descriptor block (discussed earlier) and is
always called whenever a script calls phpinfo().

PHP automatically prints a section
in phpinfo() for you if you specify the ZEND_MINFO
function, including the module name in the heading. Everything else must be
formatted and printed by you.

Typically, you can print an HTML table header
using php_info_print_table_start() and then use the standard
functions php_info_print_table_header()
and php_info_print_table_row(). As arguments, both take the number of
columns (as an integer) followed by the column contents (as strings).
The example Source code and screenshot for output in phpinfo shows a
source example and its output. To print the table footer, use
php_info_print_table_end().
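The calling convention of a column count followed by that many strings is an ordinary C variadic interface. As a sketch of how such a function can consume its arguments, the following standalone code formats one row into a caller-supplied buffer. toy_print_table_row() and its buffer parameters are hypothetical; the real PHP functions write to the output stream and produce their own markup.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/*
 * Formats one table row as HTML into buf. num_cols says how many
 * string arguments follow. Returns 0 on success, -1 if buf is too small.
 */
int toy_print_table_row(char *buf, size_t size, int num_cols, ...)
{
    va_list args;
    size_t used;
    int n;

    n = snprintf(buf, size, "<tr>");
    if (n < 0 || (size_t)n >= size) {
        return -1;
    }
    used = (size_t)n;

    va_start(args, num_cols);
    for (int i = 0; i < num_cols; i++) {
        const char *col = va_arg(args, const char *);
        n = snprintf(buf + used, size - used, "<td>%s</td>", col);
        if (n < 0 || (size_t)n >= size - used) {
            va_end(args);
            return -1;
        }
        used += (size_t)n;
    }
    va_end(args);

    n = snprintf(buf + used, size - used, "</tr>");
    if (n < 0 || (size_t)n >= size - used) {
        return -1;
    }
    return 0;
}
```

Because the compiler cannot check variadic arguments, the column count must match the number of strings actually passed.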

Execution Information

You can also print execution information, such as the current file
being executed. The name of the function currently being executed
can be retrieved using the function
get_active_function_name(). This function
returns a pointer to the function name and doesn't accept any
arguments apart from the TSRMLS_C macro. To retrieve the name of the file currently being
executed, use zend_get_executed_filename().
This function accesses the executor globals, which are passed to
it using the TSRMLS_C macro. The executor globals
are automatically available to every function that's called
directly by Zend (they're part of the
INTERNAL_FUNCTION_PARAMETERS described earlier
in this chapter). If you want to access the executor globals in
another function that doesn't have them available automatically,
call the macro TSRMLS_FETCH() once in that
function; this will introduce them to your local scope.

Finally, the line number currently being executed can be retrieved
using the function zend_get_executed_lineno().
This function also requires the executor globals as arguments. For
examples of these functions, see the example Printing execution information.

Example #14 Printing execution information.

zend_printf("The name of the current function is %s<br>", get_active_function_name(TSRMLS_C));
zend_printf("The file currently executed is %s<br>", zend_get_executed_filename(TSRMLS_C));
zend_printf("The current line being executed is %i<br>", zend_get_executed_lineno(TSRMLS_C));

Startup and Shutdown Functions

Startup and shutdown functions can be used for one-time
initialization and deinitialization of your modules. As discussed
earlier in this chapter (see the description of the Zend module
descriptor block), there are module and request startup
and shutdown events.

The module startup and shutdown functions are called whenever a
module is loaded and needs initialization; the request startup and
shutdown functions are called every time a request is processed
(meaning that a file is being executed).
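The relationship between the two kinds of events can be pictured with a small standalone model of a process that loads a module once and then serves several requests. This is a sketch only: toy_lifecycle, the toy_* hooks, and toy_serve() are hypothetical and simply count how often each hook would run.

```c
/* Counters for each of the four lifecycle events. */
typedef struct {
    int module_startups;
    int module_shutdowns;
    int request_startups;
    int request_shutdowns;
} toy_lifecycle;

void toy_module_startup(toy_lifecycle *lc)   { lc->module_startups++; }
void toy_module_shutdown(toy_lifecycle *lc)  { lc->module_shutdowns++; }
void toy_request_startup(toy_lifecycle *lc)  { lc->request_startups++; }
void toy_request_shutdown(toy_lifecycle *lc) { lc->request_shutdowns++; }

/* Simulates loading the module once and serving num_requests requests. */
void toy_serve(toy_lifecycle *lc, int num_requests)
{
    toy_module_startup(lc);              /* once, when the module loads */
    for (int i = 0; i < num_requests; i++) {
        toy_request_startup(lc);         /* before each script runs */
        /* ... the script would execute here ... */
        toy_request_shutdown(lc);        /* after each script finishes */
    }
    toy_module_shutdown(lc);             /* once, when the module unloads */
}
```

One-time setup therefore belongs in the module hooks, and per-script state belongs in the request hooks.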

For dynamic extensions, module and request startup/shutdown events
happen at the same time.

Declaration and implementation of these functions can be done with
macros; see the earlier section "Declaration of the Zend Module
Block" for details.

Calling User Functions

You can call user functions from your own modules, which is very
handy when implementing callbacks; for example, for array walking, searching, or
simply for event-based programs.

User functions can be called with the
function call_user_function_ex(). It requires a pointer to the function
table (a hash table) you want to access, a pointer to an object (if you want
to call a method), the function name, a pointer to the return value, the number of arguments,
the argument array, and a flag indicating whether you want to perform zval separation.

Note that you don't have to specify both
function_table and object; either
will do. If you want to call a method, you have to supply the
object that contains this method, in which case
call_user_function_ex() automatically sets the
function table to this object's function table. Otherwise, you only
need to specify function_table and can set
object to NULL.

Usually, the default function table is the "root" function table
containing all function entries. This function table is part of the
compiler globals and can be accessed using the macro
CG. To introduce the compiler globals to your
function, call the macro TSRMLS_FETCH once.

The function name is specified in a zval
container. This might be a bit surprising at first, but is quite a
logical step, since most of the time you'll accept function names
as parameters from calling functions within your script, which in
turn are contained in zval containers again. Thus,
you only have to pass your arguments through to this function. This
zval must be of type IS_STRING.

The next argument consists of a pointer to the return value. You
don't have to allocate memory for this container; the function will
do so by itself. However, you have to destroy this container (using
zval_dtor()) afterward!

Next is the parameter count as an integer and an array containing all
necessary parameters. The last argument is a flag that controls
zval separation; it should always be set
to 0. If set to 1, the
function consumes less memory but fails if any of the parameters
need separation.
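The overall shape of such a call (look the name up in a function table, hand over a count and an array of arguments, receive the result through a pointer) can be sketched in standalone C. Everything here is hypothetical and much simpler than the Zend version: toy_function_entry, toy_function_table, toy_sum(), toy_max(), and toy_call_user_function() are illustrative names, arguments are plain long values rather than zval containers, and there is no object or separation handling.

```c
#include <stddef.h>
#include <string.h>

/* A callable: gets argument count and array, stores its result in retval. */
typedef int (*toy_callback)(int argc, const long *argv, long *retval);

typedef struct {
    const char  *name;
    toy_callback fn;
} toy_function_entry;

/* Sample callable: sum of all arguments. */
static int toy_sum(int argc, const long *argv, long *retval)
{
    long total = 0;
    for (int i = 0; i < argc; i++) {
        total += argv[i];
    }
    *retval = total;
    return 0;
}

/* Sample callable: largest argument; fails without arguments. */
static int toy_max(int argc, const long *argv, long *retval)
{
    if (argc < 1) {
        return -1;
    }
    long best = argv[0];
    for (int i = 1; i < argc; i++) {
        if (argv[i] > best) {
            best = argv[i];
        }
    }
    *retval = best;
    return 0;
}

/* Toy "root" function table, terminated by a NULL name. */
static const toy_function_entry toy_function_table[] = {
    { "sum", toy_sum },
    { "max", toy_max },
    { NULL,  NULL }
};

/* Looks up name in table and calls it; returns 0 on success, -1 otherwise. */
int toy_call_user_function(const toy_function_entry *table, const char *name,
                           long *retval, int argc, const long *argv)
{
    for (; table->name != NULL; table++) {
        if (strcmp(table->name, name) == 0) {
            return table->fn(argc, argv, retval);
        }
    }
    return -1;  /* no such function */
}
```

As with the real call, the caller must check the status code before trusting the value written through the return pointer.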

The example Calling user functions shows a small demonstration of
calling a user function. The code calls a function that's supplied
to it as an argument and directly passes this function's return value
through as its own return value. Note the use of the constructor
and destructor calls at the end: it might not be necessary to do
it this way here (since they should be separate values, the
assignment might be safe), but this is bulletproof.

To create an .ini section in your own module, use the
macros PHP_INI_BEGIN() to mark the beginning of such a
section and PHP_INI_END() to mark its end. In between you can
use PHP_INI_ENTRY() to create entries.

The PHP_INI_ENTRY() macro accepts four
parameters: the entry name, the entry value, its change permissions, and a
pointer to a change-notification handler. Both entry name and value must be
specified as strings, regardless of whether they really are strings or
integers.

The permissions are grouped into three
sections: PHP_INI_SYSTEM allows a change only directly in
the php.ini file; PHP_INI_USER allows
a change to be overridden by a user at runtime using additional
configuration files, such as .htaccess; and PHP_INI_ALL allows
changes to be made without restrictions. There's also a fourth level,
PHP_INI_PERDIR, whose behavior we have not yet been able to
verify.

The fourth parameter consists of a pointer to a change-notification
handler. Whenever one of these initialization entries is changed, this handler
is called. Such a handler can be declared using the
PHP_INI_MH macro:

All these definitions can be found
in php_ini.h. Your message handler will have access to a
structure that contains the full entry, the new value, its length, and three
optional arguments. These optional arguments can be specified with the additional
macros PHP_INI_ENTRY1 (allowing one additional
argument), PHP_INI_ENTRY2 (allowing two additional arguments),
and PHP_INI_ENTRY3 (allowing three additional
arguments).

The change-notification handlers should be used to cache initialization
entries locally for faster access or to perform certain tasks that are
required if a value changes. For example, if a constant connection to a
certain host is required by a module and someone changes the hostname,
the handler can automatically terminate the old connection and attempt a new one.
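The interplay of change permissions and a change-notification handler can be sketched with a standalone model. It is not the PHP implementation: toy_ini_entry, the TOY_INI_* bit values, on_change_host(), and toy_ini_set() are hypothetical, and "reconnecting" is reduced to updating a cached host name and a counter.

```c
#include <stdio.h>
#include <string.h>

/* Illustrative permission bits for the stage at which a change is made. */
#define TOY_INI_USER   (1 << 0)
#define TOY_INI_PERDIR (1 << 1)
#define TOY_INI_SYSTEM (1 << 2)
#define TOY_INI_ALL    (TOY_INI_USER | TOY_INI_PERDIR | TOY_INI_SYSTEM)

typedef struct toy_ini_entry toy_ini_entry;
typedef int (*toy_ini_handler)(toy_ini_entry *entry, const char *new_value);

struct toy_ini_entry {
    const char     *name;
    char            value[64];   /* values are always stored as strings */
    int             modifiable;  /* bitmask of stages allowed to change it */
    toy_ini_handler on_change;   /* may be NULL */
};

/* Locally cached copy of the setting, plus a reconnect counter. */
static char cached_host[64] = "localhost";
static int  reconnects = 0;

/* Change handler: rejects empty names, otherwise caches and "reconnects". */
static int on_change_host(toy_ini_entry *entry, const char *new_value)
{
    (void)entry;
    if (new_value[0] == '\0') {
        return -1;
    }
    snprintf(cached_host, sizeof cached_host, "%s", new_value);
    reconnects++;
    return 0;
}

/* Applies a change if the stage is permitted and the handler accepts it. */
int toy_ini_set(toy_ini_entry *entry, const char *new_value, int stage)
{
    if ((entry->modifiable & stage) == 0) {
        return -1;  /* change not allowed at this stage */
    }
    if (entry->on_change != NULL && entry->on_change(entry, new_value) != 0) {
        return -1;  /* handler vetoed the new value */
    }
    snprintf(entry->value, sizeof entry->value, "%s", new_value);
    return 0;
}
```

Letting the handler veto a value before it is stored keeps the cached copy and the entry itself from drifting apart.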

Where to Go from Here

You've learned a lot about PHP. You now know how to create
dynamically loadable modules and statically linked extensions. You've
learned how PHP and Zend deal with internal storage of variables and how you
can create and access these variables. You know a useful set of utility functions
that handle routine tasks such as printing informational texts,
automatically introducing variables to the symbol table, and so on.

Even though this chapter often had a mostly "referential" character, we
hope that it gave you insight on how to start writing your own extensions.
For the sake of space, we had to leave out a lot; we suggest that you take the time to
study the header files and some modules (especially the ones in the
ext/standard directory and the MySQL module, as these
implement commonly known functionality). This will give you an idea of how
other people have used the API functions - particularly those that didn't make it into
this chapter.

Reference: Some Configuration Macros

config.m4

The file config.m4 is processed by
buildconf and must contain all the instructions to be
executed during configuration. For example, these can include tests for required
external files, such as header files, libraries, and so on. PHP defines a set of macros
that can be used in this process, the most useful of which are described in
M4 Macros for config.m4.

M4 Macros for config.m4

Macro

Description

AC_MSG_CHECKING(message)

Prints a "checking <message>" text
during configure.

AC_MSG_RESULT(value)

Gives the result to AC_MSG_CHECKING;
should specify either yes or no as value.

AC_MSG_ERROR(message)

Prints message as error message
during configure and aborts the script.

AC_DEFINE(name,value,description)

Adds a #define to php_config.h with the value of
value and a comment that says description (this
is useful for conditional compilation of your module).

AC_ADD_INCLUDE(path)

Adds a compiler include path; for example, used if the
module needs to add search paths for header files.

AC_ADD_LIBRARY_WITH_PATH(libraryname,librarypath)

Specifies an additional library to link.

AC_ARG_WITH(modulename,description,unconditionaltest,conditionaltest)

Quite a powerful macro, adding the
module with description to the
configure --help output. PHP checks
whether the option
--with-<modulename> is given to the
configure script. If so, it runs the
script unconditionaltest (for example,
--with-myext=yes), in which case the value
of the option is contained in the variable
$withval. Otherwise, it executes
conditionaltest.

PHP_EXTENSION(modulename, [shared])

You must call this macro for PHP
to configure your extension. You can supply a second argument
in addition to your module name, indicating whether you intend compilation as a
shared module. This will result in a definition at compile time for your
source as COMPILE_DL_<modulename>.