Building XMLgawk (old)

May 2013:
XMLgawk has not been updated since October 2012; it has been succeeded by the gawk extension libraries.
Consequently, I stopped developing XMLgawk for Windows and continued with the gawk extension libraries for Windows.

Intro

Building XMLgawk for Windows needs to be done in a MinGW/MSYS environment. For those who do not have such an environment, we will show how to set one up. Besides the standard MinGW environment, building XMLgawk requires the Expat and iconv libraries. In addition, gawk can use the functionality of the sigsegv library. We will show how to install these libraries below.

Gawk has built-in functionality to load extensions dynamically. The XML extension is clearly meant to use this functionality, so our aim is to create a dynamic link library for the extension. It appears that the supplied source files for “pc” are not yet prepared for that. Most modifications proposed on this page concern the connection between the gawk executable and the XML extension, and loading the extension dynamically. We will show a number of modifications to the gawk source, to the source for the XML extension and to the build script of the extension. We will go through all required modifications in the sections Building gawk.exe and Build the XML extension.

All modifications to the original XMLgawk source files can be found in the Downloads section.

Preparing the environment

Standard MinGW install

We used the TDM-GCC build of MinGW to set up our MinGW environment, see the TDM site. We created the directory c:\Programs\MinGW and installed there. By installing the package we obtain:

together with the associated dev packages. The archives should be unpacked to the MinGW directory. These libraries are needed according to gcc-4.5.0-1-mingw32.RELEASE_NOTES-1.txt. You can obtain the files from the MinGW archive at Sourceforge.

Gawk can use the functionality of the sigsegv library. The library signals the application (gawk in our case) when it makes an invalid memory reference. If the library is available, the functions of sigsegv are statically built into the gawk executable.

If the sigsegv functionality is wanted, building gawk needs the library files of sigsegv. Source files for the sigsegv library are available in the xgawk package, but I was not able to build the library files from them. I therefore looked for a source archive that included more recent files for sigsegv. Gawk-3.1.7 turned out to be suitable.

We download the archive gawk-3.1.7.tar.gz, which is available at GNU gawk. Unpack the archive to a suitable directory, for example c:\Programs\gawk-3.1.7. We start MSYS and cd to gawk-3.1.7\libsigsegv. We then give the commands:

./configure --host=i386-pc-mingw32
make
make check

The last command should give:

==================
All 5 tests passed
==================

A successful build gives us the header file sigsegv.h and the library files:

libsigsegv.a, libsigsegv.la and libsigsegv.lai

The header file is created in gawk-3.1.7\libsigsegv\src; we copy it to MinGW\include. The library files are in gawk-3.1.7\libsigsegv\src\.libs; we copy them to MinGW\lib.
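The two copy steps can be sketched as a small shell function for the MSYS prompt. This is only a sketch: the function name install_sigsegv is hypothetical, and the two arguments (the libsigsegv src directory and the MinGW root) are assumptions based on the directories mentioned above.

```shell
# Hypothetical helper: copy the sigsegv header and libraries into a MinGW tree.
# $1 = libsigsegv src directory, $2 = MinGW root (both are assumptions; adjust).
install_sigsegv() {
    src_dir=$1
    mingw_dir=$2
    # The header goes to the include directory.
    cp "$src_dir/sigsegv.h" "$mingw_dir/include/" || return 1
    # The three library files go to the lib directory.
    for lib in libsigsegv.a libsigsegv.la libsigsegv.lai; do
        cp "$src_dir/.libs/$lib" "$mingw_dir/lib/" || return 1
    done
}
```

With the directories used on this page, and assuming MSYS maps drive c: to /c, the call would be something like: install_sigsegv /c/Programs/gawk-3.1.7/libsigsegv/src /c/Programs/MinGW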

Building gawk.exe

The source files for XMLgawk can be obtained from the XMLgawk home page, current link “second release candidate”. We obtain the archive xgawk-3.1.6a-20090408.tar.gz. We unpack the xgawk archive to C:\Programs\xgawk-3.1.6.

Edit pc\Makefile

We change prefix = c:/gnu to prefix = c:/Programs/MinGW.

In accordance with the instructions in the file README-d\README.pcdynamic, particularly the part after “—”, we make the following changes to obtain dynamic linking:

Finally, we removed “xml_puller$O” from the macros AWKOBJS1 and PAWKOBJS1, because we do not want the xml_puller extension to be statically linked into gawk.exe. The modified macro definitions become:

We insert this code after the definition of DEFPATH. It fills an omission in the code for pc: without this addition, deflibpath is not defined. We add the definition to gawkmisc.pc, in agreement with the posix variant, where the variable is defined in posix\gawkmisc.c.

Edit pc\config.h

Add the definition of SHLIBEXT at the end of config.h:

#define SHLIBEXT "dll"
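For those who prefer to make this edit from the MSYS prompt, a minimal sketch follows. The helper name add_shlibext is hypothetical; it appends the definition only when config.h does not contain it yet, so it can be run more than once.

```shell
# Hypothetical helper: append the SHLIBEXT definition unless it is already there.
# $1 = path to config.h (for example pc/config.h).
add_shlibext() {
    config=$1
    if grep -q '^#define SHLIBEXT' "$config"; then
        return 0    # already defined, leave the file alone
    fi
    printf '\n#define SHLIBEXT "dll"\n' >> "$config"
}
```

From the xgawk directory the call would be: add_shlibext pc/config.h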

Copy “pc files” and build

Copy the files (except Changelog) from the pc subdirectory to the xgawk directory. Run MSYS, go to the xgawk directory and run

make mingw32

This gives gawk.exe and pgawk.exe. The file Build log gawk shows the output generated by make.
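The copy step above can also be done from the MSYS prompt. The following is a sketch under assumptions: the helper name copy_pc_files is hypothetical, and the change log is matched as either Changelog or ChangeLog because the exact spelling in the archive is not checked here.

```shell
# Hypothetical helper: copy every regular file from the pc directory
# into the top-level source directory, skipping the change log.
# $1 = pc directory, $2 = destination directory.
copy_pc_files() {
    pc_dir=$1
    dest_dir=$2
    for f in "$pc_dir"/*; do
        [ -f "$f" ] || continue             # skip anything that is not a regular file
        case "${f##*/}" in
            [Cc]hange[Ll]og) continue ;;    # matches Changelog and ChangeLog
        esac
        cp "$f" "$dest_dir/" || return 1
    done
}
```

From the xgawk directory this would be used as: copy_pc_files pc . && make mingw32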

Build the XML extension

As said in the intro, our aim is to build a dynamic link library. This is not straightforward, but fortunately the file README-d\README.pcdynamic comes to the rescue, particularly the part after “—”. The source files for the extension are in the subdirectory extension. We cd to this subdirectory.

Edit extension\Makefile

We need to create a specific makefile to build the XML extension. As a starting point I used the supplied makefile Makefile.pc and copied it to Makefile.
Since the XML extension uses the Expat and iconv libraries, we need to link to these. We therefore change the line

MWLDFLAGS=-s -Wl,--enable-stdcall-fixup -L.. -lgawk

to

MWLDFLAGS=-s -Wl,--enable-stdcall-fixup -L.. -lgawk -lexpat -liconv
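The same change to MWLDFLAGS can be applied with sed instead of an editor. This is a sketch only: the helper name add_xml_libs is hypothetical, and it assumes the makefile contains a single MWLDFLAGS line that ends in -lgawk, as quoted above.

```shell
# Hypothetical helper: append the Expat and iconv libraries to MWLDFLAGS.
# $1 = makefile to edit (for example extension/Makefile).
add_xml_libs() {
    makefile=$1
    # Only a line that starts with MWLDFLAGS= and ends in -lgawk is changed,
    # so running the helper a second time has no effect.
    sed 's/^\(MWLDFLAGS=.*-lgawk\)[[:space:]]*$/\1 -lexpat -liconv/' \
        "$makefile" > "$makefile.tmp" && mv "$makefile.tmp" "$makefile"
}
```

The other makefile changes mentioned on this page are not covered by this helper.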

A number of other changes are needed to the makefile. The modified file is given here.

Run extension\xml-conv-enc

We start MSYS, cd to extension and run xml-conv-enc. This creates the files xml_enc_registry.inc and xml_enc_tables.inc. These are required by xml_enc_handler.c.
(xml_enc_registry.inc also provides the required array encs[].)

Edit extension\xml_interface.c

The header langinfo.h is used by xml_interface.c. As far as I know, MinGW does not provide langinfo.h. As a workaround we comment out the line #include <langinfo.h> and change the line char *charset = nl_langinfo(CODESET); to

char *charset = "";

This seems quite harmless, as the possibility is also mentioned by the author of xml_interface.c.

First build of the xml extension

From the README-d\README.pcdynamic file quoted earlier, we can expect that building the extension will give some errors the first time. This is because the extension needs a number of functions and variables from gawk.exe that are not yet exported by gawk. Indeed running

The variables and functions that give errors need to be exported by gawk. This can be accomplished via the gawkw32.def file, as explained in README-d\README.pcdynamic.
We therefore edit gawkw32.def and add the variables and functions to be exported. The ones needed are obtained from the errors during a first compilation of xml.dll. We add

This is probably not very useful on other systems than mine, I am afraid.
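Since the names to export are taken from the linker errors of the first build, it can help to collect them from a saved build log. The sketch below assumes that the output of make was redirected to a file and that the errors have the usual GNU ld form "undefined reference to" followed by the quoted symbol name. The helper name list_undefined and the log file name are hypothetical.

```shell
# Hypothetical helper: list the unique symbol names from GNU ld
# "undefined reference to" messages in a saved build log.
# $1 = build log file.
list_undefined() {
    grep 'undefined reference to' "$1" |
        sed 's/.*undefined reference to [^A-Za-z_]*\([A-Za-z_][A-Za-z0-9_]*\).*/\1/' |
        sort -u
}
```

For example, after a build whose output was saved to build.log, the call list_undefined build.log prints one symbol per line. The names still have to be added to gawkw32.def by hand.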

Downloads

In the archive below you will find the modified source files for XMLgawk, containing all modifications described on this page. The files are meant as replacements for the corresponding source files from xgawk-3.1.6a-20090408.tar.gz as found on the XMLgawk home page. Hence, the archive needs to be unpacked to the same directory where you unpacked the original XMLgawk archive, overwriting the corresponding original files.