Announcing IBM XML4C2

[Dean Roddey]
Ok, now I can actually talk about it...
The new XML4C2 XML parser is now up on Alphaworks. This is its first public
viewing, so bear that in mind; however, it's actually quite far along since
it draws on a good bit of previous experience. It is an implementation fully
optimized for C++ which just happens to mimic the public APIs of the Java
version (the 2.x version, I mean) quite closely, making it pretty easy to
move your experience between the two. It has SAX and DOM APIs, as well as an
internal event API out of the scanner which can also be used if you really
need lossless information. It handles lots of encodings, using IBM's ICU
subsystem.
Like the Java version, flexibility and scalability have been favored over
ultra-blazing speed, but it's still a quite reasonable performer.
Particularly in the e-business area, it is far quicker for the fast
up/parse/down cycles needed by DB stored procedures and other
server-oriented processing of small transaction-type documents. This version
only supports file:// URLs, with either no host or 'localhost'. That will be
fixed in an upcoming version.
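
For anyone wanting to pre-check system IDs before handing them to the parser, the file:// restriction above can be sketched roughly as follows. This is only an illustration of the stated rule (file scheme, host either empty or 'localhost'); isSupportedFileUrl is a hypothetical helper name and is not part of the XML4C2 API, and the parser's real URL handling may differ in details.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (NOT an XML4C2 function): returns true when the URL
// uses the file:// scheme and its host part is empty or "localhost".
bool isSupportedFileUrl(const std::string& url)
{
    const std::string scheme = "file://";
    if (url.size() < scheme.size())
        return false;

    // Compare the scheme case-insensitively.
    for (std::size_t i = 0; i < scheme.size(); ++i)
    {
        const unsigned char c = static_cast<unsigned char>(url[i]);
        if (std::tolower(c) != scheme[i])
            return false;
    }

    // The host is whatever sits between "file://" and the next '/'.
    const std::size_t hostEnd = url.find('/', scheme.size());
    if (hostEnd == std::string::npos)
        return false; // no path component at all

    std::string host = url.substr(scheme.size(), hostEnd - scheme.size());
    std::transform(host.begin(), host.end(), host.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    return host.empty() || host == "localhost";
}
```

Under this sketch, file:///tmp/doc.xml and file://localhost/tmp/doc.xml pass, while anything naming another host or another scheme is rejected.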
Right now it has pretty much the same license as the Java version and the
source code is not available; however, keep an eye out over the next few
weeks for a potentially important change in that area.
So please check it out and provide comments to the address provided in the
docs. We know that there are some problems, but we still want to hear any
feedback since it's always possible that we don't know about some particular
issue.
It's at: www.alphaworks.ibm.com and should be in the "new stuff" area for a
while since it's just arrived there.
----
Date: Wed, 5 May 1999 11:20:53 -0600
From: roddey@us.ibm.com
To: xml-dev@ic.ac.uk
Subject: re: Dreadfully tedious questions
>
>>Expat's been well tested. SP has been even better tested, and unlike
>>Expat, it supports DTD validation; however, SP has a basically
>>undocumented and extremely complicated interface, and it's really a
>>full SGML parser.
>>
>>IBM has a brand-new parser, xml4c++ (I think), at alphaworks.ibm.com.
>>This hasn't had the field testing that Expat and SP have had, but it
>>looks promising.
>
>xml4c++'s disclaimers ("this software sucks" is the general drift, sort of
>like the mozilla disclaimers) are pretty scary. Also, how big is the
>redistributable DLL? (I'm not going to sign their license unless there's a
>chance it'll be small enough to be practical in a plug-in)
>
Just to protect my honor here...
The intent of any disclaimers is not to say that it sucks by any means. This is
the very first public release, so you can obviously expect it to be a little
less stable and mature than something that's been out for years. But it's
certainly not Suckware at all. It's actually quite good, since it draws on three
previous parser efforts, and it will definitely get better.
As to the DLL size, that is partly temporary. We depend upon the ICU (IBM
internationalization classes) for our transcoding services. They have not been
ready to release that as a separate product so far, so we physically embedded it
into our stuff. However, very soon they will do so and we will split that out.
They will also be better layering their stuff so that we only have to get the
lowest level parts of it, which are all we need. This will reduce our footprint
as well. And, once the ICU is split out, it will be exposed to the client code
so that they can actually do many useful things with this Unicode text that we
are spitting at them. Right now, we don't expose that ICU interface so you have
to provide the tools to do useful stuff with the resulting XML text in its
Unicode format.
So, on both fronts it will be improving. So judge it on its potential (and its
very flexible and extensible architecture gives it a lot of potential) for what
it will be able to do soon. Give us a couple of Alphaworks releases and we will
get it much cleaner and more conformant.

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN
981-02-3594-1
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;