March 1998 Archives

How to Make Perl The Language of Choice for XML

Perl has long been the language of choice for anyone doing serious
text processing. Now efforts are underway to make Perl the
language of choice for those doing "structured" text processing
using the Extensible Markup Language (XML).

The XML 1.0 specification was recently (Feb. 10, 1998) released as a
recommendation by the World Wide Web Consortium. XML is a subset of
SGML (Standard Generalized Markup Language), and it seems to be
emerging as a universal syntax for defining non-proprietary document
markup and data formats. XML made significant changes to SGML to
reflect the nature of the Web and to make it easier to build tools
that process XML.

Tim Bray, co-editor of the XML 1.0 specification, has used Perl extensively
for huge text processing applications. He had a special interest
in seeing a bridge built from Perl to XML -- one that would make
it simple for programmers to process XML data. Out of
this interest, a small group of developers met at O'Reilly
& Associates in Sebastopol, California for a one-day Perl/XML
summit. In addition to Tim, those attending the summit were:

"In the design of XML, we were continuously mindful of the need to
enable the fast, efficient creation of scripts and programs for
processing XML," says Tim Bray.

"For many of us in the XML effort, the most important goal is to
increase the proportion of the world's documents stored in open,
non-proprietary formats," Bray continues. "Building slick XML
processing into Perl makes the use of such formats more rewarding and
helps frustrate the efforts of those who would imprison human knowledge
behind the barbed-wire of proprietary file formats."

"XML is currently perceived as powerful and important, but not
particularly easy," explains Larry Wall. "This makes XML and Perl
naturally complementary, since Perl is a language that makes easy
things easy to do, and hard things possible."

One of the first steps that the summit group identified was to get
Perl working with Unicode (ISO 10646). Unicode allows text in
virtually any of the world's written languages to be represented in
a single character set, and XML requires Unicode. Larry Wall will
lead the team working on this task.

The next step is figuring out at what level Perl actually provides
support for XML: whether it's built into the language, implemented
as a module, or as a combination of both.

Among the goals set at the meeting were:

Release a Perl/XML spec in Q3, 1998.

Establish a mailing list for discussion of Perl and XML developer
issues.

Develop an XML white paper, co-authored by Larry Wall and Tim Bray,
to be released this spring.

If you are interested in these efforts, you might want to attend
the Perl and XML session at the XML Developer's Day in Seattle at
The XML Conference on March 27, 1998. Larry Wall and Dick Hardt will
be speaking.
The Perl Developer Update will also keep track of new announcements.