High Performance GML to SVG Transformation for the Visual Presentation of
Geographic Data in Web-Based Mapping Systems

Kenneth S. Herdy

Graduate Student, Simon Fraser University, School of Computing Science

Surrey, 250-13450 102nd Avenue, V3T 0A3, Canada

Kenneth S. Herdy completed an Advanced Diploma of Technology in
Geographical Information Systems at the British Columbia Institute of
Technology in 2003 and earned a Bachelor of Science in Computing Science
with a Certificate in Spatial Information Systems at Simon Fraser University
in 2005. He is currently pursuing graduate studies in Computing Science at
Simon Fraser University with industrial scholarship support from the Natural
Sciences and Engineering Research Council of Canada, the Mathematics of
Information Technology and Complex Systems NCE, and the BC Innovation
Council. His research focus is an analysis of the principal techniques that
may be used to improve XML processing performance in the context of the
Geography Markup Language (GML).

David S. Burggraf

Director of Research and Development, Galdos Systems Inc., Research and Development

Vancouver, Suite 1300-409 Granville Street, V6C 1T2, Canada

David S. Burggraf earned his Ph.D. at the University of British Columbia
(Mathematics) in 2003. He is presently the Director of Research and
Development at Galdos Systems Inc. with research interests in the
application of semantic, mathematical and spatial object modeling
techniques to distributed geographic information systems. David is an
active member of the Open Geospatial Consortium (OGC) and is a key
contributor to the development of GeoWeb standards such as OGC KML,
GML and GMLJP2.

Robert D. Cameron

Robert D. Cameron earned his Ph.D. from the University of British Columbia
in 1983. He is presently a Professor of Computing Science at Simon Fraser
University with research interests in programming languages, software
engineering, data compression, digital libraries and sociotechnical design
of public computing infrastructure. Since completing a six-year term as
Associate Dean of Applied Sciences in 2004, his research focus has been on
high-performance XML processing using the SIMD capabilities of commodity
processors.

Abstract

A performance study considering several alternatives for the visualization of
geographic information (GI) in on-demand web-based mapping systems focused primarily
on server-side generation of SVG from data encoded in Geography Markup Language
(GML). In that context, a detailed performance comparison of GML to SVG
transformation using several XML transformation technologies was carried out,
including Java-based XSLT technologies, direct SAX implementations in both Java and
C++, as well as high performance implementations using the recently released Intel
XML Software Suite 1.0 and the new high performance XML technology based on parallel
bit streams (Parabix). Other alternatives considered in the traditional 3-tier architecture for on-demand
web-based mapping systems include the use of data parallel, AJAX-based
architectures, as an alternative to traditional multi-threaded, server-side
approaches in the generation of SVG map layers. The possibilities of using streaming
SVG technologies and progressive rendering to reduce latency are also investigated.

Whereas XSLT technologies were found to be competitive with direct SAX-based
implementations, performing within a factor of two, implementations using the
high-performance Parabix framework offered the best performance, with a speed-up
over worst-case XSLT of well over an order of magnitude. This
performance improvement is primarily due to the reduction of XML parsing cost using
parallel bit stream technology, but care is required in using the framework in order
to avoid other bottlenecks in the transformation process. For example, an initial
C-based implementation on top of Parabix was found to have a significant remaining
bottleneck in formatted output, which was eliminated by careful reimplementation.

The client-side translation and scaling features of SVG were found to be of
substantial value in addressing server-side performance. While the initial naive
XSLT implementations performed transformations from world to screen using XSLT
extension functions within the GML to SVG software, it proved possible to avoid this
work by providing the client with the appropriate transformation matrix parameters.
With some GML data sets dominated by long lists of coordinate data, this
optimization proved quite valuable in all implementations of the GML to SVG
transformation benchmarks. Overall, the highly efficient parallel bit stream based
scanning routines of the Parabix engine provided the best performance results in the
parsing of long coordinate lists. No apparent client-side performance degradation
due to client side coordinate transformation was observed on any of the test
platforms.

Although programming within the Parabix framework is similar in nature to
programming under SAX, it would ultimately be desirable to provide high-performance
alternatives using high-level programming paradigms such as that of XSLT.
Implementation of a high-performance XSLT processor using Parabix is an active focus
of our ongoing research. A further research direction worth considering is the
possibility of deploying Parabix on the client side to improve SVG rendering
performance.

Introduction

The visualization of geographic information is one of the primary goals of on-demand
web-based mapping systems [1]. Web-based mapping
systems commonly encode spatial data with GML for transmission and with SVG for
display [1][2][3]. GML is an XML grammar defined by the Open
Geospatial Consortium (OGC) to encode geographical features [4]. As an XML grammar, GML is platform neutral and is
well suited to facilitate the exchange of spatial data over the Internet. GML, however,
is not a visualization format. Rather, GML relies on commercially available viewers for
data visualization, with Scalable Vector Graphics (SVG) viewers being one of the most
common [1]. Large volumes of GML data are
typical in on-demand web-based mapping, and as a consequence, the visualization of GML
as SVG requires high-performance GML to SVG translation.

In general, a three-step approach is required to model and display spatial data using
GML and SVG. The following steps describe this process.

Model and persist spatial data.

The modelling and persistence of GML data is not often based on GML feature
collections, but rather geographic features are typically mapped and stored in relational database tables.
WFS (Web Feature Services) feature collections are used to aggregate and encode query results as GML for
transmission to the client.

By convention, many application types such as Open Geospatial Consortium (OGC) Component Web Map
Services (FPS), Map Style Editors, and query results handlers for WFS organize spatial data at the GML document
level as collections of GML features of the same GML feature type. That is, geographic features are first
classified by GML feature type, and then grouped and stored as collections
of GML features within separate GML documents. In general, a set of GML
encoded query results may be obtained from several different network locations.

Transform and assemble a set of source GML documents into a single SVG
document for display.

In general, the transformation and assembly of a set of GML documents into
a single SVG document involves the extraction and translation of GML encoded
spatial data to SVG. Conventionally, each GML data store query result consists of a single GML feature collection, with each feature collection corresponding to
a distinct SVG map layer in the rendering of the final SVG document.

For example, a collection of GML encoded river features may be rendered
after other feature types, such as surface topography types, or before
additional feature types, such as bridge or boat types. Consequently,
the display order of the set of source GML documents must be represented in
the final SVG encoding. In SVG, rendering order is defined by the Painter's
Model, as described in the SVG 1.1 Specification [5]. SVG rendering order follows pre-order document traversal. Overall,
map layer rendering order is application specific.

Parse and render the generated SVG map document.

An SVG viewer parses and renders the SVG map document for display.

XML Transformation Technologies

Several well-known technologies exist to parse and extract the spatial data
contained within GML documents for translation to SVG. Commonly used approaches
include Java or C++ implementations using SAX (Simple API for XML), DOM (Document
Object Model) or pull parser interfaces and declarative implementations using XSLT
(Extensible Stylesheet Language Transformations) or XQuery processors. For the
traditional programming approaches, we have confined our study to several SAX
alternatives, avoiding the performance impact of tree-building with DOM and choosing
SAX over the similarly performing pull parser model because it is more widely known
and used. In addition, whereas XQuery may be easier to learn than XSLT, GML to SVG
translation represents an XML to XML translation problem commonly solved with XSLT
based software. So, among declarative language approaches we have chosen XSLT over
XQuery.

SAX is a streaming interface [6] that provides
serial access to the contents of an XML document. In general, a SAX parser functions
as a stream parser, which provides an event-driven API to the application developer.
Applications receive information from XML documents in a continuous stream, without
backtracking or navigation [6]. A SAX parser does not
maintain application level parse state context information. Instead, the maintenance
of state information is the responsibility of the application.

SAX has a reputation as an efficient XML parsing model but often requires
additional implementation effort and greater software development skill [6][7]. SAX is not an open standard
and is not portable across programming languages. Despite these limitations, in many
scenarios the efficiency of SAX together with the capability to process large XML
documents in linear time and near-constant memory makes SAX a favored
choice [6].

According to the W3C XSLT 1.0 Specification, XSLT is primarily designed for XML to
XML document transformation [8]. Document Object Model
(DOM) based XSLT processors provide random memory access and maintain parse state
information but typically at the cost of increased memory usage. This additional
memory requirement tends to eliminate DOM-based XSLT processors as a viable
transformation alternative in the processing of large GML documents. Interestingly,
despite its memory requirements, XSLT is commonly presented as the technology of
choice to perform GML to SVG translation [1][2][3][9]. As a declarative language, the appeal of XSLT
may be attributable to a perceived ease of use for non-programmers, or alternatively
from the perspective of system architects, the appeal of XSLT may be the enhanced
portability and flexibility offered by open standard compliant XSLT processors.

XSLT Design Patterns

Michael Kay, author of XSLT: Programmer's Reference, describes the
"fill-in-the-blanks" stylesheet pattern as a common XSLT stylesheet design pattern
in which an XSLT stylesheet acts largely as an output template but with the addition
of extra tags used to retrieve and insert variable data at particular points in the
destination document [10]. GML to SVG translation corresponds
to this "fill-in-the-blanks" pattern but with the additional characteristic that the
source GML document is traversed serially and without backtracking.
Kay's "fill-in-the-blanks" pattern together with serial source document traversal is
straightforward to implement using the SAX event based API. Consequently, despite
minor additional implementation complexity, SAX-based GML to SVG translation
provides a reasonable alternative to XSLT.

The focus of this paper is the evaluation of GML to SVG transformation
performance. Section 2 of this paper describes the GML to SVG translation problem.
Section 3 then moves on to describe the methodology of our performance analysis.
Section 4 presents the performance results. Section 5 provides an analysis of the
performance results and describes data parallel GML to SVG translation with
particular emphasis on system architecture and the reduction of request/response
latency. Section 6 concludes the paper with a summary of the results and directions
for future work.

GML to SVG Document Transformation

GML to SVG document transformation involves the extraction and translation of source
GML encoded features to equivalent destination SVG encodings. A basic understanding of
the GML primitives and their equivalent SVG counterparts is necessary to perform this
translation.

Source GML Document Structure

GML contains a rich set of primitives. The GML feature primitive and the GML
geometry primitives are required to map GML to SVG.

GML Features

In GML, a feature is an application defined object that represents a physical
entity such as a bridge, river, or road [11]. In general,
GML models real world concepts as geographic features,
which are delivered as feature collections by feature services and organized by component web map services into feature layers, commonly referred to in
the mapping world as GML
feature layers. Each GML feature can contain multiple GML geometries. A set of
transformed GML feature layers comprises the layers of an SVG encoded map.

GML Geometries

GML encodes the spatial properties of geographic objects as geometry elements
within GML documents. Basic point, line, and polygon geometries can be encoded
using any of the following GML geometry elements respectively.

Destination SVG Document Structure

SVG is an XML grammar used to encode 2-dimensional vector graphics. In the translation
of GML to SVG, a destination SVG document is generated which contains a root SVG
element. This root element in turn contains one or more group elements. Each group
element corresponds to a distinct GML layer and contains the necessary information
to render the set of spatial features contained within that layer. Layer specific
styling rules are applied to each SVG map layer.

The following source GML document values are sufficient to generate a
corresponding SVG destination document.

GML bounding box coordinate pair values.

Minimum and maximum bounding box coordinate pair values facilitate the
transformation of a GML world coordinate system to an equivalent SVG
screen coordinate system.

Geometry object coordinate data is modified and assigned to the data
or 'd' attribute of the corresponding SVG path element. This process
involves the translation of GML encoded coordinate data to SVG encoded
path data. SVG coordinate path data must be prepended with a single 'M'
(absolute move to) command letter. If required, a single 'L' (absolute
line to) command letter is also inserted after the first coordinate
pair.
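The path data rewriting described above can be sketched as follows. This is a minimal illustration in Python rather than one of the benchmark implementations. The function name is hypothetical, and the input is assumed to use the default GML coordinate separators (a comma between the x and y values, whitespace between coordinate pairs).

```python
def gml_coordinates_to_path_data(coordinates: str) -> str:
    """Rewrite a GML coordinate string as SVG path data.

    Assumes the default GML separators: "x1,y1 x2,y2 x3,y3".
    The coordinate values themselves are copied through unchanged.
    """
    pairs = coordinates.split()
    if not pairs:
        return ""
    # Prepend a single absolute 'M' (move to) command letter.
    path_data = "M" + pairs[0]
    # If further pairs exist, insert a single absolute 'L' (line to)
    # command letter after the first coordinate pair.
    if len(pairs) > 1:
        path_data += " L" + " ".join(pairs[1:])
    return path_data


print(gml_coordinates_to_path_data("10,20 30,40 50,60"))
```

Because a single 'L' command applies to all subsequent coordinate pairs, the remaining coordinate text can be copied to the output without further modification.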

World to Screen Coordinate Reference System Transformation

To achieve world to screen coordinate system translation, a global scaling
operation, followed by a global translation operation is applied to each SVG
feature layer group element. Transformation operations are based on the world
coordinate system bounding box values and the SVG screen coordinate system
bounding box values. The following figure illustrates the GML to SVG, world to
screen coordinate system transformation process in terms of basic scaling and
translation matrix operations. Alternatively these individual transformations
may be combined to yield an equivalent SVG transformation matrix.

The above diagram demonstrates GML world coordinate reference system to SVG
screen coordinate reference system transformation via the SVG transform
attribute. The area represented by the blue rectangle labelled '1' illustrates
a GML region containing a single triangle object expressed with respect to a
world coordinate reference system. In this example, world coordinate y-values increase
upwards and screen coordinate y-values increase downwards. As a consequence, the
triangle appears inverted. Applying an SVG scaling operation produces the blue
area labelled '2'. In this case, GML coordinates are now scaled to the
resolution of SVG screen coordinates with y-values reflected across the x-axis.
The scaled area is then shifted to the gold location labelled '3' via the SVG
translate operation. The gold location represents the viewable on-screen region
of the SVG viewbox.
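As a worked illustration of the scaling and translation steps, the following Python sketch derives the transformation parameters from a world bounding box and a screen size, and emits the equivalent combined SVG matrix. The function names and the bounding box tuple layout are assumptions made for this example only.

```python
def world_to_screen_parameters(world_bbox, screen_size):
    """Return (sx, sy, tx, ty) mapping a world bounding box to the screen.

    world_bbox is assumed to be (min_x, min_y, max_x, max_y) and
    screen_size is assumed to be (width, height).
    """
    min_x, min_y, max_x, max_y = world_bbox
    width, height = screen_size
    sx = width / (max_x - min_x)
    # Negative y scale: world y increases upwards, screen y downwards.
    sy = -height / (max_y - min_y)
    # Translation applied after scaling, so that (min_x, max_y) maps
    # to the top-left screen corner (0, 0).
    tx = -sx * min_x
    ty = -sy * max_y
    return sx, sy, tx, ty


def transform_attribute(params):
    """Combine scale-then-translate into one SVG matrix(a b c d e f)."""
    sx, sy, tx, ty = params
    return f"matrix({sx!r} 0 0 {sy!r} {tx!r} {ty!r})"


def apply_transform(params, x, y):
    """Apply the transformation to one point, as an SVG viewer would."""
    sx, sy, tx, ty = params
    return sx * x + tx, sy * y + ty


params = world_to_screen_parameters((100, 200, 300, 400), (400, 200))
print(transform_attribute(params))         # matrix(2.0 0 0 -1.0 -200.0 400.0)
print(apply_transform(params, 100, 400))   # (0.0, 0.0)
```

The same parameters can equivalently be written as the transform list `translate(tx ty) scale(sx sy)`, which an SVG viewer applies to coordinates as a scaling followed by a translation.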

A Simple GML to SVG Example

The following GML document fragment illustrates the basic structure of a
source GML document. In particular, this GML fragment models a 'bridge' feature
collection.

The following SVG document fragment illustrates the basic structure and
contents of the resultant SVG document in the translation of the source GML
'bridge' document fragment to SVG. Of interest, a reduction in relative document
size is observed due to the relatively flat SVG document structure as compared
to the source GML.

Multiple Source GML Documents

The algorithm to transform GML to SVG does not increase in complexity with the
addition of multiple source GML documents. In the context of a single threaded
process, transforming a set of source GML documents simply lengthens the total
transformation time, with the results of each source GML document transformation
appended to a single destination document. GML to SVG transformation decisions are
simply based on a potentially larger set of GML feature element names, GML geometry
element names and GML coordinate element names.

In the case that GML source documents are transformed and assembled in parallel,
each GML to SVG transformation thread can be initialized with the minimal
layer-specific transformation information. Multiple source GML documents can then be
transformed and appended independently to the resulting SVG document tree. The
completion of the final transformation marks the completion of the overall
transformation process.
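A minimal Python sketch of this arrangement is given below. It is not the architecture evaluated in this paper. Each layer is assumed to be supplied as a hypothetical (layer identifier, list of coordinate strings) pair, already listed in rendering order, and the per-layer worker is reduced to the path data rewriting step.

```python
from concurrent.futures import ThreadPoolExecutor


def transform_layer(layer):
    """Transform one layer into an SVG group element (as a string).

    layer is assumed to be (layer_id, [coordinate_string, ...]).
    """
    layer_id, coordinate_strings = layer
    paths = []
    for coordinates in coordinate_strings:
        pairs = coordinates.split()
        path_data = "M" + pairs[0]
        if len(pairs) > 1:
            path_data += " L" + " ".join(pairs[1:])
        paths.append(f'<path d="{path_data}"/>')
    return f'<g id="{layer_id}">' + "".join(paths) + "</g>"


def assemble_svg(layers, max_workers=4):
    """Transform layers concurrently, then append the groups in order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in input order, regardless of which
        # worker finishes first, so the layer order is preserved.
        groups = list(pool.map(transform_layer, layers))
    return ('<svg xmlns="http://www.w3.org/2000/svg">'
            + "".join(groups) + "</svg>")


print(assemble_svg([("ocean", ["0,0 1,1"]), ("roads", ["2,2 3,3 4,4"])]))
```

Threads are used here only to show the structure. In CPython, a process pool would typically be needed to obtain a speed-up on CPU-bound transformation work.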

The assembly of the final SVG map document must follow the global logical
rendering order of all transformed source GML documents. This ordering is expressed
in the final SVG document structure. The following figure demonstrates the
relationship between global GML layer order and the SVG document structure. Lower
logically Z-indexed GML layers are located earlier in the SVG tree structure with
respect to a pre-order traversal and rendered prior to higher Z-indexed layers.

Figure 2. GML to SVG Translation with Multiple Source Documents

The structure of the SVG document corresponding to the above SVG tree structure is
illustrated with the following abbreviated SVG document fragment.

Performance Evaluation

In this section we present a performance evaluation of a wide spectrum of GML to SVG
translation technologies. Where available, the SAX 2.0 API was selected in
the evaluation of each of the SAX-based parsers. Direct SAX-based implementations
include the following candidates.

Parabix

Parabix is an open-source XML processing technology that uses a fundamentally
new way to perform high-speed parsing of XML documents [16]. Parabix leverages the SIMD (Single Instruction,
Multiple Data) capabilities of commodity processors to deliver dramatic
performance improvements over traditional byte-at-a-time parsing technologies.
Byte-oriented character data is first transformed to a set of 8 parallel bit
streams with each stream comprising one bit per character code unit. Critical
XML parsing operations are then carried out in parallel using bitwise logic and
shifting operations. Traditional byte-at-a-time scanning loops are replaced with
bit scan operations. Each bit scan operation can potentially advance by as many
as 64 byte positions with a single instruction [21]. Since the core bitstream algorithms of
Parabix are expected to be highly parallelizable, future directions for the
Parabix engine include work on leveraging the performance benefits of parallel
processing on multicore technology [21].
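The parallel bit stream idea can be illustrated with a small Python sketch. This is a conceptual model only and does not reflect the actual Parabix implementation or API: arbitrary-precision Python integers stand in for SIMD registers, and all function names are hypothetical.

```python
def transpose_to_bit_streams(data: bytes) -> list:
    """Return 8 bit streams, one per bit position of a byte.

    Bit i of streams[k] holds bit k (0 = most significant) of data[i].
    """
    streams = [0] * 8
    for i, byte in enumerate(data):
        for k in range(8):
            if byte & (0x80 >> k):
                streams[k] |= 1 << i
    return streams


def match_byte(streams, length, value):
    """Mark every position whose byte equals value, using only bitwise
    logic over whole streams rather than a byte-at-a-time loop."""
    mask = (1 << length) - 1
    result = mask
    for k in range(8):
        if value & (0x80 >> k):
            result &= streams[k]
        else:
            result &= ~streams[k] & mask
    return result


def scan_to_next(marker_stream, start):
    """Bit scan: position of the first marked bit at or after start."""
    remaining = marker_stream >> start
    if remaining == 0:
        return -1
    # Isolate the lowest set bit and convert it to a position.
    return start + (remaining & -remaining).bit_length() - 1


data = b"<a><b/></a>"
streams = transpose_to_bit_streams(data)
left_angle = match_byte(streams, len(data), ord("<"))
print(scan_to_next(left_angle, 1))  # 3
```

On real hardware the transposition and the bitwise operations act on fixed-width registers, and the bit scan corresponds to a single machine instruction.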

To further leverage the high performance bit scan operations of Parabix, the
Parabix engine provides a pull-based GML coordinate conversion method. This
pull-based method allows an application to advance the underlying Parabix
parsing engine directly and leverages the underlying bit scan operations of the
engine.

For the purpose of this performance study, the GML to SVG benchmark
implementation based on the Parabix pull parsing feature is known as Parabix
ILAX (Pull). The standard serial access and event-based Parabix implementation,
equivalent to each of the SAX benchmarks, is described simply as Parabix ILAX.
ILAX is an acronym which stands for In-Line API for XML and is functionally
equivalent to a SAX event-based API in which an application registers event
handlers at compile time.

Test Environment

Performance experiments were conducted on a 2.128 GHz Intel Core 2 Duo processor
desktop machine with 2 GB of available memory. The following tables describe the
hardware and software configuration of the performance evaluation environment. The
Performance Application Programming Interface (PAPI) Version 3.5.0 toolkit [20] was installed on the test system to facilitate the collection
of hardware based CPU cycles. On Linux, PAPI accesses these hardware counters
through the Linux x86 Performance Monitoring Counters Driver known as
perfctr [22].

Hardware Events

The key hardware event evaluated in this performance analysis is processor cycles.
This metric is reported as the number of processor cycles per source GML byte.
Hardware cycle data were collected directly via the PAPI C
API [20]. A JNI wrapper to the PAPI C API enabled the
collection of hardware cycle counts for the Java implementations. Performance
results are adjusted to account for the additional cycle overhead of
performance monitoring instrumentation, and specifically for the effects of JNI
function calls crossing the Java/C boundary.

Benchmark Data Characteristics

GML to SVG data translations are executed on GML source data modelling the city of
Vancouver, British Columbia, Canada. This data set consists of 46 distinct GML
feature layers ranging in size from approximately 1.5 KB to 12 MB. In this
performance study, approximately 21 MB of source GML data generates approximately
8.8 MB of destination SVG data.

In external conversion function based benchmarks, GML coordinate data size impacts
overall GML to SVG transformation performance. In this data set, water body feature
layers, such as the Ocean and Lake GML layers, contain a relatively small number
of geometry objects (polygons) but each geometry object contains a large volume of
coordinate data. In contrast, the road layers, RL1U to RP6U, contain a large number
of geometry objects (line segments), but each geometry object contains relatively
few coordinate data pairs.

GML to SVG Benchmarks

In each benchmark, GML feature element and GML geometry element tags are
matched. GML coordinate data are then extracted and transformed to SVG path data
encodings. Equivalent SVG path elements are generated and output to the destination
SVG document.
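The general shape of an event-based benchmark can be sketched with the Python standard library SAX interface. This is an illustrative stand-in for the Java and C++ benchmark implementations, not a reproduction of them. For brevity it matches only the `gml:coordinates` element, and the handler and function names are hypothetical.

```python
import xml.sax
from io import BytesIO, StringIO
from xml.sax.handler import ContentHandler, feature_namespaces

GML_NS = "http://www.opengis.net/gml"


class GmlToSvgHandler(ContentHandler):
    """Streams GML coordinate elements to SVG path elements."""

    def __init__(self, out):
        super().__init__()
        self.out = out
        # SAX keeps no application state, so the handler tracks
        # whether it is currently inside a coordinates element.
        self.in_coordinates = False
        self.buffer = []

    def startElementNS(self, name, qname, attrs):
        if name == (GML_NS, "coordinates"):
            self.in_coordinates = True
            self.buffer = []

    def characters(self, content):
        # Character data may arrive in several chunks.
        if self.in_coordinates:
            self.buffer.append(content)

    def endElementNS(self, name, qname):
        if name == (GML_NS, "coordinates"):
            self.in_coordinates = False
            pairs = "".join(self.buffer).split()
            if pairs:
                path_data = "M" + pairs[0]
                if len(pairs) > 1:
                    path_data += " L" + " ".join(pairs[1:])
                self.out.write(f'<path d="{path_data}"/>')


def gml_to_svg_paths(gml_text: str) -> str:
    out = StringIO()
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)
    parser.setContentHandler(GmlToSvgHandler(out))
    parser.parse(BytesIO(gml_text.encode("utf-8")))
    return out.getvalue()


sample = ('<gml:LineString xmlns:gml="http://www.opengis.net/gml">'
          '<gml:coordinates>0,0 10,5 20,0</gml:coordinates>'
          '</gml:LineString>')
print(gml_to_svg_paths(sample))
```

A fuller version would also match feature and geometry element names in order to choose layer groups and styling, using the same state-tracking approach.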

XSLT Benchmarks

The following pair of XSLT stylesheet fragments demonstrates the per GML
feature layer logic necessary to translate source GML to SVG. The XSLT fragment
presented below illustrates world to screen coordinate reference system
conversion through SVG transform attribute parameterization.

All geometry objects are expressed in the same coordinate
reference system.

These simplifications ease the implementation effort but do not alter relative
benchmark performance.

SVG Document Assembly and Rendering

The following figure displays the result of translating the GML data set representing the city of Vancouver, BC, Canada, to SVG.
Application specific layer styling information is applied to each SVG map layer.

Figure 3. SVG Map of Vancouver, BC, Canada

Performance Results

This section presents the benchmarking methodology and GML to SVG performance
results. Of interest, the Intel XSLT Accelerator is configurable to allow the
transformation of an XML document using multiple threads. This feature is disabled
by default but is configurable to allow for parallel transformations through
configuration of a maximum number of parallel threads. GML to SVG experiments with
settings of two, four and eight parallel threads respectively produced a best case
improvement of approximately 2 cycles per source GML byte overall. The following
figures present default configuration Intel XML Software Suite results.

Benchmarking Methodology

The PAPI toolkit requires the direct instrumentation of source code. In this
performance study, the source code of each benchmark is instrumented to measure
the costs of in-memory GML to SVG translation. At the completion of each
benchmark invocation, translation results are written to file for the purpose of
validation. SVG results are verified visually and on the basis of expected SVG
byte count. The cost of file I/O is not included in the performance results.
Although PAPI instrumentation overhead is insignificant relative to the overall
transformation costs, PAPI instrumentation overhead is subtracted from the
processor cycle counts.

Each of the C/C++ based benchmarks produced consistent and repeatable results.
However, Java-based GML to SVG benchmarks demonstrated significant variation as
a consequence of Java Virtual Machine run-time features, such as Just-In-Time
(JIT) compilation and dynamic garbage collection [23].
To obtain consistent and repeatable Java-based benchmark results, a significant
warm up period of one thousand translation iterations was required prior to the
collection of performance data. Experimentation indicated that one thousand
iterations provided a sufficient execution time to eliminate the effects of JIT
and produce consistent and repeatable results. Further, reported performance
metrics reflect average values calculated over one thousand benchmark iterations
executed after the completion of the warm up period.
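The structure of this measurement procedure can be sketched as follows. The sketch is in Python and substitutes wall-clock nanoseconds from `time.perf_counter_ns` for the PAPI hardware cycle counts used in this study, so its output is not comparable to the reported results. The function name and parameters are hypothetical.

```python
import time


def measure_cost_per_byte(translate, source: bytes,
                          warmup=1000, iterations=1000):
    """Average per-byte cost of translate(source), in nanoseconds.

    Warm-up iterations are executed first and excluded from the
    measurement; the reported value is the mean over the measured
    iterations divided by the source size.
    """
    for _ in range(warmup):
        translate(source)
    total_ns = 0
    for _ in range(iterations):
        start = time.perf_counter_ns()
        translate(source)
        total_ns += time.perf_counter_ns() - start
    return total_ns / iterations / len(source)


cost = measure_cost_per_byte(lambda s: s.upper(), b"0,0 10,5 20,0" * 100)
print(f"{cost:.3f} ns per source byte")
```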

Performance Results Relying on the SVG Transform Attribute

The performance results presented in the following figure demonstrate the
processor cycle per source GML byte cost of translating the complete GML
benchmark data set to SVG. These results illustrate the case in which the SVG
transform attribute is used to convert the GML world coordinate system to an
equivalent SVG screen coordinate system. As previously described, GML
coordinate data values are first manipulated by prepending GML coordinate data
strings with a single 'M' command letter, and if required, a single 'L' command
letter is inserted after the first coordinate pair. Beyond this lightweight
manipulation, source GML coordinate data is not directly altered by the
benchmark applications. In addition, to achieve visual presentation
results equivalent to those of the external function based GML to SVG transformation benchmarks, an
inverse scaling is applied to each scale-dependent style attribute value, such
as the stroke-width CSS property. Since SVG styles are applied on a per GML source
document basis, the additional cost of inverse CSS property scaling is not
significant.
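The inverse scaling amounts to a single division per style value. The following Python fragment is a minimal sketch, assuming a uniform scale magnitude in x and y and using a hypothetical function name.

```python
def inverse_scale_stroke_width(screen_width: float, scale: float) -> float:
    """Stroke width to specify in world units so that, after the layer
    transform scales it by |scale|, it renders at screen_width."""
    return screen_width / abs(scale)


# A 2-unit on-screen stroke under a scale factor of 0.25 must be
# specified as 8 world units.
print(inverse_scale_stroke_width(2.0, 0.25))  # 8.0
```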

Parabix ILAX and the Intel Software Suite for C++ demonstrate similar levels
of performance, requiring only approximately 21 cycles and 25 cycles per source
GML byte respectively. Both Parabix and the Intel Software Suite for C++
dramatically outperform the Xerces-C parser, with each completing the
transformation task over 5 times faster than this traditional byte-at-a-time,
single core XML parsing technology.

Surprisingly, the Intel XSLT Accelerator for Java Environments outperforms
each of the JAXP SAX implementations evaluated. These performance results
indicate that the Intel XSLT Accelerator for Java Environments is over 3.5 times
faster than the Intel XML Software Suite for Java, Xerces-J and Crimson. The
inferior performance of the Intel XML Software Suite for Java (JAXP SAX API)
implementation versus the Intel XML Software Suite JAXP XSLT is particularly
unexpected. This performance discrepancy may be attributable to a requirement to
more frequently cross the Java/C boundary under a JAXP SAX processing model. In
a SAX event-based approach, a callback is required for each registered XML
parsing event. In the case of the Intel XML Software Suite, each callback
requires crossing the Java/C boundary via JNI. In this case, the additional cost
of JNI may limit the overall performance benefits of the Intel XML core. Similar
reasoning explains the relatively superior performance of the Intel XSLT based
implementation. In this case, the overall impact of JNI overhead may be
significantly less for the simple reason that an XSLT processor is not required
to generate an event callback per XML parsing event. As a result, the XSLT
processor is not required to cross the Java/C boundary as frequently.

Overall, Parabix ILAX (Pull) is over 4 times faster than the Intel XML
Software Suite JAXP XSLT Templates implementation and well over an order of
magnitude faster than the Saxon JAXP Templates implementation. In addition, in
comparison to each of the Java-based SAX implementations evaluated, Parabix ILAX
(Pull) is again well over an order of magnitude faster.

Performance Results Relying on the External Library Functions

For the purpose of comparison, the performance results presented in the
following figure demonstrate the processor cycle per source GML byte cost of
translating the complete GML benchmark data set to SVG. These results represent
the translation scenario in which external C and Java functions are used to
explicitly convert GML world coordinate reference system data to SVG screen
coordinate reference system data. This conversion process requires GML
coordinate string tokenization, string to numeric data type conversion, explicit
transformation of GML coordinate data, and numeric to string data type
conversion. At minimum, the actual numeric conversion requires the inversion of
the Y-axis. That is, each source GML Y coordinate value must be scaled by a
factor of negative one.
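The four conversion steps can be sketched in Python as follows. This is an illustration of the processing involved and not the C or Java conversion functions used in the benchmarks. The function name, the parameter names and the two-decimal output precision are assumptions.

```python
def convert_coordinates(coordinates, sx, sy, tx, ty):
    """Explicitly convert GML world coordinates to SVG screen coordinates.

    coordinates is assumed to use "x1,y1 x2,y2 ..." separators.
    sx, sy are scale factors (sy negative to invert the y-axis) and
    tx, ty are the translation applied after scaling.
    """
    converted = []
    for pair in coordinates.split():            # 1. tokenization
        x_text, y_text = pair.split(",")
        x, y = float(x_text), float(y_text)     # 2. string to numeric
        screen_x = sx * x + tx                  # 3. explicit transformation
        screen_y = sy * y + ty
        # 4. numeric to string (output precision is an assumption)
        converted.append(f"{screen_x:.2f},{screen_y:.2f}")
    return " ".join(converted)


print(convert_coordinates("100,400 300,200", 2.0, -1.0, -200.0, 400.0))
```

Every coordinate value passes through all four steps, which is why the cost of this approach grows with the volume of coordinate data in a layer.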

A comparison of the SVG transform attribute based scenario versus the external
function based transformation scenario reveals that each of the C++
based SAX API implementations incurs an additional cost of approximately 90
cycles per source GML byte. This cost is due to the external C library
conversion functions. A comparison of the JAXP SAX API implementations
indicates that external coordinate data conversion adds approximately 300 cycles
per source byte to the overall cost. In the case of the JAXP XSLT
implementations, external functions add approximately 450 cycles per source GML
byte overhead.

The C-based GML coordinate conversion methods are based on the string to
double (strtod) C library routine. This C-based approach provides a relatively
efficient means to complete the GML coordinate data tokenization and has the
added benefit of simultaneous string to numeric data type conversion. In
contrast, an examination of the Java library functions revealed the high cost
of the Java StringTokenizer object. The Java StringTokenizer allocates a new Java
String object for each GML coordinate value. This small object memory allocation
occurs with each call to the nextToken method of the Java StringTokenizer class.
In addition, a floating point Java object is created in the conversion of each
coordinate Java String type to the Java BigDecimal type. Excessive small object
creation is a well-known performance bottleneck and explains the poor Java versus
C based performance.

Both the JAXP SAX API and the JAXP XSLT implementations rely on the same set
of external Java library conversion methods. However, the JAXP XSLT external
functions incurred an additional cost of 150 cycles per source GML byte. This
additional XSLT processing cost is attributable to the overhead of external
function calls in XSLT.

Unfortunately, the Intel XML Software Suite generated signal 11 errors when
used with the Java extension functions of this performance study. This runtime
error prevented the inclusion of Intel XSLT Accelerator extension function based
GML to SVG translation results. In addition, a function to perform explicit GML
to SVG coordinate data conversion for the Parabix ILAX (Pull) implementation was
not developed for this performance study; consequently, performance results
for the Parabix ILAX (Pull) implementation are also not presented in the
following figure.

High Proportions of GML Coordinate Data

The investigation of GML to SVG transformation performance was motivated in
part by the claims of the authors of the SVG Open 2003 paper entitled, "SVG
Explorer of GML Data" [24]. In this paper, the authors claim low XSLT performance in
the case of geographical elements with large numbers of coordinates. A further
investigation of this claim was conducted in the context of GML to SVG
transformation based on both external XSLT extension functions and the SVG
transform attribute.

A linear regression analysis of the proportion of coordinate data in each GML
document versus GML to SVG translation cycles per byte for the Saxon XSLT JAXP
Transformers benchmark based on XSLT extension functions exhibited a correlation
coefficient of 0.79. In contrast, the Saxon XSLT JAXP Transformers benchmark
based on the SVG transform attribute exhibited a correlation coefficient of
-0.44. Similar results were obtained for each of the other technologies. Since
extension function based conversion of GML coordinate data is a relatively
expensive operation, this analysis confirms that large volumes of coordinate
data significantly decrease overall GML to SVG transformation performance.
Further, the elimination of explicit server-side coordinate data conversion
removes this performance bottleneck.
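For readers who wish to reproduce this kind of analysis on their own measurements, the correlation coefficient can be computed as in the following sketch. It assumes the reported values are Pearson correlation coefficients, which the text does not state explicitly; the function name `pearson_r` is hypothetical, and no data from this study is embedded in the example.

```python
import math
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Return the Pearson correlation coefficient of two equal-length samples.

    xs: e.g. the proportion of coordinate data in each GML document
    ys: e.g. the measured translation cost in cycles per byte
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)
```

A value near 1 indicates that cost per byte rises with the proportion of coordinate data, while a value near zero or below indicates no such dependence.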

Parabix ILAX (Pull) Raw per GML Feature Layer Performance

The optimal case of the Parabix ILAX (Pull) SVG transform attribute based
implementation is demonstrated in the following figure. This figure shows the
absolute total cycles required to independently transform each source GML
document to SVG. Cross-referencing the data table above illustrates the
relationship between increasing source GML document size and processor cycles.
Not surprisingly, the cycles required to complete a transformation increase
linearly with source document size. Of interest, this figure demonstrates the
large degree of variability that may exist within a GML dataset in terms of
individual layer transformation times, with this dataset dominated by the RP2U
roads feature layer.

Discussion

The most important performance criterion for interactive applications is
responsiveness (latency). Latency determines the performance perceived by the end user.
The possibility of using streaming SVG technologies and progressive rendering together
with parallel transformation provides a further direction for reducing system latency in
on-demand web-based mapping systems.

Data Parallelism

Data parallelism is a form of parallelism in which the same transformation is
applied to each piece of data. In the case of GML to SVG translation, data
parallelism is exhibited at the GML document level and the GML feature level. As
demonstrated in the following figure, it is natural to organize, transform and assemble
source GML data in parallel at the GML document level without explicit
synchronization. As a result, parallel transformation overhead is low.
Consequently, it is not a question of whether to parallelize GML to SVG translation
but rather whether to locate the GML to SVG parallel translation logic at the
server-side or client-side.

Figure 7. Data Parallel GML to SVG Transformation and Aggregation

In a traditional server-side threading approach, an individual thread is
instantiated to translate each requested source GML document to SVG. The complete
document is assembled and then transmitted back to the client. In general, this may
result in additional request/response latency, as the overall GML to SVG
transformation performance is dominated by the slowest GML layer. In addition, this
approach precludes progressive rendering on a per source GML layer basis.
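The server-side approach described above can be sketched as follows. This is an illustrative outline only: `transform_layer` is a hypothetical stand-in for a real GML to SVG layer transformation, and Python threads are used here to show the structure of the approach rather than to measure performance.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def transform_layer(gml_text: str) -> str:
    """Hypothetical placeholder for a real GML to SVG layer transformation."""
    return f"<g><!-- {len(gml_text)} bytes of GML transformed --></g>"


def transform_map(gml_layers: List[str]) -> str:
    """Transform every layer on its own thread, then assemble one SVG document.

    No explicit synchronization is needed because each layer is
    transformed independently of the others.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(gml_layers))) as pool:
        # map() returns results in input order, which preserves the
        # map layer rendering order in the assembled document.
        svg_groups = list(pool.map(transform_layer, gml_layers))
    # The response can only be sent once the slowest layer has finished.
    return '<svg xmlns="http://www.w3.org/2000/svg">' + "".join(svg_groups) + "</svg>"
```

Because the assembled document is returned only after every layer completes, the response time in this sketch is bounded below by the slowest layer, which is the latency concern raised in the text.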

In an AJAX-based approach, multiple simultaneous client-side layer requests are
issued to the server by the SVG client viewer. Client-side rendering logic is then
able to progressively render individual map layers transmitted back to the client as
SVG results become available for display, while respecting the map layer
rendering order. GML document level client-side progressive rendering is
straightforward to implement at the client. Nevertheless, large source GML layers
have the potential to bottleneck client-side layer rendering. Additional techniques
such as compression, line generalization and the splitting of large GML documents
are also necessary to reduce request/response latency.
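One possible reading of rendering layers as they become available while respecting the rendering order is sketched below. The sketch models only the ordering logic, not a browser client: the function name `progressive_render`, the layer names in the usage note and the buffering strategy are assumptions made for illustration, and a real SVG viewer might instead insert each layer directly into a pre-ordered group element.

```python
from typing import Dict, Iterable, Iterator, List, Tuple


def progressive_render(
    arrivals: Iterable[Tuple[str, str]], layer_order: List[str]
) -> Iterator[Tuple[str, str]]:
    """Yield (layer name, SVG fragment) pairs in rendering order.

    arrivals:    (layer name, SVG fragment) pairs in the order in which
                 the server responses happen to arrive
    layer_order: layer names from bottom-most to top-most
    """
    pending: Dict[str, str] = {}
    next_index = 0
    for name, svg in arrivals:
        pending[name] = svg
        # Draw every layer whose predecessors have all been drawn already.
        while next_index < len(layer_order) and layer_order[next_index] in pending:
            ready = layer_order[next_index]
            yield ready, pending.pop(ready)
            next_index += 1
```

For example, if a large roads layer is the bottom-most layer and arrives last, every other layer is held back until it arrives, which illustrates how a single large source GML layer can bottleneck client-side rendering.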

As mentioned, data parallelism also exists at the GML feature level. However,
parallel processing at the GML feature level may introduce high threading overhead.
Instead, experimentation with SVG 1.2 progressive rendering demonstrates the
potential for smoother layer rendering and a further reduction in request/response
latency.

Conclusion

The open source and high performance Parabix technology offers the prospect of
dramatic performance improvement in XML to XML transformation applications. As
illustrated by the GML to SVG transformation benchmark analysis presented, Parabix
delivers demonstrably superior XML processing performance.

The streaming event-based approaches offered by the Parabix processor and the Intel
XML Software Suite for C++ SAX API offer the ability to process large documents
efficiently and avoid the construction of an underlying DOM, reducing request/response
latency. Consequently, despite some additional implementation effort, a SAX-based GML to
SVG implementation provides a simple and high performance alternative to XSLT.

From an architectural perspective it would ultimately be desirable to provide
high-performance alternatives using high-level programming paradigms such as that of
XSLT. The Intel XSLT Accelerator for Java Environments demonstrates improved processing
performance in this area, outperforming both traditional JAXP SAX and XSLT
implementations. Implementation of a high-performance XSLT processor using Parabix is
also an active focus of our ongoing research. A further research direction worth
considering is the possibility of deploying Parabix on the client-side to improve
SVG rendering performance.

Acknowledgements

This work was supported in part by an Industrial Post Graduate scholarship provided by
the Natural Sciences and Engineering Research Council and the Mathematics of Information
Technology and Complex Systems of Canada. Additional support was supplied by the British Columbia Innovation Council
via a British Columbia Industrial Innovation Scholarship. GML Vancouver data set resources were provided by Galdos Systems Inc.