XML Strengths and Weaknesses with DOM, ASP and XSL

Since the inception of XML, many developers have wondered why they need XML and how they can use it to their benefit. In this article, Nakul looks at some of the terminology that comes with using XML and its related technologies, as well as how to create and transform XML documents with ASP and XSL.

Introduction

Since the inception of XML, many developers have wondered why we need XML... How is it better than HTML and what does it do? For starters, XML is far more powerful than HTML, and the power resides in the "X" in XML (which stands for extensible). Rather than providing a set of pre-defined tags (as in the case of HTML), XML specifies the standards with which you can define your own markup languages with their own sets of tags. XML is therefore a meta-markup language, allowing you to define an infinite number of markup languages based upon the standards defined by XML.

XML was created so that richly structured documents could be used over the web. The only viable alternatives, HTML and SGML, are not practical for this purpose: HTML offers only a fixed set of tags, and SGML is too complex for everyday use on the web. XML allows you to define your own tags and the rules that govern them, such as tags that represent business rules, data descriptions or data relationships.

In this article, we're going to take a look at some of the terminology that comes with using XML and its related technologies, as well as how to create XML documents and transform them with XSL using Microsoft's MSXML parser. To test the code samples shown in this article, you should be running Windows NT/2000/XP with IIS installed. You should also have SQL Server 2000 installed on the same machine.

XML definitions

As with any technology, XML has its own acronym-riddled lingo. Some of the important acronyms include:

DTD: In XML, the definition of valid markup is handled by a Document Type Definition (DTD), which communicates the structure of the markup language. The DTD specifies which elements and attributes are allowed and how they may be nested.

XSL: The Extensible Stylesheet Language (XSL) is the style language for XML. It allows us to transform XML nodes using a set of patterns and templates.

XML Pointer Language (XPointer) and XML Linking Language (XLink): These two technologies define a standard way to represent links between resources. In addition to simple links like HTML's <a> tag, XML has mechanisms for linking between multiple resources and linking between read-only resources. XPointer describes how to address a resource whereas XLink describes how to associate two or more resources.

XML Flow Architecture: XML fits a three-tier architecture. It can be generated from existing databases that employ a three-tier model themselves, and business rules can be maintained separately from the data.
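To make the DTD definition above more concrete, here is a minimal sketch of how MSXML can check a document against a DTD while loading it. The "note" vocabulary and its element names are hypothetical and exist only for this illustration; the sketch assumes the MSXML 3 parser behind the Microsoft.XMLDOM ProgID.

```vbscript
<%
' Sketch: validate an XML string against an inline DTD with MSXML.
' The "note" vocabulary is a made-up example, not part of this article's samples.
Dim xmlText, doc
xmlText = "<?xml version=""1.0""?>" & _
          "<!DOCTYPE note [" & _
          "  <!ELEMENT note (to, body)>" & _
          "  <!ELEMENT to (#PCDATA)>" & _
          "  <!ELEMENT body (#PCDATA)>" & _
          "]>" & _
          "<note><to>Reader</to></note>"   ' <body> is left out on purpose

Set doc = Server.CreateObject("Microsoft.XMLDOM")
doc.async = False
doc.validateOnParse = True   ' check the document against its DTD while loading

If doc.loadXML(xmlText) Then
    Response.Write "Document is valid."
Else
    ' parseError describes why loading or validation failed
    Response.Write "Invalid document: " & Server.HTMLEncode(doc.parseError.reason)
End If
%>
```

Because the DTD requires a body element after to, the parser should reject this document; adding the missing element should make it load.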

Why should XML be used?

Using XML provides us as developers with a number of benefits. Some of the most obvious benefits include:

Authors and providers can design their documents using XML instead of being stuck with HTML. Because both authors and designers are free to invent their own markup elements, documents can be explicitly tailored for an audience, and the cumbersome problems with HTML are theoretically eliminated.

Information can be richer and is easier to access and manipulate because the hypertext linking abilities of XML are much more advanced than those found in HTML.

XML can provide more (and improved) facilities for browser presentation and performance.

XML compresses exceedingly well. Data compression algorithms work by removing redundancy from an input stream, so a highly ordered stream of regular, repeating tag sequences compresses much better than standard text, which generally contains far less order and therefore compresses less effectively.

Weaknesses of XML

XML is obviously not a cure-all language free of any disadvantages; otherwise we would be using XML to mark up or represent all of our data, and nothing else! There are of course some drawbacks and weaknesses of XML, namely:

XML markup can be incredibly verbose, depending on the vocabulary in question.

Not all of the pieces of the XML puzzle are in place yet, at least not from a standards-compliance viewpoint. We have both XSL and XSLT, but neither is fully developed yet.

There are still some problems with Microsoft's XML parser.

XML Hypertext Transfer Protocol (XML-HTTP) still has some minor problems.

Performance of XML

When you're designing an XML-based web application, what kind of performance hit should you expect on your web server? It's hard to generalize because there are so many variables to take into consideration, such as the size of the XML document, the amount of script code required to process the document and the amount of output generated. However, the following list shows the major variables that can affect the performance of parsing XML:

The kind of XML data being parsed.

The ratio of tags to text.

The ratio of attributes to elements.

The amount of discarded white space in the document.
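One way to see how these variables affect your own documents is to time the parser directly. The sketch below is only a rough measurement aid: sample.xml is a hypothetical test file, and VBScript's Timer function is coarse, so very small documents may report zero.

```vbscript
<%
' Sketch: time how long MSXML takes to parse one file.
' "sample.xml" is a hypothetical test document next to this page.
Dim doc, startTime, elapsed
Set doc = Server.CreateObject("Microsoft.XMLDOM")
doc.async = False                ' load synchronously so the timing covers the whole parse
doc.preserveWhiteSpace = False   ' the default: insignificant white space is discarded

startTime = Timer                ' seconds elapsed since midnight
doc.load Server.MapPath("sample.xml")
elapsed = Timer - startTime

Response.Write "Parsed in " & FormatNumber(elapsed, 3) & " seconds"
%>
```

Running the same page against documents with different tag-to-text ratios, or with preserveWhiteSpace set to True, gives a feel for how much each factor matters on your server.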

XML and DOM

Microsoft has provided us with the MSXML parser, which exposes an XML document in the form of a DOM (Document Object Model). With the XML DOM, you can load and parse XML files, gather information about them, and navigate through and manipulate their contents. To learn more about the details of the XML DOM, refer to the MSXML documentation.
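As a minimal sketch of loading and navigating a document through the DOM, the following page lists the child elements of a file's root element. The file name books.xml is an assumption made for this illustration and is not part of the article's support material.

```vbscript
<%
' Sketch: load an XML file and walk the children of its root element.
' "books.xml" is a hypothetical file in the same folder as this page.
Dim doc, node
Set doc = Server.CreateObject("Microsoft.XMLDOM")
doc.async = False   ' finish loading before the script continues

If doc.load(Server.MapPath("books.xml")) Then
    Response.Write "Root element: " & doc.documentElement.nodeName & "<br>"
    For Each node In doc.documentElement.childNodes
        Response.Write Server.HTMLEncode(node.nodeName & " = " & node.text) & "<br>"
    Next
Else
    ' load returns False when the file is missing or not well formed
    Response.Write "Could not parse file: " & Server.HTMLEncode(doc.parseError.reason)
End If
%>
```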

Now that we've discussed the reasons for using XML, it's time to look at some source code. We will examine some ASP scripts that create and display XML data. We're going to create an XML file using both static data and data retrieved from a database using ADO. The DOM methods createNode and appendChild, together with the text property, are used to construct an in-memory XML tree.

XML with ASP

The following example illustrates how to create an XML tree (in memory) and then persist it to disk using the save method:
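A minimal sketch of such a script might look like this. The element names root and child and the text assigned to the child are illustrative assumptions; only the createNode, appendChild, text and save members and the savedI2.xml file name come from the article.

```vbscript
<%
' Sketch: build a small XML tree in memory and save it to disk.
' The element names and text below are illustrative assumptions.
Const NODE_ELEMENT = 1   ' MSXML node type value for an element

Dim xmldoc, rootNode, childNode
Set xmldoc = Server.CreateObject("Microsoft.XMLDOM")

' Create the root node and one child node
Set rootNode = xmldoc.createNode(NODE_ELEMENT, "root", "")
Set childNode = xmldoc.createNode(NODE_ELEMENT, "child", "")

' Assign the text first, then attach the nodes to the tree
childNode.text = "This is a child node"
rootNode.appendChild childNode
xmldoc.appendChild rootNode

' Persist the in-memory tree next to this ASP page
xmldoc.save Server.MapPath("savedI2.xml")
%>
```

The account that IIS runs under needs write permission on the target folder, otherwise the save call fails.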

In the example above, we create an XMLDOM object. We then create a root node and its child node using the createNode function. Next, we append the nodes after assigning the text property to each of them. Finally, we save the in-memory XML tree to a file, savedI2.xml.

We can also build an XML file from the results of a database query. I've included two files with the support material for this article: pubtest.asp and saved.xsl. Pubtest.asp connects to the SQL Server 2000 pubs database, retrieving several records from the authors table, formatting them as a new XML document and saving that document as saved.xml.

The saved.xsl file contains an XSL style sheet which is used by pubtest.asp to format saved.xml as HTML. You should download the support material before continuing.

Let's now run through the entire process of retrieving data from the pubs database, saving it to an XML file, and then loading this document and transforming it with XSL using the MSXML parser.

Our XML example explained

Firstly, we connect to SQL Server 2000 using a system DSN. The DSN is called pubs, and you should create it using the Windows Control Panel. It should connect to your SQL Server instance, more specifically to its pubs database. We instantiate an ADO connection object, passing in the DSN to its open method:
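The connection and the record-by-record construction of the XML document can be sketched as follows. This is not the pubtest.asp listing from the support material: the element names (authors, author) and the three columns selected are assumptions chosen for illustration, although au_id, au_fname and au_lname are real columns of the pubs authors table.

```vbscript
<%
' Sketch: read rows from the pubs authors table through the "pubs" DSN
' and build an XML tree from them. Element names are assumptions.
Const NODE_ELEMENT = 1   ' MSXML node type value for an element

Dim conn, rs, xmldoc, rootNode, authorNode, fieldNode, fieldName
Set conn = Server.CreateObject("ADODB.Connection")
conn.Open "DSN=pubs"   ' add UID and PWD if your server uses SQL Server logins

Set rs = conn.Execute("SELECT au_id, au_fname, au_lname FROM authors")

Set xmldoc = Server.CreateObject("Microsoft.XMLDOM")
Set rootNode = xmldoc.createNode(NODE_ELEMENT, "authors", "")
xmldoc.appendChild rootNode

' One <author> element per record, one child element per column
Do While Not rs.EOF
    Set authorNode = xmldoc.createNode(NODE_ELEMENT, "author", "")
    For Each fieldName In Array("au_id", "au_fname", "au_lname")
        Set fieldNode = xmldoc.createNode(NODE_ELEMENT, fieldName, "")
        fieldNode.text = rs(fieldName) & ""   ' & "" turns NULL into an empty string
        authorNode.appendChild fieldNode
    Next
    rootNode.appendChild authorNode
    rs.MoveNext
Loop

rs.Close
conn.Close
%>
```

The document stays in memory in xmldoc after the connection is closed, so it can still be saved or transformed afterwards.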

Once we've retrieved each of the records from the recordset and appended them to our XML document, we save the XML document to our local machine using MSXML's save method:

xmldoc.save server.mappath("saved.xml")

We're now at the point where we have an XML file called saved.xml, as well as the style sheet that's included in the support material for this article, called saved.xsl. We instantiate a new XMLDOM object for each of these files, calling the transformNode method of the XML DOM object with a reference to the XML DOM object that contains the XSL file:
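A minimal sketch of the loading step is shown here. The variable names source and style are taken from the Response.Write call in this article; the error checks are an addition for illustration and may not appear in the original pubtest.asp.

```vbscript
<%
' Sketch: load saved.xml and saved.xsl into two separate DOM objects.
Dim source, style

Set source = Server.CreateObject("Microsoft.XMLDOM")
source.async = False   ' wait until the document is fully loaded
If Not source.load(Server.MapPath("saved.xml")) Then
    Response.Write "XML error: " & Server.HTMLEncode(source.parseError.reason)
    Response.End
End If

' A style sheet is itself an XML document, so it loads the same way
Set style = Server.CreateObject("Microsoft.XMLDOM")
style.async = False
If Not style.load(Server.MapPath("saved.xsl")) Then
    Response.Write "XSL error: " & Server.HTMLEncode(style.parseError.reason)
    Response.End
End If
%>
```

With both objects loaded, source.transformNode(style) returns the transformed output as a string.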

Lastly, we use Response.Write to output the transformed XML to the browser:

Response.Write source.transformNode(style)

Conclusion

Well, I hope this article has answered some of your questions on XML. Hopefully you've learned a thing or two about the advantages and disadvantages of using XML, when it can be used, and most importantly, how it can be used.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

About the Author

Nakul Goyal is currently pursuing a Master of Science in Information Technology at Panjab University, Chandigarh. He holds a Bachelor of Computer Applications from Punjab Technical University, is passionate about the cyber world and likes to write about technology. He is also a Microsoft Certified Professional, a Brainbench Certified 'MVP' (Most Valuable Professional) and the co-founder of CWSTeam (http://www.cwsteam.com). Contact Nakul Goyal by email: nakul@cwsteam.com