Simple XML Parsing on WinCE 4.2 Using C++ and MSXML 3.0

This article shows you how to parse XML files on WinCE using MSXML 3.0, an XML parser from Microsoft.

There are many articles showing how to parse XML, but not many cover WinCE with Visual C++; most examples target the .NET environment using Visual Basic .NET or C#. I'd like to share my experience of parsing XML on Windows CE 4.2. The classes I present can be run on any device that has WinCE 4.2 with MSXML 3.0 installed.

Please note that I am using MSXML 3.0 because, at the time of writing, it is the latest version supported by WinCE 4.2. WinXP and Win2K support version 4.0, which includes updated interfaces and supports more XPath expressions.

The prerequisites for this article are that you have some experience with XML and the ways in which it can be manipulated using DOM or SAX. This article uses the former. Also, some experience of COM would be helpful.

A Brief Overview of DOM (Document Object Model)

DOM presents an XML document as a tree structure, just like a file hierarchy in Windows Explorer. The tree has a root node, the root node has child nodes, and each of those can in turn be a parent to child nodes of its own; I think you get the picture. You can refer to these nodes as elements, with other elements embedded inside them. These elements contain text and attributes that are manipulated through the DOM tree. The contents of an element can be modified or deleted, and new elements can be created.
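
To make the tree idea concrete, here is a minimal sketch in standard C++11 (written for a desktop compiler, not for the WinCE toolchain). The Node type, the CountElements and MakeSampleTree functions, and the "customers" root element are all hypothetical names chosen for illustration; MSXML exposes the same ideas through COM interfaces instead.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a DOM element: a name, optional text,
// a set of attributes, and any number of child elements.
struct Node {
    std::string name;
    std::string text;
    std::map<std::string, std::string> attributes;
    std::vector<Node> children;
};

// Count this element plus every element nested below it.
std::size_t CountElements(const Node& node) {
    std::size_t total = 1;
    for (const Node& child : node.children) {
        total += CountElements(child);
    }
    return total;
}

// Build a tiny tree equivalent to:
//   <customers><customer name="OnLineGolf" tag="OLG"/></customers>
// (the root element name "customers" is an assumption)
Node MakeSampleTree() {
    Node customer;
    customer.name = "customer";
    customer.attributes["name"] = "OnLineGolf";
    customer.attributes["tag"] = "OLG";

    Node root;
    root.name = "customers";
    root.children.push_back(customer);
    return root;
}
```

Every operation on a DOM tree, whether reading, modifying, or deleting, comes down to walking a structure of this shape.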

MSXML—Microsoft's XML Parser

MSXML is based on COM; it comes with Internet Explorer 5 and above. This component has many functions that help you traverse the XML document, access nodes within it, delete these nodes, update these nodes, insert nodes, and more. It is worth noting that MSXML also supports XSLT and XPath. I won't be using these technologies in this article, but just so you know, these are supported.

That was a brief description of DOM and MSXML. If you need to know more, the Internet is a great resource that has many articles on the aforementioned topics.

Initialising MSXML

Now, on to some code. The first thing you need to do to use MSXML is to initialise it. Remember, I mentioned earlier that this is a COM component, so you first need to initialise COM:

HRESULT hr = ::CoInitializeEx( NULL, COINIT_MULTITHREADED );

This will return S_OK if all is well, or S_FALSE if COM was already initialised on the calling thread.

You now need to create an instance of the MSXML COM object. I do the following for this:

hr = m_iXMLDoc.CoCreateInstance( __uuidof( DOMDocument ) );

So, what is m_iXMLDoc, I hear you ask? It is of type IXMLDOMDocument and represents the top level of the XML document. It is worth pointing out again that WinCE 4.2 only supports MSXML 3.0. Microsoft Windows XP, for example, supports MSXML 4.0, so there you could use a newer document interface that supports additional methods.

I have wrapped m_iXMLDoc in an ATL smart pointer. This avoids having to release the object myself (I may forget!); thus, you have:

CComPtr<IXMLDOMDocument> m_iXMLDoc;

If the CoCreateInstance call succeeds, it will return S_OK.

The next bit of code looks quite odd but is needed only if you are using Pocket PC:

This was taken off the Internet. I can't remember from where, so apologies to the person who wrote this, but without it, things don't seem to work; it is needed to mark the MSXML control as safe.

Loading the XML

You have now initialised COM and created the MSXML object, which in turn lets you use the functionality it supplies. I am now going to load a very basic XML file. It looks like the following:

szXMLFile is the name of the XML file you want to load. Please note that this could easily be a file that resides on a Web server; thus, you could specify a file URL. bSuccess will contain true if all is well.

Now, before moving on to the next bit of code, I need to introduce you to a useful function I wrote:

void CCEXML::DisplayChildren( IXMLDOMElement* pParent )

This function is going to traverse an element/node recursively. It looks like this:
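
As a rough, portable sketch of this pattern (not the MSXML listing itself), the recursion can live in a base class while the per-element work is left to a pure virtual DisplayChild hook. The Element struct, the TreeWalker and NameCollector classes, and the choice to visit each child before descending into it are assumptions made for illustration; the real function takes an IXMLDOMElement* instead.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for IXMLDOMElement.
struct Element {
    std::string name;
    std::vector<Element> children;
};

// Base class: owns the recursion and leaves the per-element work
// to DisplayChild, which a derived class must implement.
class TreeWalker {
public:
    virtual ~TreeWalker() {}

    // Visit every element below 'parent', depth first.
    void DisplayChildren(const Element& parent) {
        for (const Element& child : parent.children) {
            DisplayChild(child);     // hook supplied by the derived class
            DisplayChildren(child);  // then recurse into the child's children
        }
    }

protected:
    virtual void DisplayChild(const Element& element) = 0;
};

// Example derived class: records element names in visiting order.
class NameCollector : public TreeWalker {
public:
    std::vector<std::string> names;

protected:
    void DisplayChild(const Element& element) override {
        names.push_back(element.name);
    }
};
```

Keeping the traversal in the base class means a derived class only has to decide what to do with one element at a time.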

Manipulating the XML

Up to now, you have created the functionality to traverse the whole XML document, but what about manipulating the document? You don't just want to traverse the document. You actually want to do something with it!

Remember the DisplayChild function I mentioned briefly earlier? For those who have forgotten, it is used within the DisplayChildren function and it is a pure virtual function that needs to be implemented. This function does the manipulation; hence the use of a pure virtual function that allows the user to do what they want with the element passed to it. The following DisplayChild function is a demonstration for the XML I showed you earlier in this article. Because the XML was about customers, I have created a class called CCustomerXML that derives from your main XML class. This is how I manipulate the XML file:

The first check is to see whether this node is actually called "customer." If it is, you start to manipulate the XML. You use another helper function, called ParseXML, here. The function is shown below:

Briefly, what this function does is return the attributes associated with the node as one comma-separated string. In the example, it would return OnLineGolf,OLG. The code then uses another helper function, "GetAttributes," which returns each token within this string. Thus, it would first return "OnLineGolf" and then next time around "OLG."
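
The tokenising step can be sketched in standard C++ as follows. SplitAttributes is a hypothetical helper that returns all tokens at once, whereas GetAttributes as described above hands back one token per call; the splitting logic is the same idea either way.

```cpp
#include <string>
#include <vector>

// Split a comma-separated attribute string into its individual tokens.
// Hypothetical helper for illustration only.
std::vector<std::string> SplitAttributes(const std::string& joined) {
    std::vector<std::string> tokens;
    if (joined.empty()) {
        return tokens;  // nothing to split
    }
    std::string::size_type start = 0;
    for (;;) {
        const std::string::size_type comma = joined.find(',', start);
        if (comma == std::string::npos) {
            tokens.push_back(joined.substr(start));  // last token
            break;
        }
        tokens.push_back(joined.substr(start, comma - start));
        start = comma + 1;  // continue after the comma
    }
    return tokens;
}
```

One limitation of any comma-separated scheme is worth keeping in mind: an attribute value that itself contains a comma would be split into two tokens.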

In my example function, I have a switch statement to know which attribute of the element I am dealing with. Remember, this function is written for the XML supplied, so I know the attributes are "name" and "tag." If I added another attribute to the XML, I would just have another case statement for it. In my example function, I just display the element attribute to the screen, but in a real app, you would store them someplace or do something with them.
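
A minimal sketch of that switch, assuming the tokens arrive in the order "name" then "tag", might look like this. The Customer struct, the AttributeIndex enum, and the ToCustomer function are hypothetical; here the values are stored in a struct instead of being displayed.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record holding the two attributes of a customer element.
struct Customer {
    std::string name;
    std::string tag;
};

// Position of each attribute within the token list (assumed order).
enum AttributeIndex { kName = 0, kTag = 1 };

// Map the attribute tokens onto a Customer, one case per attribute.
Customer ToCustomer(const std::vector<std::string>& tokens) {
    Customer customer;
    for (std::size_t i = 0; i < tokens.size(); ++i) {
        switch (i) {
        case kName:
            customer.name = tokens[i];
            break;
        case kTag:
            customer.tag = tokens[i];
            break;
        default:
            break;  // an attribute this code does not know about: ignored
        }
    }
    return customer;
}
```

Supporting a new attribute then means adding one enum value, one struct member, and one case.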

Fitting It All Together

I have supplied a base class in this article for download. To use it, you will need to derive another class from it and, as mentioned, implement the DisplayChild function. The example I have given in this article should give you a start on how to do this. The base class is called CCEXML. A simple derived class would look like the following:

There are a couple of other functions within here that I have not mentioned; these are DASetAttribute and DAAddChild. What these functions do is pretty self-explanatory. The former sets a node's attribute to a given value, while the latter adds an element to a given node. You should study these functions; they are very useful! Here is a quick example of adding a node and creating an XML file:

You can see that the DAAddChild and DASetAttribute functions are very useful and easy to use.
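
To illustrate the idea behind these two helpers without MSXML, here is a portable sketch that builds a small tree and serialises it to a string. XmlElement, AddChild, SetAttribute, and ToXml are hypothetical names and do not reproduce the signatures of DAAddChild or DASetAttribute; the "customers" root name is also an assumption.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical in-memory element; the article's helpers work on MSXML nodes.
struct XmlElement {
    std::string name;
    std::map<std::string, std::string> attributes;
    std::vector<XmlElement> children;
};

// Append a new child element and return a reference to it.
// The reference is invalidated by the next AddChild on the same parent.
XmlElement& AddChild(XmlElement& parent, const std::string& name) {
    parent.children.push_back(XmlElement());
    parent.children.back().name = name;
    return parent.children.back();
}

// Set (or overwrite) one attribute on an element.
void SetAttribute(XmlElement& element, const std::string& key,
                  const std::string& value) {
    element.attributes[key] = value;
}

// Serialise the element and its descendants on a single line.
// Values are written verbatim: no escaping of &, <, or quotes is attempted.
std::string ToXml(const XmlElement& element) {
    std::string out = "<" + element.name;
    for (const auto& attr : element.attributes) {
        out += " " + attr.first + "=\"" + attr.second + "\"";
    }
    if (element.children.empty()) {
        return out + "/>";
    }
    out += ">";
    for (const XmlElement& child : element.children) {
        out += ToXml(child);
    }
    return out + "</" + element.name + ">";
}
```

With MSXML the serialisation step is not something you write yourself; the document object saves the tree for you.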

Summary

I have only scratched the surface of parsing XML with MSXML 3.0 on WinCE. There is a lot more you can do with MSXML than I've demonstrated. For example, you may want to use XPath expressions to select the nodes you are interested in. This is a great way to cut down on manual traversal: let XPath do the work for you. You may want to insert nodes into the DOM tree or transform a node; the list goes on. The best advice here is to look at the help file that comes with MSXML; it is a great reference with examples of how to do this type of thing.

About the Author

Steve Green

I'm a Software Engineer with a company that writes diagnostic applications for vehicles.
In my spare time (what I have of it!) I love playing golf and football and spending as much time as I can with my lovely baby son David.
