(Update: June 23, 2008: I've updated and improved on this technique in this blog post)

Introduction

Annotations can be used to facilitate transforms of an XML tree.

Some XML documents are "document centric with mixed content." With such documents, you don't necessarily know the shape of child nodes of an element. For instance, a node that contains text may look like this:

[xml]

<text>A phrase with <b>bold</b> and <i>italic</i> text.</text>

For any given <text> element, there may be any number of child <b> and <i> elements. This pattern extends to a number of other situations: for example, pages that can contain a variety of child elements, such as regular paragraphs, bulleted paragraphs, and bitmaps. Cells in a table may contain text, drop-down lists, or bitmaps.
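To make the "mixed content" shape concrete, here is a small sketch that lists the child nodes of the sample element. The class and method names (MixedContentSketch, DescribeChildren, Demo) are made up for this illustration; only the System.Xml.Linq calls are real API.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, for illustration only.
public static class MixedContentSketch
{
    // Describes each child node of an element. In mixed content, text runs
    // and child elements are interleaved in document order.
    public static List<string> DescribeChildren(XElement element)
    {
        return element.Nodes()
            .Select(n => n.NodeType + ": " + n.ToString(SaveOptions.DisableFormatting))
            .ToList();
    }

    // Runs DescribeChildren on the sample <text> element from the post.
    public static List<string> Demo()
    {
        XElement text = XElement.Parse(
            "<text>A phrase with <b>bold</b> and <i>italic</i> text.</text>");
        return DescribeChildren(text);
    }
}
```

For the sample element, Demo() returns five entries: three Text nodes with two Element nodes between them. Code that transforms such an element cannot assume a fixed sequence of children.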

If you want to transform elements in a tree where you don't necessarily know much about the children of those elements, then using annotations is an effective approach.

The summary of the approach is:

·First, annotate elements in the tree with a replacement element.

·Second, iterate through the entire tree, creating a new tree where you replace each element with its annotation.
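The two steps can be sketched in their simplest form as follows. This is a minimal illustration, not the full technique: each annotated element is replaced wholesale by a copy of its annotation, and the element names ("b", "strong") and the class name AnnotateAndReplaceSketch are assumptions chosen for the example.

```csharp
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, for illustration only.
public static class AnnotateAndReplaceSketch
{
    // Step 1: attach a replacement element to each <b> element as an annotation.
    // AddAnnotation does not change the tree, so iterating while annotating is safe.
    public static void Annotate(XElement root)
    {
        foreach (XElement b in root.Descendants("b"))
            b.AddAnnotation(new XElement("strong", b.Value));
    }

    // Step 2: build a new tree. An annotated element is replaced by a copy of
    // its annotation; every other element is copied with its children rebuilt.
    public static XElement Rebuild(XElement source)
    {
        XElement replacement = source.Annotation<XElement>();
        if (replacement != null)
            return new XElement(replacement);   // copy, so the annotation stays parentless

        return new XElement(source.Name,
            source.Attributes(),
            source.Nodes().Select(n =>
                n is XElement child ? Rebuild(child) : n));
    }
}
```

Applied to the sample <text> element, Annotate followed by Rebuild yields a new tree in which <b>bold</b> has become <strong>bold</strong>, while the source tree still contains the original <b> element.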

In detail, the approach consists of:

·Execute one or more LINQ to XML queries that return the set of elements that you want to transform from one shape to another. For each element returned by a query, add a new System.Xml.Linq.XElement object as an annotation to the element. This new element will replace the annotated element in the new, transformed tree. This is quite simple code to write, as demonstrated by the example.

·The new element that is added as an annotation can contain new child nodes; it can form a sub-tree with any desired shape.

·There is a special rule: a child element of the new element may be in a different namespace, one that is made up for this purpose (in this example, the namespace is http://www.microsoft.com/LinqToXmlTransform). Such a child element is not copied to the new tree. Instead, if its local name is ApplyTransforms, then the child nodes of the element in the source tree are iterated and copied to the new tree (with the exception that annotated child elements are themselves transformed according to these rules).

This is somewhat analogous to the specification of transforms in XSL. The query that selects a set of nodes is analogous to the XPath expression for a template. The code to create the new System.Xml.Linq.XElement that is saved as an annotation is analogous to the sequence constructor in XSL, and the ApplyTransforms element is analogous in function to the xsl:apply-templates element in XSL.
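The rules above, including the ApplyTransforms placeholder, might be implemented along these lines. This is a sketch under stated assumptions rather than the definitive code: the names ApplyTransformsSketch, Transform, TransformChildren, and ExpandAnnotation are invented here, and only the direct children of an annotation are checked for the special namespace.

```csharp
using System.Collections.Generic;
using System.Xml.Linq;

// Hypothetical helper, for illustration only.
public static class ApplyTransformsSketch
{
    // The made-up namespace named in the text, and its one special element.
    public static readonly XNamespace Xf = "http://www.microsoft.com/LinqToXmlTransform";
    public static readonly XName ApplyTransforms = Xf + "ApplyTransforms";

    // Builds the transformed copy of 'source'; the source tree is left untouched.
    public static XElement Transform(XElement source)
    {
        XElement anno = source.Annotation<XElement>();
        if (anno == null)
        {
            // Not annotated: keep name and attributes, transform the children.
            return new XElement(source.Name, source.Attributes(), TransformChildren(source));
        }
        // Annotated: the annotation supplies the name, attributes, and content.
        return new XElement(anno.Name, anno.Attributes(), ExpandAnnotation(anno, source));
    }

    // Copies the child nodes of 'source', transforming child elements recursively.
    static IEnumerable<XNode> TransformChildren(XElement source)
    {
        foreach (XNode n in source.Nodes())
        {
            XElement child = n as XElement;
            if (child != null)
                yield return Transform(child);
            else
                yield return n;   // text, comments, etc. are cloned when added
        }
    }

    // Yields the content of an annotation, expanding the ApplyTransforms placeholder.
    static IEnumerable<XNode> ExpandAnnotation(XElement anno, XElement source)
    {
        foreach (XNode n in anno.Nodes())
        {
            XElement e = n as XElement;
            if (e == null || e.Name.Namespace != Xf)
            {
                yield return n;   // ordinary content is copied as is
            }
            else if (e.Name == ApplyTransforms)
            {
                // Placeholder: substitute the transformed children of the source element.
                foreach (XNode t in TransformChildren(source))
                    yield return t;
            }
            // Any other element in the special namespace is not copied.
        }
    }
}
```

For example, annotating each <b> in the sample <text> element with new XElement("Bold", new XElement(ApplyTransformsSketch.ApplyTransforms)) and then calling Transform on the root produces <text>A phrase with <Bold>bold</Bold> and <i>italic</i> text.</text>. The placeholder plays the role that xsl:apply-templates plays inside an XSL template body.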

One advantage of this approach is that, as you formulate queries, you are always writing queries on the unmodified source tree. You need not worry about how modifications to the tree affect the queries that you are writing.
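A quick way to check this property is to compare the tree before and after annotating it. The sketch below assumes a hypothetical helper class, UnmodifiedSourceSketch, and uses "b" and "Bold" purely as example element names.

```csharp
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, for illustration only.
public static class UnmodifiedSourceSketch
{
    // Annotates every <b> element, then reports whether the tree itself changed.
    public static bool SourceUnchangedAfterAnnotating(XElement root)
    {
        XElement snapshot = new XElement(root);   // deep copy taken before annotating

        foreach (XElement b in root.Descendants("b"))
            b.AddAnnotation(new XElement("Bold"));

        // Annotations are stored beside the nodes, not in the tree, so the
        // structure is unchanged and no <Bold> element shows up in later queries.
        bool sameShape = XNode.DeepEquals(snapshot, root);
        bool noNewElements = !root.Descendants("Bold").Any();
        return sameShape && noNewElements;
    }
}
```

Because the method returns true for the sample document, a second query written after the first batch of annotations sees exactly the same elements as the first one did.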

I’m not exactly sure what you mean – how to retrieve just the nodes you want. The LINQ query that selects nodes can be quite detailed to select a very specific set of nodes to annotate. I’d need a bit more information before I could respond to your question.