A pure Perl module to complement the pure Perl XML::TreePP module. XMLPath may be similar to XPath, and it does attempt to conform to the XPath standard when possible, but it is far from being fully XPath compliant.

Its purpose is to implement an XPath-like accessor methodology to nodes in an XML::TreePP parsed XML Document. In contrast, XPath is an accessor methodology to nodes in an unparsed (or raw) XML Document.

The advantage of using XML::TreePP::XMLPath over any other Perl implementation of XPath is that XML::TreePP::XMLPath is an accessor to XML::TreePP parsed XML Documents. If you are already using XML::TreePP to parse XML, you can use XML::TreePP::XMLPath to access nodes inside that parsed XML Document without having to convert it into a raw XML Document.

As an additional side benefit, any Perl HASH/ARRAY reference data structure can be accessed via the XPath accessor method provided by this module. It does not have to be a parsed XML structure. The last example in the SYNOPSIS illustrates this.

Note that attributes are specified in the XMLPath as @attribute_name, but after XML::TreePP::parse() parses the XML Document, the attribute name is identified as -attribute_name in the resulting parsed document. As of version 0.52 this can be changed using the set( attr_prefix => '@' ) method. It should only be changed if the XML Document is provided as already parsed, and the attributes are represented with a prefix other than the default. This document uses the default value of - in its examples.

XMLPath requires attributes to be specified as @attribute_name and takes care of the conversion from @ to - behind the scenes when accessing the XML::TreePP parsed XML document.
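The conversion can be pictured with a short sketch. This is not the module's own code: to_treepp_key is a hypothetical helper, and the hash is written by hand in the shape XML::TreePP produces with its default attr_prefix of '-' and its default '#text' key for element text.

```perl
use strict;
use warnings;

# Hand-written stand-in for what XML::TreePP returns for
# <cat color="black">tiger</cat> with the default attr_prefix of '-'.
my $cat = { '-color' => 'black', '#text' => 'tiger' };

# Hypothetical helper (not part of the module): map a name as written in
# an XMLPath to the key used in the parsed tree.
sub to_treepp_key {
    my ( $name, $attr_prefix ) = @_;
    $attr_prefix = '-' unless defined $attr_prefix;
    ( my $key = $name ) =~ s/^\@/$attr_prefix/;    # only a leading @ is replaced
    return $key;
}

my $key = to_treepp_key('@color');    # '-color'
print "$key = $cat->{$key}\n";        # prints "-color = black"
```

Names without a leading @ pass through unchanged, which is how child elements are told apart from attributes.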

Child elements on the next level of a parent element are accessible in the same way as attributes, written as attribute_name. This is the same format as @attribute_name except without the @ symbol. Specifying the name without an @ symbol identifies it as a child element of the parent element being evaluated.

Child element values are only accessible as CDATA. That is, when the element being evaluated is animal, the attribute (or child element) is cat, and the value of the attribute is tiger, the XML is presented as this:

<jungle>
<animal>
<cat>tiger</cat>
</animal>
</jungle>

The XMLPath used to access the key=value pair of cat=tiger for element animal would be as follows:

jungle/animal[cat='tiger']

And in version 0.52, in this second case, the above XMLPath is still valid:

<jungle>
<animal>
<cat color="black">tiger</cat>
</animal>
</jungle>
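One way to see why the same XMLPath holds for both documents is to look at how the two cat elements appear once parsed. The sketch below is an illustration only, not the module's implementation: the two hashes are hand-written in XML::TreePP's default layout, and child_text is a hypothetical helper.

```perl
use strict;
use warnings;

# Hand-written stand-ins for the two parsed documents (default XML::TreePP keys).
my $plain     = { jungle => { animal => { cat => 'tiger' } } };
my $with_attr = {
    jungle => { animal => { cat => { '-color' => 'black', '#text' => 'tiger' } } },
};

# Hypothetical helper: the text value of a child element, whether the child
# is a plain string or a hash carrying attributes plus a '#text' entry.
sub child_text {
    my ( $parent, $name ) = @_;
    my $child = $parent->{$name};
    return ref($child) eq 'HASH' ? $child->{'#text'} : $child;
}

for my $doc ( $plain, $with_attr ) {
    my $animal = $doc->{jungle}{animal};
    my $text   = child_text( $animal, 'cat' );

    # The filter [cat='tiger'] is satisfied in both cases.
    print "match\n" if defined $text && $text eq 'tiger';
}
```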

In version 0.52, the period (.) is supported as it is in XPath to represent the current context node. As such, the following XMLPaths would also be valid:

Note that in these previous two XMLPaths, the element cat is being evaluated, and not the element animal as in the first case. This is undesirable if you want to evaluate animal for results.

To perform the same evaluation, but return the matching animal node, the following XMLPath can be used:

jungle/animal[cat='tiger']

To evaluate animal and cat, but return the matching cat node, the following XMLPaths can be used:

jungle/animal[cat='tiger']/cat
jungle/animal/cat[.='tiger']

The first path analyzes animal, and the second path analyzes cat. But both match the same node, <cat color="black">tiger</cat>.

Matching attributes

Prior to version 0.52, attributes could only be used in XMLPath to evaluate an element for a result set. As of version 0.52, attributes can now be matched in XMLPath to return their values.

This module is an extension of the XML::TreePP module. As such, it uses that module in several of its methods to parse XML Documents, and when the user calls the set() and get() methods to set and get properties specific to XML::TreePP.

The XML::TreePP module, however, is only loaded into XML::TreePP::XMLPath when it becomes necessary to perform the previously described requests.

To avoid having this module load the XML::TreePP module, the caller must be sure to avoid the following:

1. Do not call the set() and get() methods to set or get properties specific to XML::TreePP. Doing so will cause this module to load XML::TreePP in order to set or get those properties. In turn, that loaded instance of XML::TreePP is used internally when needed in the future.

2. Do not pass in unparsed XML Documents. The caller would instead want to parse the XML Document with XML::TreePP::parse() before passing it in. Passing in an unparsed XML document causes this module to load XML::TreePP in order to parse it for processing.

Alternatively, if the caller has already loaded a copy of XML::TreePP, that instance can be assigned to this module's instance using this method. When XML::TreePP is then needed, the instance provided is used instead of loading another copy.

Additionally, if this module has loaded an instance of XML::TreePP, this instance can be directly accessed or retrieved through this method.

If you only want to get the internally loaded instance of XML::TreePP, without loading a new one, and have undef returned when no instance is loaded yet, then use the get() method.

An instance of XML::TreePP that this object should use, when needed, instead of loading its own copy. If not provided, the currently loaded instance is returned. If no instance is loaded, one is loaded and then returned.

returns

Returns the result of setting an instance of XML::TreePP in this object. Or returns the internally loaded instance of XML::TreePP. Or loads a new instance of XML::TreePP and returns it.

$tppx->tpp( XML::TreePP->new() ); # Sets the XML::TreePP instance to be used by this object
$tppx->tpp(); # Retrieve the currently loaded XML::TreePP instance

Parse a string that represents the XMLPath to an XML element or attribute in an XML::TreePP parsed XML Document.

Note that the XML attributes, known as "@attr", are transformed into "-attr". The leading minus (-) in place of the at sign (@) is the recognized format of attributes in the XML::TreePP module.

Being that this is intended to be a submodule of XML::TreePP, the format of '@attr' is converted to '-attr' to conform with how XML::TreePP handles attributes.

See: XML::TreePP->set( attr_prefix => '@' ); for more information. This module supports the default format, '-attr', of attributes. But as of version 0.52 this can be changed by setting this module's 'attr_prefix' property using the set() method in object oriented programming, for example $tppx->set( attr_prefix => '@' ).

XMLPath Filter by index and existence

Also, as of version 0.52, there are two additional types of XMLPaths understood.

XMLPath with indexes, which is similar to the way XPath does it

$path = '/books/book[5]';

This defines the fifth book in a list of book elements under the books root. When using this to get the value, the 5th book is returned. When using this to test an element, there must be 5 or more books to return true.
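The index semantics can be sketched against a hand-built tree. This is an illustration under stated assumptions, not the module's code: nth_child is a hypothetical helper, and the tree is written by hand in the shape XML::TreePP gives repeated elements (an array reference).

```perl
use strict;
use warnings;

# Hand-written stand-in: six <book> elements under <books>.
my $tree = { books => { book => [ map { +{ title => "Book $_" } } 1 .. 6 ] } };

# Hypothetical helper: 1-based position lookup, as in book[5].
# A single (non-repeated) child is treated as a list of one.
sub nth_child {
    my ( $parent, $name, $position ) = @_;
    my $child = $parent->{$name};
    my @list  = ref($child) eq 'ARRAY' ? @{$child}
              : defined $child         ? ($child)
              :                          ();
    return undef if $position < 1 || $position > @list;
    return $list[ $position - 1 ];
}

my $fifth = nth_child( $tree->{books}, 'book', 5 );
print $fifth->{title}, "\n";    # prints "Book 5"
```

With fewer than five books the helper returns undef, which mirrors the test case described above: the path is only true when a fifth book exists.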

XMLPath by existence, which is similar to the way XPath does it

$path = '/books/book[author]';

This XMLPath represents all book elements under the books root which have one or more author child elements. It does not check whether the named element or attribute has a value; it is only a test for the existence of that element or attribute.
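A rough picture of the existence test, again on a hand-written tree rather than through the module itself. The data and the variable names are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hand-written stand-in for a parsed document with three <book> elements.
my $tree = {
    books => {
        book => [
            { title => 'A', author => 'Ann' },
            { title => 'B' },                      # no author element
            { title => 'C', author => '' },        # author present but empty
        ],
    },
};

# Existence only: the key must be present; its value is not inspected.
my @with_author = grep { exists $_->{author} } @{ $tree->{books}{book} };

print join( ', ', map { $_->{title} } @with_author ), "\n";    # prints "A, C"
```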

Assemble an XMLPath from an ARRAY or HASH ref structure representing it. This method can be used to reassemble an XMLPath array ref that has been parsed by the parseXMLPath method.

Note that the XML attributes can be identified as "-attribute" or "@attribute". When identified as "-attribute", they are transformed into "@attribute" upon assembly. The leading minus (-) in place of the at sign (@) is the recognized format of attributes in the XML::TreePP module, though it can be changed. See the parseXMLPath method for further information.

This method was added in version 0.70.

parsed-XMLPath

The XML path to be assembled, represented as either an ARRAY or HASH reference.

This optional argument defines the format of the search results to be returned. The default structure is TargetRaw.

TargetRaw - Return references to xml document fragments matching the XMLPath filter. If the matching xml document fragment is a string, then the string is returned as a non-reference.

RootMap - Return a Map of the entire xml document, a result set (list) of the definitive XMLPath (mapped from the root) to the found targets, which includes: (1) a reference map from root (/) to all matching child nodes (2) a reference to the xml document from root (/) (3) a list of targets as absolute XMLPath strings for the matching child nodes

ParentMap - Return a Map of the parent nodes to found target nodes in the xml document, which includes: (1) a reference map from each parent node to all matching child nodes (2) a reference to xml document fragments from the parent nodes

An array reference of hash references of elements (not attributes) and each element's XMLSubTree, or undef if none are found. If the XMLPath points at a multi-valued element, then the subelements of each element at the XMLPath are returned as separate hash references in the returned array reference.

The format of the returning data is the same as the getAttributes() method.

The XMLSubTree is fetched based on the provided XMLPath. Then all elements found under that XMLPath are placed into a referenced hash table to be returned. If an element found has additional XML data under it, it is all returned just as it was provided.

Simply, this strips all XML attributes found at the XMLPath, returning the remaining elements found at that path.

If the XMLPath has no elements under it, then undef is returned instead.

Note that in each example the tokens represent a group of escaped characters which, when analyzed, will be collected as part of an element, but will not be allowed to match any starting or stopping boundary.

So if you have a start token without a stop token, you will get undesired results. This example demonstrates this data error.