XSH: Interactively Manipulate and Analyze XML Data : Page 2

Most developers use some kind of XSLT engine to pick out and process data from structured XML files. Learn how XSH, an open source command-line XML shell, lets you interactively query and manipulate this data without the coding overhead.

Basic Query Commands
As you may have surmised, the ls command in XSH displays a section of the XML tree. The ls command may remind you of the Unix ls command, which also displays a section of a tree: the directory tree of the filesystem. The similarity is not coincidental. The XSH navigation commands are modeled on Unix filesystem navigation commands (see Table 1).

Table 1. XSH Navigation Commands (Modeled on Unix Filesystem Navigation Commands)

Command | Description
cd | Change the current context node* (current position in the XML tree).
ls | List the XML of the current position (or a specified node).
pwd | Show the location of the current context node.

* A node is a tag, attribute, or textual content inside a tag.

Most of the time you will use the ls command in conjunction with an XPath expression for querying. The following is an example terminal session using the commands above. The commands you must type are the text that follows each XSH prompt (for example, after /books/book[4]>):

[wchao@excalibur xsh_article_sample_code]$ xsh
------------------------------------------------------------
xsh - XML Editing Shell version 2.0.2/0.12 (Revision: 2.2)
------------------------------------------------------------
Copyright (c) 2002 Petr Pajas.
This is free software, you may use it and distribute it under
either the GNU GPL Version 2, or under the Perl Artistic License.
Using terminal type: Term::ReadLine::Perl
Hint: Type `help' or `help | less' to get more help.
$scratch/> open "example1.xml"
parsing example1.xml
done.
/> cd /books/book[contains(title, "Wild")]
/books/book[4]> ls
<book>
<title>...</title>
<author>...</author>
<publisher>...</publisher>
<publication-date>...</publication-date>
<chapters>...</chapters>
</book>
Found 1 node(s).
/books/book[4]> pwd
/books/book[4]
/books/book[4]> ls chapters/chapter[title="Into the Primitive"]
<chapter>
<title>Into the Primitive</title>
<content>
"Old longings nomadic leap,
Chafing at custom's chain;
Again from its brumal sleep
Wakens the ferine strain."
Buck did not read the newspapers, or he would have known that
trouble was brewing, not alone for himself, but for every tide-
water dog, strong of muscle and with warm, long hair, from Puget
Sound to San Diego. Because men, groping in the Arctic darkness,
had found a yellow metal, and because steamship and transportation
companies were booming the find, thousands of men were rushing
into the Northland. These men wanted dogs, and the dogs they
wanted were heavy dogs, with strong muscles by which to toil, and
furry coats to protect them from the frost.
</content>
</chapter>
Found 1 node(s).
/books/book[4]>
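
For readers who want to compare the session above with ordinary code, here is a rough Python analogue of the same cd, pwd, and ls steps. It is only a sketch: it uses Python's standard xml.etree.ElementTree module rather than XSH, and the SAMPLE document is a hypothetical two-book stand-in for example1.xml, whose real contents are not shown in this article.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for example1.xml (the real file is not reproduced here).
SAMPLE = """<books>
  <book><title>Moby Dick</title><chapters/></book>
  <book>
    <title>The Call of the Wild</title>
    <chapters>
      <chapter><title>Into the Primitive</title><content>...</content></chapter>
      <chapter><title>The Law of Club and Fang</title><content>...</content></chapter>
    </chapters>
  </book>
</books>"""

root = ET.fromstring(SAMPLE)  # root is the <books> element

# Analogue of: cd /books/book[contains(title, "Wild")]
# ElementTree's XPath subset has no contains(), so the filter is done in Python.
books = root.findall("book")
context = next(b for b in books if "Wild" in b.findtext("title", ""))

# Analogue of: pwd (XPath positions are 1-based)
pwd = f"/books/book[{books.index(context) + 1}]"

# Analogue of: ls (one level deep, tag names only)
child_tags = [child.tag for child in context]

# Analogue of: ls chapters/chapter[title="Into the Primitive"]
matches = context.findall("chapters/chapter[title='Into the Primitive']")

print(pwd)
print(child_tags)
print(f"Found {len(matches)} node(s).")
```

The interactive shell gets the same result in three short commands, which is the point of the comparison: the script has to spell out the parsing, filtering, and position bookkeeping that XSH handles for you.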

Manipulating Information
One of the most useful features of XSH is the ability to change the XML. When XSH loads an XML file, it constructs an in-memory DOM tree that you can modify. Table 2 lists the commonly used manipulation commands.

Table 2. Commonly Used Manipulation Commands

Command | Description
copy | Copy one or more nodes from a source to a destination (both XPath). This copies each source node to the corresponding destination node, where "corresponding" means the destination node in the same position in the parameter list as the source node. This means if you copy nodes A and B before nodes C and D, A will go before C and B will go before D.
xcopy | Cross-copy nodes from a source to a destination. This differs from regular copy because it copies every source node to every destination node, resulting in x * y nodes if there are x source nodes and y destination nodes.
insert | Insert a new node of a given type. You must specify the type, which can be: element, attribute, text, cdata, comment, chunk, or entity_reference.
move | Move nodes from one place to another. This is the same as a copy followed by a remove.
rename | Rename a node.
map | Map an expression or short operation onto a list of nodes.
remove | Remove one or more nodes.
xinsert | Cross-insert nodes to one or more destination nodes. This is the "x" version of insert, analogous in operation to how xcopy differs from copy.
xmove | xcopy followed by remove.
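
The difference between copy and xcopy is easier to see with a small model. The following Python sketch imitates the pairing rules described in Table 2 using the standard xml.etree.ElementTree module. It is not XSH code: the helper names (insert_before, pairwise_copy, cross_copy) are hypothetical, only the "before" location is modeled, and stopping at the shorter list when the counts differ is an assumption, not documented XSH behavior.

```python
import copy
import xml.etree.ElementTree as ET

def insert_before(root, dest, node):
    """Insert node as the sibling immediately before dest."""
    # ElementTree keeps no parent pointers, so search for dest's parent.
    parent = next(p for p in root.iter() if any(c is dest for c in p))
    parent.insert(list(parent).index(dest), node)

def pairwise_copy(root, sources, dests):
    """Model of copy: the i-th source goes to the i-th destination."""
    for src, dst in zip(sources, dests):  # assumption: extra nodes are ignored
        insert_before(root, dst, copy.deepcopy(src))

def cross_copy(root, sources, dests):
    """Model of xcopy: every source goes to every destination."""
    for dst in dests:
        for src in sources:
            insert_before(root, dst, copy.deepcopy(src))

def fresh_tree():
    return ET.fromstring("<r><a/><b/><c/><d/></r>")

def pick(root, *tags):
    return [root.find(tag) for tag in tags]

# copy A and B before C and D: A lands before C, B lands before D.
root = fresh_tree()
pairwise_copy(root, pick(root, "a", "b"), pick(root, "c", "d"))
pairwise_tags = [child.tag for child in root]

# xcopy A and B before C and D: both A and B land before C and before D.
root = fresh_tree()
cross_copy(root, pick(root, "a", "b"), pick(root, "c", "d"))
cross_tags = [child.tag for child in root]

print(pairwise_tags)
print(cross_tags)
```

With two sources and two destinations, the pairwise version adds two nodes and the cross version adds four, which matches the x * y count given in the table.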

The copy, xcopy, move, xmove, insert, and xinsert commands have a location parameter that specifies where the source nodes go in relation to the destination nodes. Table 3 lists the possible choices for location.

Table 3. Possible Choices for Location Parameters

Location | Description
after | Place source nodes after the destination nodes. Most of the use cases are obvious. If both source and destination nodes are attributes, XSH attaches the source node to the parent element of the destination attribute. If the source node is not an attribute, but the destination node is an attribute, then the text of the source node is simply appended to the value of the destination attribute.
before | Place source nodes before the destination nodes. The behavior is analogous to the after location, except in the preceding position rather than the following position.
into | Place source nodes into the destination nodes. If the destination nodes are of type element, the source nodes become children of the element (unless the source node is of type attribute, in which case the source node becomes an attribute of the destination node). Otherwise, the value of the destination node gets set to the source node.
append | Append a source node to a destination node. If the destination node is of type element or document, then the source node is added as a child of the destination node. Otherwise, XSH appends the textual content of the source node to the content of the destination node.
prepend | Prepend a source node to a destination node. Same as append, except in the preceding position rather than the following position. For children, the source node becomes the first child and all the other children shift one position later.
replace | Replace the entire destination node with the source node, except when the destination node is an attribute, in which case only the value of the destination node (the textual content) is replaced with the textual content of the source node.
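
To make the location choices concrete, here is a minimal Python model of the six locations for the simplest case, where both the source and the destination are elements. It is a sketch built on xml.etree.ElementTree, not XSH itself: the place function is hypothetical, the attribute and text cases from Table 3 are not modeled, and treating "into" as "add as the last child" is an assumption, since the table does not say where the new child goes.

```python
import copy
import xml.etree.ElementTree as ET

def place(root, src, dst, location):
    """Copy element src relative to element dst (element-to-element case only)."""
    node = copy.deepcopy(src)
    if location in ("into", "append"):
        dst.append(node)            # assumption: "into" adds the node as the last child
    elif location == "prepend":
        dst.insert(0, node)         # new first child; existing children shift later
    else:
        # The remaining locations act on dst's parent, which must be searched for
        # because ElementTree keeps no parent pointers.
        parent = next(p for p in root.iter() if any(c is dst for c in p))
        idx = list(parent).index(dst)
        if location == "before":
            parent.insert(idx, node)
        elif location == "after":
            parent.insert(idx + 1, node)
        elif location == "replace":
            parent[idx] = node      # dst is detached from the tree
        else:
            raise ValueError(f"unknown location: {location}")
    return node

def tags(elem):
    return [child.tag for child in elem]

results = {}
for location in ("into", "append", "prepend", "before", "after", "replace"):
    root = ET.fromstring("<list><item><old/></item></list>")
    dst = root.find("item")
    place(root, ET.Element("new"), dst, location)
    # Record the children of <list> and the children of the original <item>.
    results[location] = (tags(root), tags(dst))

for location, shape in results.items():
    print(location, shape)
```

The printed pairs show which locations change the destination's children (into, append, prepend) and which change its siblings (before, after, replace).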

The insert command lets you insert a new node. Table 4 lists the node types and a description of each.

Generally speaking, you will mostly use and encounter nodes of type element, attribute, and text. Listing 1 shows examples of all of the commands in Table 2 (except move and xmove) in a single terminal session using XSH. The move and xmove commands are built on copy and remove, so they are self-explanatory. In the listing, yellow highlighting marks the commands you must type, and green highlighting marks the changes or lines of note.

Listing 1. Commonly Used Manipulation Commands (except move and xmove) in a Single Terminal Session Using XSH

Once you are done manipulating information, you may want to save your new XML tree. Use the save command. If you do not specify any parameters to the save command, XSH will overwrite your old XML file. If you want to save the XML tree to a new XML file, specify the --file parameter, like so:

save --file new_filename.xml

(Save the file now so that you can open it from a known state for the following section on Perl.)
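
The overwrite-versus-new-file behavior can be modeled in a few lines of Python. This is only an analogue using xml.etree.ElementTree. The save function below is hypothetical, its file argument merely mirrors the --file parameter described above, and the sample document and temporary directory are invented for the demonstration.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

def save(tree, opened_path, file=None):
    """Write tree to file if given, otherwise overwrite the path it was opened from."""
    target = file if file is not None else opened_path
    tree.write(target, encoding="utf-8", xml_declaration=True)
    return target

# Set up a throwaway "original" document (hypothetical content).
workdir = tempfile.mkdtemp()
original = os.path.join(workdir, "example1.xml")
with open(original, "w", encoding="utf-8") as fh:
    fh.write("<books><book><title>Old</title></book></books>")

# Load it, change the in-memory tree, and save to a new file.
tree = ET.parse(original)
tree.getroot().find("book/title").text = "New"
new_path = save(tree, original, file=os.path.join(workdir, "new_filename.xml"))

# The original on disk is untouched until save is called without a file argument.
with open(original, encoding="utf-8") as fh:
    original_untouched = "Old" in fh.read()
print(new_path, original_untouched)
```

Saving to a new file first, as this sketch does, keeps the original available in case the edits turn out to be wrong.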

Try different variations on the manipulation commands. The beauty of an interactive tool is that you can make changes and try different operations. The feedback is immediate, so you can quickly figure out how things work and equally quickly achieve the results you want on your data.
