Xponent's Mostly XML Blog

Using XmlSplit In A Batch Process

The XmlSplit Wizard has a Create Script feature that writes the PowerShell and Windows Script Host code for executing the
XmlSplit command-line utility. The generated code includes comments and all the user-specified command-line arguments.
The following is an example of the PowerShell code.

The above script splits the file C:\Large and Test XML files\books.xml into the C:\Large and Test XML files folder.
But what if you have a number of large XML documents each of which needs to be split? There are currently three ways
to split multiple files with XmlSplit:

1. Use the Wizard to create a separate script for each file. This is time consuming because each file must be selected
in the Files tab and its script created before moving on to the next file. An advantage of this strategy is the
flexibility to change the split method and/or any of the split options for each file. The separate scripts can later
be copied into a single PowerShell script.

2. Create a script for the first file, then copy and paste it for each additional file, changing the name of the file in
the /X argument (the name of the XML file to be split). This strategy will split each file using the same split method
and split options.

3. Create a script for the first file, then wrap the script in a ForEach loop. The loop iterates over the lines of a
text file that contains the name of each XML document to be split, one name per line. To illustrate, suppose we have
created such a list in a file named xmlfilelist.txt. The PowerShell script is the same as the one above, but with the
ForEach loop inserted, plus one important modification:
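A minimal sketch of such a loop follows. It is an illustration under stated assumptions, not the Wizard's generated output: the path to xmlsplit.exe, the location of xmlfilelist.txt, the helper name Get-SplitBaseName, and the space-separated form of the /X and /o arguments are all assumed. Copy the exact argument syntax, and every other argument, from your own Wizard-generated script.

```powershell
# Assumed locations (hypothetical); adjust to your installation and data.
$xmlSplitExe = 'C:\Program Files\XmlSplit\xmlsplit.exe'
$fileList    = 'C:\Large and Test XML files\xmlfilelist.txt'

# Hypothetical helper: returns the /o base file name for one XML document,
# i.e. the document's name followed by the string 'Split.xml'.
function Get-SplitBaseName {
    param([string]$XmlFile)
    return $XmlFile + 'Split.xml'
}

# Run only when both the executable and the file list exist.
if ((Test-Path -LiteralPath $xmlSplitExe) -and (Test-Path -LiteralPath $fileList)) {
    ForEach ($xmlFile in Get-Content -LiteralPath $fileList) {
        $xmlFile = $xmlFile.Trim()
        if ($xmlFile -eq '') { continue }   # skip blank lines in the list

        # Original fixed value, no longer used:
        #   /o 'C:\Large and Test XML files\test.xml'
        # Assumption: /X and /o each take their value as the next argument.
        # Add the remaining arguments from the Wizard-generated script here.
        $arguments = @('/X', $xmlFile, '/o', (Get-SplitBaseName $xmlFile))

        # Each iteration starts a new instance of xmlsplit.exe.
        & $xmlSplitExe @arguments
    }
}
```

With this sketch, a list entry of C:\data\books.xml would be passed to /o as C:\data\books.xmlSplit.xml.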

The modification referred to is the change to the /o argument. This argument identifies the base file name used by
XmlSplit for creating the split files. The original value for this argument, C:\Large and Test XML files\test.xml, has
been commented out and replaced with the name of the XML document plus the string 'Split.xml'. This was done to prevent
the split files created by one iteration of the loop from overwriting those created in the previous iteration.

To clarify, the original argument would create files in the C:\Large and Test XML files folder named test1.xml,
test2.xml, test3.xml and so on. Each iteration of the ForEach loop executes a new instance of xmlsplit.exe, so its file
counter is initialized to one each time, which creates identically named split files that overwrite those created by
the previous instance. To avoid this, a unique base file name is created using the name of the XML document. The string
'Split.xml' is inserted only to make it easy to identify the split files when viewed in the Windows File Explorer.
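The name collision can be shown without running XmlSplit at all. The sketch below only simulates the naming pattern described above (a counter inserted before the extension of the base file name, restarting at one for every run). The function Get-SimulatedSplitNames and the document names are hypothetical, and the real utility's naming may differ in detail.

```powershell
# Simulation only: mimics the described naming by inserting a counter
# (starting at 1) before the extension of the base file name.
function Get-SimulatedSplitNames {
    param([string]$BaseName, [int]$Count)
    $dot = $BaseName.LastIndexOf('.')
    if ($dot -lt 0) { $stem = $BaseName; $ext = '' }
    else { $stem = $BaseName.Substring(0, $dot); $ext = $BaseName.Substring($dot) }
    1..$Count | ForEach-Object { '{0}{1}{2}' -f $stem, $_, $ext }
}

$documents = 'books.xml', 'orders.xml'   # hypothetical document names

# Fixed base name: every iteration yields the same names, so files collide.
$fixed = foreach ($doc in $documents) { Get-SimulatedSplitNames 'test.xml' 2 }

# Per-document base name: the names stay unique across iterations.
$unique = foreach ($doc in $documents) { Get-SimulatedSplitNames ($doc + 'Split.xml') 2 }

$fixed    # test1.xml, test2.xml, test1.xml, test2.xml
$unique   # books.xmlSplit1.xml, books.xmlSplit2.xml, orders.xmlSplit1.xml, orders.xmlSplit2.xml
```

The fixed base name yields only two distinct names across both iterations, while the per-document base name yields four.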