This week I needed to parse a massive XML file. In the corporate world, XML is the de facto standard for systems to communicate with each other. In C#, we have a few ways to parse XML, the easiest being LINQ to XML or creating a class and letting the XmlSerializer deserialize the XML into it. C# can also read XML using XPath, but in the .NET world it is far simpler to just use the aforementioned techniques. In Ruby, you just need to head to The Ruby Toolbox, search for XML, and be overwhelmed by the different libraries.

Look at that, nothing like some XML! Our requirement here is to pull out the billing address and display the name, street, city, state, zip and country. I’m going to use LINQ to XML in this example because it is the closest to what you would do in Ruby.

Looking at the code, I am quite happy with the result. We are obviously using LINQ to XML, so we need the two namespaces System.Linq and System.Xml.Linq. We first load the ‘purchase_order.xml’ file from disk and read it into a variable called root. Using LINQ, we query the root variable’s Elements and get all the ‘Address’ elements. Once we have all the ‘Address’ elements, we narrow them down to just the billing address by comparing the Type attribute to the string value ‘Billing’. With the billing address in hand, we can access the child elements we need to display the name, street, city, state, zip and country values of the address. The code is straightforward and easy to read. I have no complaints about this code, and yes, I know there is null handling code present.

Let’s compare this to what is available in Ruby. First we are going to need the Crack gem, so gem install that bad boy or place the following in your Gemfile:

gem 'crack'

Now let’s write the code:

Since we are using a third-party gem, we need to require the Crack library, and since we are using the XML parsing functionality we also need to require ‘crack/xml’. Now that we are set up with the library necessary to parse XML, we can start with our plumbing. We have a ‘purchase_order’ variable that holds the parsed XML document. The cool thing about Crack is that it parses your XML into a Hash, which means everything is available to you via simple key/value pairs. To get our billing address we use the Ruby find method, which takes a block and returns the first item for which the block is true. We put the billing address we searched for into the billing_address variable. At this point, we can use simple string interpolation to pull out the name, street, city, state, zip and country values. Last but not least, we use the puts method to print the details to the screen.
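As a minimal sketch of the find and interpolation steps described above: the Hash literal below is only a stand-in for what Crack might hand back after parsing, so the sketch runs without the gem or the XML file. The key names (‘PurchaseOrder’, ‘Address’, ‘Type’ and friends) and all of the address values are assumptions made up for illustration, not the contents of the real file.

```ruby
# Stand-in for the Hash a Crack parse of 'purchase_order.xml' might produce.
# The structure, key names and values are hypothetical placeholders.
purchase_order = {
  'PurchaseOrder' => {
    'Address' => [
      { 'Type' => 'Shipping', 'Name' => 'John Placeholder',
        'Street' => '2 Example Avenue', 'City' => 'Sampleville',
        'State' => 'CA', 'Zip' => '11111', 'Country' => 'USA' },
      { 'Type' => 'Billing', 'Name' => 'Jane Placeholder',
        'Street' => '1 Example Street', 'City' => 'Sampletown',
        'State' => 'NY', 'Zip' => '00000', 'Country' => 'USA' }
    ]
  }
}

# Everything is plain Hash and Array access from here on.
addresses = purchase_order['PurchaseOrder']['Address']

# find returns the first address for which the block is true.
billing_address = addresses.find { |address| address['Type'] == 'Billing' }

# String interpolation pulls the values straight out of the Hash.
details = "#{billing_address['Name']}, #{billing_address['Street']}, " \
          "#{billing_address['City']}, #{billing_address['State']} " \
          "#{billing_address['Zip']}, #{billing_address['Country']}"

puts details
```

Keep in mind that find returns nil when nothing matches, so code working on a real file would probably want to check billing_address before interpolating.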

Both languages make parsing XML quite simple. I like the Ruby way: you don’t need to mess about with attributes or elements or any other nonsense. You are given a Hash, and you iterate over or access the values as needed. So if you need to parse XML in C# or Ruby, or just needed a quick comparison, there you go.


My name is Deon Heyns and I am a developer learning things and documenting them in real time. Python, Ruby, Scala, .NET, and Groovy are all languages I have written code in. I appeared in the New York Post once. I host my code on GitHub and Bitbucket, so have a look at my code, fork it and send those pull requests.