I want to go to there.

The bet revolved around a real-world use case (Paul and I both work at Benchmark Solutions, a stealth financial market data startup in NYC).

You can view the data structure at the Official Fairy-Wing Throwdown Repo™, https://github.com/flavorjones/fairy-wing-throwdown, but the summary is that it’s 54K when serialized as JSON, and consists (mostly) of an array of key-value stores (i.e., hashes).

Because I wanted to not just win, but to destroy Paul, I implemented the same parsing task using Nokogiri’s DOM parser, SAX parser, and Reader parser, expecting that code complexity and performance would correlate, somehow. In my mind, the graph looked like this:

But I was shocked and dismayed to see the real results:

What the WHAT?

Yes, that’s right. My payback for increasing the complexity of the code was a reduction in performance. The DOM parser was extremely way faster than either the Reader or SAX parsers.

Chart Notes

The “expected performance” line chart is in imaginary units.

The “actual performance” line chart renders performance in number of records processed per second, so bigger is better. The Saikuro and Flog scores were normalized on their values for #transform_via_dom.

The “DOM parser on various platforms” bar chart renders total benchmark runtime, so smaller is better.
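
For anyone wondering what "normalized on their values for #transform_via_dom" means in practice, here is a rough sketch: each score is divided by the DOM method's score for the same metric, so DOM always comes out as 1.0. The numbers and the `transform_via_sax` / `transform_via_reader` names below are invented for illustration and are not the measurements behind the chart.

```ruby
# Hypothetical scores, made up for this sketch; not the post's real data.
SCORES = {
  transform_via_dom:    { flog: 20.0, saikuro: 4.0 },
  transform_via_sax:    { flog: 50.0, saikuro: 10.0 },
  transform_via_reader: { flog: 40.0, saikuro: 8.0 },
}

# Divide every metric by the baseline method's value for that metric.
def normalize(scores, baseline)
  base = scores.fetch(baseline)
  scores.transform_values do |metrics|
    metrics.map { |metric, value| [metric, value / base.fetch(metric)] }.to_h
  end
end

NORMALIZED = normalize(SCORES, :transform_via_dom)
```

With these made-up inputs, a normalized Flog score of 2.5 for the SAX method would read as "two and a half times as complex as the DOM version".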

6 comments:

I'm not too sure that using Nokogiri's parser in ActiveSupport would be such a good idea. Sure, performance is important, but there are other considerations also. I have a project that includes the Nokogiri gem, which takes about 10 times as long to load on startup as it does without Nokogiri.

I wasn't suggesting adding Nokogiri to ActiveSupport. I was suggesting that someone build a gem that implemented #from_xml using Nokogiri, and that could be used in place of those ActiveSupport methods.