26 June 2011

I used Boost Program_options for my research code's input parsing requirements. As expected, it works well. For floating point quantities, I've found having to specify 4.18879020478639 for 4π/3 is a bit cumbersome. When specifying arguments on the command line, I could use any one of a number of hacks to perform floating point arithmetic in bash. But that didn't help me when taking input from response files. I wanted an arithmetic expression utility that could process a C-like grammar. Long double accuracy was a requirement. I looked around a bit. GNU libmatheval and muParser appeared to be good fits but looked to be double precision-only. Though straightforward, I decided against combining the Shunting-yard algorithm with some in-place RPN logic because I wasn't enthusiastic about either finding or writing an appropriate (and admittedly toy simple) lexer.

Already having a Boost pre-requisite in my code, I decided to try out Boost Spirit since it could provide a header-only, precision-agnostic solution. Spirit has gone through a lot of significant revisions and many articles seemed to be written for the older Spirit Classic. The latest official reference manual gave several excellent examples that almost fit. I adapted Spirit's calc3.cpp and roman.cpp sample grammars to accomplish what I wanted. I added ** as an operator for exponentiation and pi as a built-in constant. I also added a smattering of special functions (e.g. exp). The task required about 150 lines of Spirit-based code. Further extensions should be fairly straightforward.
My sample code does the following:

Declares X-macros for the unary and binary functions built into the grammar.

Declares Boost Phoenix-ready functors for the unary and binary functions.

Declares a helper called parse to hide needing boost::spirit::ascii::space when performing whitespace-agnostic parsing.

Declares a small test macro called EXPRTEST to reduce test case boilerplate.

Runs a collection of tests to ensure the parser is behaving as expected.

Compared to the samples on which I built, the biggest changes to the grammar were adding primary and reworking factor. I could have shortened up some details with a few using boost::spirit::qi; declarations, but I wanted to keep track of exactly which parts of Spirit I used. Feedback and suggestions are most welcome.

Updated 9 Jan 2012: During a second pass motivated by reducing compilation times, I removed the X-macro nastiness and replaced it with some qi::symbols-based logic. The code in the anonymous namespace could likely be removed if I could figure out the right boost::phoenix::bind hoodoo to use to replace lazy_ufunc_ and lazy_bfunc_ (any suggestions?). I have not yet measured the difference in either memory footprint or runtime performance. Compilation time and memory is greatly reduced for both GNU and Intel compilers.

Updated 31 Mar 2013: I have moved the code into a gist to facilitate both maintenance and gathering feedback. I have added the constant e and slightly reordered the primary grammar per the comments.

Updated 17-19 Sept 2013: I have added an MPL2 header to the gist. In the comments is a suggestion for how to use standalone MPL2 logic building atop this grammar from my project Suzerain to incorporate such parsing into your own, possibly non-MPL2, source tree.