Database

Tough Apps: Combinatorial Testing

By Dennis Shasha, June 25, 2010

Complex problems sometimes require clever testing

Dennis Shasha is a professor of Computer Science at New York University. His latest puzzle book Puzzles for Programmers and Pros outlined some general approaches to programmatic puzzle solving. This column challenges you to find the solutions to the puzzles that lie at the core of some cool tough applications. He solicits your help in identifying such cool apps.

Long-time Dr. Dobb's readers surely remember Dr. Ecco's Omniheurist Corner, where Dennis Shasha posed puzzles that required computer solutions. Dennis is returning to Dr. Dobbs, but with a new twist on puzzles -- one that takes inspiration from the challenging apps that readers have faced and to turn those into puzzles.

If you have written an application -- or run across one -- that required extensive use of heuristics (because no algorithm would solve it) or that is otherwise algorithmically interesting, please send mail to Dennis at shasha@cs.nyu.edu. Put "Tough Apps" in the subject line. If Dennis likes it, he will contact you, try to understand what you've done, and create a puzzle explaining the algorithmic core. Please include a blurb about yourself and the problem.

In the 1990s, four researchers at Telcordia found a good way to test software. Instead of trying to achieve high code coverage (which might not be possible when one doesn't control the source), they proposed to test the external behavior of the software in a clever way. By "external behavior" I mean testing the interface and user-settable configuration parameters. For example, if testing the implementation of a language, one would test language constructs and optimization parameters.

Why does one need to be clever about such testing? The short answer is -- combinatorial explosion. A language will often allow a countably infinite number of statements. It's impossible to test them all. The good news is that they found that many software bugs occur because of an interaction between two factors (e.g., two language features or perhaps a language feature and a configuration parameter), so if they could test every combination of values of every pair of factors, they would often catch most if not all bugs. Because even a single factor may have an infinite number (or at least billions) of values (e.g., all floats), one has to discretize the test space. This may mean extensive sampling on boundary conditions and a few samples of large intermediate sets of values.

For example, suppose you are testing a new SQL engine. Optimization parameters include buffer space, query caching, replacement algorithm, and so on. Language features include the formation of the SELECT clause, number of tables, the WHERE clause, grouping, aggregates, and subqueries.

In my work, I like to lay out each factor in a line along with its values as you can see below:

Testing all these combinations would require 5 * 2 * 2 * 3 * 4 * 3 * 3 * 3 = 6,480 experiments. You may consider this to be too many, so you'd like to do far fewer but still satisfy the two factor condition, viz. for every pair of values in every pair of factors, some experiment tests this pair. The puzzle is to find such a small set of experiments.

On the left you see the experiment number followed by the values of each factor in each experiment. For example, experiment 5 has value 3 for factor F1, and 1 for all other factors. The two factor condition is achieved because for any pair of factors e.g. F1 and F4, every pair of values is found in at least one experiment. Experiment 1 has F1 = 3 and F4 = 3, experiment 2 has F1 = 1 and F4 = 3, experiment 3 has F1 = 2 and F4 = 2, experiment 4 repeats F1 = 3 and F4 = 3 and so on. A covering set for F1 and F4 would be experiments 11, 7, 2, 9, 3, 12, 5, 10, and 1.

End of Warm-Up

Here is the challenge for you: Can you build a program that will give you a set of experiments that satisfies the two factor condition for the SQL case?

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task.
However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

Video

This month's Dr. Dobb's Journal

This month,
Dr. Dobb's Journal is devoted to mobile programming. We introduce you to Apple's new Swift programming language, discuss the perils of being the third-most-popular mobile platform, revisit SQLite on Android
, and much more!