Overview

testR is a framework for tests generation from source code and their filtering based on C and R code coverage using the covr package. Testr uses the format of the popular testthat package for the tests it generates.

Unittests nicely complement the more broaded system testing by concentrating on a single element of the application (usually function) and testing it independently of the others. While this is not always technically possible (due to shared state, global variables, too complex arguments, etc.) many functions usually depend to a great extent only on their arguments. These small, finely targetted tests are ideal for fixing bugs because the failing unittest usually locates precisely the error in the application.

However, creating unittests by hand is time consuming and error prone because of lots of code repetition involved in sending the same, or similar arguments to the functions with only minimal variation over the different tests. testR offers a different approach - given the functions you want to create unit tests for, and a code that actually uses them (this may be vignettes, examples, more broader system tests, or even application code), testR executes the code and records the input arguments to the functions. From these inputs finely targetted tests can then be generated automatically without user intervention.

Common problem of such tests is that since testR captures every function invocation, there may be a lot of duplicities, or invocations that use the function in the same way (so having only one of tests from such group would suffice). TestR therefore offers an option to filter the generated tests to do precisely this - keep only those tests that actually test some yet untested part of the function.

This is accomplished using a so called code coverage. Code coverage analyzes which parts of the code have been executed and which have been not. testR inspects the code coverage of the package for existing tests, and then runs the newly generated tests one by one, inspecting the code coverage after each new test. Only tests which actually execute some previously unused part of the application are kept.

It is important to stress that testR does not actually create new tests from thin air, it just extracts the smallest possible test fragments from existing cod to be used as tests. Therefore the tests produced are only (a) as good as the the code they have been taken from and (b) the tests cannot determine whether the program behaves correctly, only whether it behaves consistently for a period of time.

Creating tests for the first time

For the purposes of this demonstration, we have created an example R package you can download from github using the following:

library(devtools)
install_github("reactorlabs/testgenthat")

This is a package with vignettes, examples, and some testthat tests that test the package in general, but you want to add more. You can use testR to run the existing code of the package (vignettes, examples and perhaps the tests as well) and generate fine grained tests for these captures. You can execute the following to generate the tests:

gen_from_package("testgenthat", verbose =T)

By setting verbose to TRUE, we have enabled verbose output so that we can see what is happening under the hood:

First, the package is built and installed. If you prefer to use already built version of the package, you can add the build = FALSE as an argument to the get_from_package function. The package is then reinstalled.

All functions from package will be decorated
Decorating 1 functions
Tracing function "isPrime" in package "testgenthat"

After the package is installed, testR starts tracking all functions in the package (there is only one, named isPrime).

Running vignettes ( 1 files)
Running examples ( 1 man files)

testR then runs the examples and vignettes of the package and if their code calls any of the package's functions, testR will record the input arguments and the result the function gave. If you want, you can run tests too - this would later generate the small grained tests from the exissting unit tests as well. While these will not help catch more bugs, they may help pinpointing what is wrong if the existing tests are rather large. If you want to execute existing tests as well, add optional argument include.tests = TRUE.

Generating tests to temp
Output: temp
<!-- Root: capture -->

After the code is executed, tests from it are generated into the temporary directory. The temporary directory will be deleted afterwards, but if you want to keep all the tests, you can pass the output argument to point to the folder when you want testr to dump the tests.

And finally, the tests are filtered against the existing ones (unless include.tests is TRUE) and each other and only those increasing the code coverage are kept. The final tests will be added to the testthat tests directory for the specified package, one file per function. If you want, you can skip the filtering by passing the filter = FALSE argument to the function.

Inspection of the tests directory of the package now reveals that a new file has been added (containing tests for function isPrime) with the captured test that increased the coverage:

Adding regressions

Unfortunately programs are not always correct from the very beginning. We can for instance quickly see that there is something wrong with our isPrime() function:

isPrime(2.3)
[1] TRUE

The problem is that the modulo operator used in isPrime function will work for fractional numbers too, just not in the way we have anticipated. We can open the isPrime.R file and quickly fix the function:

Now that the error has been fixed, we might want to create a test for it. genthat provides a function specifically for that purpose:

The gen_from_function function takes as argument the package for which test case should be generated and code that generates them. The code is expected to be a single function with no arguments and will be executed when the functions in the package are captured (by supplying the functions argument, only certain functions from the package may be captured). When the function finishes, we can inspect the generated testfile to see that a new test has indeed been added:

While the above described process is probably not that advantageous in the simple case, one can easily think of a more complex scenario where the automatic capture of the tests will be handier. Another possibility is to first generate the new tests and then inspect and correct the tests first, so that important clues to where to find the bug can be found.