15.23. Quantity and Selection of Test Cases

Applies to: NeuralTools 5.x and newer

I'm using NeuralTools for time series predictions. Should I allow a random % for testing? Is it preferable to use the tag feature to test on sequential data, maybe using 1995-2014 data in training and 2015 in testing? What is the recommended amount of data to test?

The answer depends on the type of neural net. GRN nets are used by default for numeric prediction, and for these nets it matters whether the test cases are selected at random or taken from the final time period. GRN nets interpolate from known data, and interpolating inside a gap, where known data lie on both sides, is easier than predicting beyond the end of the known data. Randomly selected test cases mostly fall in such gaps, so they can make the net look more accurate than it will be when forecasting future periods. Therefore we recommend tagging the final period for testing with GRN nets.
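The difference between the two testing schemes can be illustrated outside NeuralTools with a small Python sketch. It is not NeuralTools' GRN implementation: a simple Gaussian-kernel weighted average stands in for an interpolating predictor, and the trending series, the smoothing width sigma, and all function names are assumptions chosen only for illustration.

```python
import math
import random

def kernel_predict(train, x, sigma=2.0):
    """Gaussian-kernel weighted average of the training targets.
    An illustrative stand-in for an interpolating net, not NeuralTools' code."""
    weights = [math.exp(-((x - tx) ** 2) / (2 * sigma ** 2)) for tx, _ in train]
    total = sum(weights)
    return sum(w * ty for w, (_, ty) in zip(weights, train)) / total

def mean_abs_error(train, test):
    """Average absolute prediction error over the test cases."""
    return sum(abs(kernel_predict(train, x) - y) for x, y in test) / len(test)

# Hypothetical series: 100 periods with a steady upward trend.
series = [(float(t), 10.0 + 0.5 * t) for t in range(100)]

# Scheme A: hold out a random 20% of the periods (test cases sit in gaps).
rng = random.Random(0)
random_idx = set(rng.sample(range(100), 20))
random_train = [p for i, p in enumerate(series) if i not in random_idx]
random_test = [p for i, p in enumerate(series) if i in random_idx]

# Scheme B: hold out the final 20 periods (test cases lie beyond the known data).
final_train, final_test = series[:80], series[80:]

random_mae = mean_abs_error(random_train, random_test)
final_mae = mean_abs_error(final_train, final_test)
print(f"random holdout MAE:       {random_mae:.2f}")
print(f"final-period holdout MAE: {final_mae:.2f}")
```

In this sketch the random holdout reports the smaller error, because each test case has known neighbors on both sides. The final-period holdout is the one that resembles an actual forecast.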

MLF nets try to learn the underlying function rather than interpolate between known cases, so for these nets it's OK to use randomly selected cases for testing.

For the amount of testing data, 20% is a good rule of thumb. When using randomly selected cases, the Testing Sensitivity Analysis can help determine how many cases to set aside for testing.
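As a worked example of the 20% rule with sequential data, the following Python sketch computes how many of the most recent periods to tag for testing. The function name is hypothetical, and the tag values "train" and "test" are assumed here; use whatever values your tag variable expects.

```python
def tag_final_period(labels, test_fraction=0.2):
    """Tag the last test_fraction of the cases as test and the rest as train.
    The tag strings are assumptions for illustration."""
    n_test = max(1, round(len(labels) * test_fraction))
    split = len(labels) - n_test
    return [(label, "train" if i < split else "test")
            for i, label in enumerate(labels)]

# Annual data for 1995 through 2015, as in the question: 21 cases.
years = list(range(1995, 2016))
tagged = tag_final_period(years)

n_test = sum(1 for _, tag in tagged if tag == "test")
print(f"{n_test} of {len(tagged)} cases tagged for testing")
for year, tag in tagged[-6:]:
    print(year, tag)
```

With 21 annual cases, 20% rounds to 4 cases, so 2012 through 2015 are tagged for testing. Tagging 2015 alone would hold out less than 5% of the data.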