Automated testsuite cleanup does not know about random file A.sh, and does not clean it

Test Z fails because A.sh, which has no influence on any other test, happens to break this particular one

These issues are incredibly hard to debug and to follow, because test A could have been written 3 years ago by people who left 2.5 years ago. I wonder, how would you implement a robust and also performant check of multiple files?

I have the following workflow in mind:

Before test: Save state of all files in the watched directory. Note that the number of files we truly care about is ~30 in around 2-3 directories.

During test: Test modifies them however it wants.

After test: The testsuite checks the state of all configurations. If the state matches beforeTest state, we move on. If not, we put it back to its original state.
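The three steps above could be sketched in Java roughly as follows. This is only a minimal sketch under stated assumptions: the class name `ConfigSnapshot` and its methods are hypothetical, and it assumes the ~30 watched files are small enough to hold in memory. It uses only `java.nio.file`, so it should behave the same on Linux, Windows, and Solaris.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper: remembers file contents before a test and restores them afterwards. */
final class ConfigSnapshot {
    private final Path root;
    // Relative path -> original content (assumes the watched files are small).
    private final Map<Path, byte[]> original = new HashMap<>();

    private ConfigSnapshot(Path root) {
        this.root = root;
    }

    /** Before test: read every regular file under root into memory. */
    static ConfigSnapshot take(Path root) throws IOException {
        ConfigSnapshot snapshot = new ConfigSnapshot(root);
        for (Path file : regularFiles(root)) {
            snapshot.original.put(root.relativize(file), Files.readAllBytes(file));
        }
        return snapshot;
    }

    /** After test: undo changes and return how many files had to be fixed. */
    int restore() throws IOException {
        int fixed = 0;
        // Delete files the test created that the snapshot does not know (the A.sh case).
        for (Path file : regularFiles(root)) {
            if (!original.containsKey(root.relativize(file))) {
                Files.delete(file);
                fixed++;
            }
        }
        // Rewrite only the files that were modified or deleted.
        for (Map.Entry<Path, byte[]> entry : original.entrySet()) {
            Path file = root.resolve(entry.getKey());
            boolean unchanged = Files.isRegularFile(file)
                    && Arrays.equals(Files.readAllBytes(file), entry.getValue());
            if (!unchanged) {
                Files.createDirectories(file.getParent());
                Files.write(file, entry.getValue());
                fixed++;
            }
        }
        return fixed;
    }

    private static List<Path> regularFiles(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths.filter(Files::isRegularFile).collect(Collectors.toList());
        }
    }
}
```

Calling `ConfigSnapshot.take(dir)` before a test and `restore()` after it writes only the files that actually differ. A non-zero return value could also be logged as a hint that the test did not clean up after itself. The sketch ignores file permissions and timestamps, does not remove empty directories a test created, and does not handle a file being replaced by a directory of the same name.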

In my mind, this is similar to:

git add . && git commit -m "Original state"

Test modifies whatever it needs to

git reset --hard && git clean -fd

I am almost tempted to literally use git for this, i.e. initialize repository with original state and then simply reset the state. Is there a better way?

As it stands, we currently copy all the config files before testsuite (the original state) and after each test, we copy them back (restore original state), regardless of which configs changed. This suffers from:

Being slow because all the configs are written onto the disk regardless of which files were used

Mainly, if there is a new file that we did not account for, because it is a custom file that does not come with the server, the testsuite does not know to remove the file.

Is there any elegant solution to this problem that you'd recommend? Is using git something that would make you track me down 5 years later when I'm not at the company anymore, with an axe to grind?

Other possible solutions:

Spin up a small DB where I'd save all the files with hashsums at the beginning of a testsuite and check all files against the DB?

If the hashsums match, we don't worry about it.

If the hashsum is not found in the DB but the name is, restore it

Else, delete it
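The decision rules above could be expressed without a real database, for example with a plain in-memory map from file name to digest. The following is a sketch only: `HashCheck`, `Action`, `digest`, and `classify` are hypothetical names, and SHA-256 is an assumed choice of hash.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;
import java.util.Map;

/** Hypothetical helper: decides what to do with one file found after a test. */
final class HashCheck {
    enum Action { KEEP, RESTORE, DELETE }

    /** SHA-256 of the file content, Base64-encoded so it can be stored as a map value. */
    static String digest(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest sha = MessageDigest.getInstance("SHA-256");
        return Base64.getEncoder().encodeToString(sha.digest(Files.readAllBytes(file)));
    }

    /**
     * baseline maps file name -> digest recorded at the start of the testsuite.
     * Same digest: keep. Known name, different digest: restore. Unknown name: delete.
     */
    static Action classify(Map<String, String> baseline, String name, String currentDigest) {
        String expected = baseline.get(name);
        if (expected == null) {
            return Action.DELETE;
        }
        return expected.equals(currentDigest) ? Action.KEEP : Action.RESTORE;
    }
}
```

Two gaps are worth noting. A file that a test deleted never shows up when walking the directory, so the baseline itself also has to be iterated to find missing names. And a digest alone cannot restore anything, so the original content still has to be kept somewhere.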

My worry is how performant this would be, though with an in-memory DB it could be OK. My other worry is that this could become the most fragile part of the testsuite. After all, tests should be as independent as possible.

Note that we use one testsuite for up to 5 versions of our software, so the set of ~30 files and ~2 directories differs, and hence has to be somehow cached/remembered at the beginning of each firing of the testsuite.

Constraints

We are talking about testsuite only. I cannot make changes to how many configuration files we have to watch

We are talking about running the testsuite on multiple architectures; the solution at the very least has to be available for Linux, Windows, and Solaris (hence Java)

Using a different language from Java is OK, but it has to be very well reasoned. I don't mind writing the solution as a separate small C binary and cross-compiling it, but this adds a large amount of complexity and as such is not a decision to be taken lightly.

Is there any elegant solution to this problem that you'd recommend? -- Yes. Write more unit tests and fewer integration tests. What's that? You say you'd have to change your code to make that happen? Exactly. You'd have to change it to be more testable.
– Robert Harvey♦ Aug 14 '18 at 20:37

The solutions you are considering are a band aid – a quick fix that hides the problem but doesn't solve it. This might be a good tradeoff. Or it might end up making any problems even more difficult to debug. The only long-term solution is to fix your tests to never touch files outside of a temp-dir that can be wiped after each run.
– amon Aug 14 '18 at 20:48

I would not make the process dependent on a specific VCS like git. Keep your input files in some directory where they stay untouched. When you expect the test to modify them, copy them into a working directory before the test starts (and clean the working directory beforehand), then run the test on that working directory. When the test is complete, leave the working directory where it is, giving you a chance to inspect it in case the test failed. For each new test, use a different working directory.
– Doc Brown Aug 14 '18 at 20:49
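A minimal Java sketch of the working-directory approach from the comment above, assuming the pristine inputs live in one directory tree. The names `WorkingDirs` and `freshCopy` are hypothetical, and the system temp directory is an assumed location for the per-test copies.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper: gives every test its own copy of the untouched input files. */
final class WorkingDirs {
    /** Copies the pristine tree into a new, uniquely named directory and returns it. */
    static Path freshCopy(Path pristine, String testName) throws IOException {
        // A newly created directory is empty, so no cleaning is needed beforehand.
        Path work = Files.createTempDirectory(testName + "-");
        List<Path> sources;
        try (Stream<Path> paths = Files.walk(pristine)) {
            // Files.walk visits a directory before its contents.
            sources = paths.collect(Collectors.toList());
        }
        for (Path source : sources) {
            Path target = work.resolve(pristine.relativize(source).toString());
            if (Files.isDirectory(source)) {
                Files.createDirectories(target);
            } else {
                Files.copy(source, target);
            }
        }
        // Deliberately not deleted afterwards, so a failed test can be inspected.
        return work;
    }
}
```

This only helps if the software under test can be pointed at the returned directory instead of its usual configuration location.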

@pydoge Then that is a bug in your tests. Consider adding a self-test like “assert that the working directory is empty after the tests complete”. If that check fails, mark the test run as failed and fix the bug. I'd rather not have a separate piece of software guess which files should be reverted to which state.
– amon Aug 14 '18 at 20:54
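The self-test suggested in the comment above might look something like this in Java. It is a sketch with hypothetical names (`LeftoverCheck`, `leftovers`, `assertClean`), and it assumes each test is expected to leave its working directory completely empty.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical self-test: fails the run when a test leaves files behind. */
final class LeftoverCheck {
    /** Names of the entries still present; an empty list means the test cleaned up. */
    static List<String> leftovers(Path workingDir) throws IOException {
        try (Stream<Path> entries = Files.list(workingDir)) {
            return entries.map(p -> p.getFileName().toString())
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    /** Throws if anything was left behind, naming the offending entries. */
    static void assertClean(Path workingDir) throws IOException {
        List<String> left = leftovers(workingDir);
        if (!left.isEmpty()) {
            throw new AssertionError("Test left files behind in " + workingDir + ": " + left);
        }
    }
}
```

Running `assertClean` after every test, rather than once at the end of the suite, points at the test that leaked the file instead of at whichever later test happened to trip over it.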

"I will not write a module complex enough to need testing that also touches the file system directly". Write that a thousand times on a chalk board and it should fix the problem.
– candied_orange Aug 15 '18 at 2:09