7 Comments

> The full definition of correct behavior of code exists in the tests for that code.

Which is why we call them specs rather than tests.

> Any non-additive change to non-test code that causes no test failures is a valid change and does not reduce the correctness of the code.

No. There’s lots of untested code out there, and obviously you can delete it without causing test failures. You know that. Your law assumes that the test suite is comprehensive and expresses the desired functionality. I’m sure that’s in the spirit of what you’ve said, so perhaps I’m being nitpicky, but I think that that’s a key requirement and should be made explicit.

February 9, 2009 at 5:30 am

Adam Milligan says:

Pat, I obviously know about untested code. The point is that deleting or changing untested code, and therefore quite possibly deleting or changing desired behavior, doesn’t reduce the correctness of the code, but exposes a flaw, or flaws, in the test suite.

But it does reduce the correctness of the software. Feels like semantics to me, though.

I suppose the next corollary comes from your response as well… “Any such change that also breaks desired behavior exposes one or more flaws in the specification.” Fair?

February 9, 2009 at 5:55 am

Chad Woolley says:

> The point is that deleting or changing untested code, and therefore quite possibly deleting or changing desired behavior, doesn’t reduce the correctness of the code, but exposes a flaw, or flaws, in the test suite.

…exposes a flaw in your test *suites* (plural). This is the value of multilevel, multifaceted testing. It is folly to try to achieve 100% unit test coverage, but if you have good integration-, application-, and UI-level testing, you should always have SOME test that fails when you delete any required code.

Note that this is still different from (orthogonal to, if you like fancy words) test coverage. You can have 100% coverage for all your suites and still have no tests fail when you delete large swaths of required code.
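To make the coverage point concrete, here is a minimal sketch in plain Ruby. Everything in it is invented for illustration: the shipping rule, the method names, and the two "checks" that stand in for specs. The weak check executes every line of the method, so a line-coverage tool would report 100%, yet it keeps passing after the required surcharge line is deleted.

```ruby
# Hypothetical production code: the surcharge line is required behavior.
def shipping_cents(weight_kg)
  cents = weight_kg * 250
  cents += 500 if weight_kg > 20 # heavy-parcel surcharge
  cents
end

# The same method after someone deletes the surcharge line.
def shipping_cents_without_surcharge(weight_kg)
  cents = weight_kg * 250
  cents
end

# Weak check: runs every line of the method under test (full line coverage),
# but only asserts that some positive number comes back.
def weak_check(impl)
  impl.call(25) > 0
end

# Stronger check: pins down the surcharge itself (25 * 250 + 500 = 6750).
def strong_check(impl)
  impl.call(25) == 6750
end

original = method(:shipping_cents)
deleted  = method(:shipping_cents_without_surcharge)

puts "weak check, original:   #{weak_check(original)}"
puts "weak check, deleted:    #{weak_check(deleted)}"
puts "strong check, original: #{strong_check(original)}"
puts "strong check, deleted:  #{strong_check(deleted)}"
```

Only the stronger check notices the deletion. Coverage says which lines ran, not whether anything asserted on what they did.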

I’d also say these goals are harder to achieve the closer you get to view or UI code. For example, changing all the CSS colors would obviously reduce correctness of the app, but would be very difficult (waste of time) to test automatically. Randomly reordering peer view-level elements in the DOM could also result in incorrectness, but would be pointless (and probably counterproductive) to test automatically.

This is really just a devil’s advocate argument designed to beguile someone into developing a statistical model of aesthetics that can be expressed in metrics such as color scheme RGB values, amount of whitespace, and page layout. Then I can write “should_not be_fugly” and BDD my way to gorgeous CSS. :)