> > What I want to do with this is check that my code still produces the same results given the same data.
>
> In cases where there are many valid answers, you need to check that the result is correct, not whether it is the same as previously.

I'm sorry. I don't follow. Are you talking about some sort of non-deterministic logic? In all my code, given all the same inputs, the output will be the same (with some caveats).

Some things I don't stub out are generated ids and timestamps. For those, I put an 'ignore' token in my expected output, though I try to avoid doing this. Once a generated id has been retrieved, I can use it in later steps; that is, I don't check it on creates, but I do check it on updates. I don't validate the content of stack traces: if the error message is the same, that's good enough.
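The 'ignore' token idea can be sketched as a line-by-line comparison that skips any expected line containing a placeholder. This is only an illustration, not the framework the author actually uses; the token text and the class and method names here are made up.

```java
import java.util.List;

// Minimal sketch of golden-master comparison with an 'ignore' token:
// any expected line containing the token (a generated id, a timestamp)
// is skipped instead of compared. Names and token are hypothetical.
public class GoldenCompare {
    static final String IGNORE = "#ignore#";

    static boolean matches(List<String> expected, List<String> actual) {
        if (expected.size() != actual.size()) return false;
        for (int i = 0; i < expected.size(); i++) {
            String exp = expected.get(i);
            if (exp.contains(IGNORE)) continue;   // generated id, timestamp, etc.
            if (!exp.equals(actual.get(i))) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> expected = List.of("status=created", "id=#ignore#", "name=widget");
        List<String> actual   = List.of("status=created", "id=42137",    "name=widget");
        System.out.println(matches(expected, actual));  // true: the id line is skipped
    }
}
```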

The whole point of this approach is to avoid having to write logic to check that the answer is correct. There are two ways to do that: reproduce the logic from the code in the test, or come up with different logic that arrives at the same result. The problem with the former is that it doesn't really prove anything other than that the same code does the same thing, and the problem with the latter is that you end up with two sets of code to maintain.

If you can structure the test such that it will always produce the same output (with a few exceptions) you avoid both of these issues. The tests are strictly validating the results.

> > If you obsessively do TDD, you write tests for code that you are pretty much guaranteed to throw away.
>
> Dingoes Kidneys! Mastodon Tonsils!
>
> Now, maybe I just write better tests than others. Here's a hint: I write them first.

On the FitNesse site, I find this quote:

"If you use FitNesse without using JUnit or NUnit to test-drive and refactor your code, you may end up with tremendous business value, but rotten, buggy code."

> > In cases where there are many valid answers, you need to check that the result is correct, not whether it is the same as previously.
>
> I'm sorry. I don't follow. Are you talking about some sort of non-deterministic logic? In all my code, given all the same inputs, the output will be the same (with some caveats).

A simple example is that the order in which a hash table is iterated is usually not part of its contract --- any order is legitimate. In the case of identity hash maps, the order can depend on the address at which an object is first allocated or whatever else is used to determine identity hash codes. These values can change depending on factors which may not be controlled by a test framework.
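One common way around this, taken up later in the thread, is to sort before comparing. A minimal sketch: the IdentityHashMap below iterates in an order that depends on identity hash codes, but sorting its keys yields output that is stable across runs. The class and helper names are made up for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Sketch: IdentityHashMap iteration order depends on identity hash
// codes, which may vary between runs. Sorting the keys before emitting
// output restores determinism for comparison-based tests.
public class DeterministicOrder {
    static List<String> sortedKeys(Map<String, Integer> map) {
        List<String> keys = new ArrayList<>(map.keySet());
        Collections.sort(keys);   // stable regardless of the map's internal order
        return keys;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new IdentityHashMap<>();
        map.put(new String("b"), 2);  // new String(...) forces distinct identities
        map.put(new String("a"), 1);
        map.put(new String("c"), 3);
        // map.keySet() iteration order is unspecified here; sortedKeys is not.
        System.out.println(sortedKeys(map));  // [a, b, c]
    }
}
```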

Now a problem such as loading items into containers subject to capacity can often have many solutions. Which one is found can depend on the order in which the problem data is presented. Run the test under a debugger and the answer found can easily change.

> A simple example is that the order in which a hash table is iterated is usually not part of its contract --- any order is legitimate. In the case of identity hash maps, the order can depend on the address at which an object is first allocated or whatever else is used to determine identity hash codes. These values can change depending on factors which may not be controlled by a test framework.

Ah, in those cases I either make the order deterministic at the source or in the test. For example, I have used a recursive XML sorter to deal with scenarios where order is not important.
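The recursive-sorter idea can be sketched on a tiny tree structure: sort each node's children bottom-up, so two logically equal trees serialize identically regardless of child order. The real case in the thread sorted XML elements; the Node class and method names here are hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Sketch of a recursive sorter: children are sorted at every level,
// bottom-up, so the rendered form of a tree no longer depends on the
// order in which children were added. Hypothetical stand-in for XML.
public class RecursiveSorter {
    static class Node {
        final String name;
        final List<Node> children = new ArrayList<>();
        Node(String name) { this.name = name; }
        Node add(Node child) { children.add(child); return this; }
    }

    static void sortRecursively(Node node) {
        for (Node child : node.children) {
            sortRecursively(child);   // sort descendants first
        }
        node.children.sort(Comparator.comparing(RecursiveSorter::render));
    }

    static String render(Node node) {
        StringBuilder sb = new StringBuilder("<").append(node.name).append(">");
        for (Node child : node.children) {
            sb.append(render(child));
        }
        return sb.append("</").append(node.name).append(">").toString();
    }

    public static void main(String[] args) {
        Node first  = new Node("root").add(new Node("b")).add(new Node("a"));
        Node second = new Node("root").add(new Node("a")).add(new Node("b"));
        sortRecursively(first);
        sortRecursively(second);
        // Same logical content, different insertion order: equal after sorting.
        System.out.println(render(first).equals(render(second)));  // true
    }
}
```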

> Now a problem such as loading items into containers subject to capacity can often have many solutions. Which one is found can depend on the order in which the problem data is presented. Run the test under a debugger and the answer found can easily change.

The order of data is part of the input. If the order is presented the same way each time, you should always get the same result. I suppose that if you had randomization as part of the routine that might make it a little trickier. But this is exactly the kind of thing where I see mocking having a benefit. You could test with a random number generator that always presents the same sequence simply by seeding it with a constant value.
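Seeding the generator is exactly how `java.util.Random` behaves: a fixed seed yields a fixed sequence, so a "randomized" routine becomes reproducible for golden-master comparison. A minimal sketch (the class and method names are invented for the example):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Sketch: a Random constructed with a constant seed produces the
// same sequence on every run, so output that depends on it can be
// compared against a stored expected result.
public class SeededRandomDemo {
    static List<Integer> sequence(long seed, int count) {
        Random random = new Random(seed);  // fixed seed => fixed sequence
        List<Integer> values = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            values.add(random.nextInt(100));
        }
        return values;
    }

    public static void main(String[] args) {
        // Two generators built from the same seed yield identical sequences.
        System.out.println(sequence(42L, 5).equals(sequence(42L, 5)));  // true
    }
}
```

In production code the seed would come from the clock or the default constructor; the test injects the constant.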

Note that the point of this is not to prove the code correct. It's to validate that nothing changed other than what was intentionally changed.

Thanks, Cedric, and thanks for the link to your blog. Very interesting discussion.

I asked about unit testing private methods because I'm a big proponent of testing as many non-trivial methods as practicable.

Maybe it's a bit simplistic, but, to me, every non-trivial method is fundamentally the same - i.e., a unit of work that accepts input, interacts with other "things" (methods in the same and/or other classes), and generates output.

Unit testing helps me feel more confident that a unit of work does what I designed it to do and that I've considered edge case result sets and exceptions that might be thrown its way. To me, this is a valuable effort regardless of the method's modifier.

Please don't get me wrong - testing public API methods (through integration/system/functional tests) is certainly a worthwhile endeavor. But I'd propose that it doesn't obviate the need for unit testing as much of the code as we can - even private methods.

Al, I totally agree. As I said in my blog post, "If it can break, you need to test it".

I find the position of extreme testers/XP/Agile/etc... contradictory on this debate: they want aggressive unit testing but are traditionally opposed to testing private methods, saying "only test the public API".

Here's the contradiction:

Advocating for aggressive unit testing means that when a problem occurs, you want to find it as soon as possible. If all you have is functional tests, the bug that causes that test to fail might be deep under layers of code. If you have unit tests, these unit tests will fail earlier than functional tests and therefore, you will identify the faulty code faster.

However, arguing that public testing is better than private testing contradicts that very position: by restricting yourself to higher-level testing, you are making it harder for yourself to locate bugs.

> Ah, in those cases I either make the order deterministic at the source or in the test. For example, I have used a recursive XML sorter to deal with scenarios where order is not important.

Java's standard library has a LinkedHashMap which does preserve order, but it doesn't have a LinkedIdentityHashMap. I have quite a few IdentityHashMaps, so I have to accept iteration orders that change for reasons outside my control.

I also have algorithms that use random numbers, but in those cases I make provision to set the seed for testing purposes.

> Java's standard library has a LinkedHashMap which does preserve order, but it doesn't have a LinkedIdentityHashMap. I have quite a few IdentityHashMaps so I have to accept ordering that changes as a result of uncontrolled changes.

In this kind of situation, I sort the output such that if the data is the same, the output is exactly the same. That part has nothing to do with mocking; it's handled by the testing framework.

"If all you have is functional tests, the bug that causes that test to fail might be deep under layers of code. If you have unit tests, these unit tests will fail earlier than functional tests and therefore, you will identify the faulty code faster."

Not sure how TDD and "functional tests" got related here...

Using TDD means that all the code (including the "faulty" parts) has been added *in order to make one or more test pass*. Therefore these same tests will fail very quickly.

Please take this as my position; not "the position of extreme testers/XP/Agile/etc..". (Not sure I want to get into such arguments... ;-)

> "If all you have is functional tests, the bug that causes that test to fail might be deep under layers of code. If you have unit tests, these unit tests will fail earlier than functional tests and therefore, you will identify the faulty code faster."

That's true, but how much of a problem is it in practice? You make a change, run the functional tests, they fail - the problem must be in your latest change, no?

> > "If all you have is functional tests, the bug that causes that test to fail might be deep under layers of code. If you have unit tests, these unit tests will fail earlier than functional tests and therefore, you will identify the faulty code faster."
>
> That's true, but how much of a problem is it in practice? You make a change, run the functional tests, they fail - the problem must be in your latest change, no?

My take is that unit tests do often make it easier to find and fix issues. They don't prove that a system works, however. IMO, this makes them optional. Functional tests are not optional.

The question then becomes: if we are going to have comprehensive functional testing, does the benefit of creating unit tests outweigh the cost? I think the answer depends on what you are doing. The more the code is reused, the more value unit tests have.

> The question then becomes: if we are going to have comprehensive functional testing, does the benefit of creating unit tests outweigh the cost? I think the answer depends on what you are doing. The more the code is reused, the more value unit tests have.

Yes, unit tests make more sense for infrastructure/library code. Also, there are other factors: I would say unit tests add more value if you are using a dynamic language.

Code that works can still have bad internal structure. The "bugs" in that kind of code are not behavioral. Rather they are the impedance that the bad structure imposes on change. I think everyone will agree that if it takes months to make what ought to be a simple change, then the code is defective, even if it passes all acceptance tests.

> > Can someone explain to me how rotten buggy code can pass (comprehensive) functional acceptance tests?
>
> Code that works can still have bad internal structure. The "bugs" in that kind of code are not behavioral. Rather they are the impedance that the bad structure imposes on change. I think everyone will agree that if it takes months to make what ought to be a simple change, then the code is defective, even if it passes all acceptance tests.

Not to mention that combinatorial math pretty much guarantees no functional test suite will ever come close to the code coverage that unit tests can achieve.