Discussions

Michael Feathers has put together a document on unit tests that asserts that what many developers think are unit tests aren't.

Generally, unit tests are supposed to be small; they test a method or the interaction of a couple of methods. When you pull database, socket, or file system access into your unit tests, they aren't really about those methods any more; they are about the integration of your code with that other software. If you write code in a way which separates your logic from OS and vendor services, you not only get faster unit tests, you get a 'binary chop' that allows you to discover whether the problem is in your logic or in the things you are interfacing with.

He defines a set of criteria that suggest something isn't actually a unit test:

A test is not a unit test if:

It talks to the database

It communicates across the network

It touches the file system

It can't run at the same time as any of your other unit tests

You have to do special things to your environment (such as editing config files) to run it.

Tests that do these things aren't bad. Often they are worth writing, and they can be written in a unit test harness. However, it is important to be able to separate them from true unit tests so that we can keep a set of tests that we can run fast whenever we make our changes.
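Feathers' criteria above come down to keeping logic separable from the environment. As a minimal sketch (all class and method names invented here, and plain assertions standing in for a JUnit harness), parsing logic written against a Reader can be unit tested without ever touching the file system:

```java
// Hypothetical example: the parsing logic is separated from file access,
// so it can be tested per Feathers' criteria (no file system access).
import java.io.BufferedReader;
import java.io.StringReader;

class ConfigParser {
    // Pure logic: works on any Reader, no file system dependency.
    static int parsePort(BufferedReader in) throws Exception {
        String line;
        while ((line = in.readLine()) != null) {
            if (line.startsWith("port=")) {
                return Integer.parseInt(line.substring(5).trim());
            }
        }
        throw new IllegalArgumentException("no port setting found");
    }
}

public class ConfigParserTest {
    public static void main(String[] args) throws Exception {
        // A true unit test: feed the parser an in-memory string, not a file.
        BufferedReader in = new BufferedReader(
                new StringReader("host=db1\nport=5432\n"));
        int port = ConfigParser.parsePort(in);
        if (port != 5432) throw new AssertionError("expected 5432, got " + port);
        System.out.println("ok");
    }
}
```

Production code would hand the same method a FileReader; the test hands it a StringReader, so the identical logic runs with no environment dependency.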

I've found that unit tests are nearly worthless for webapps. I just do integration testing now (JWebUnit stuff) and I'm a lot more productive for it.

Well, either (1) your webapps have no (business) logic, or (2) you are not writing very unit-testable code.

I think the contribution is very good...too many books out there now professing to be your answer to your unit testing blues. All you really need is a good set of guidelines, of which those presented are an excellent start. I would add one more rule:

I agree with Cedric's point of view and would like to expand a little on it and the problems inherent in unit tests. These are a few points off the top of my head:

1. Unit tests treat compiled code as the only first class testable unit.

2. Unit tests do not hold a solution for composed structures.

3. The unit test "class as king" perspective is rigid and actually irrelevant.

This discussion is aimed at the unit test "purists" who actually sit down and compose small bulleted lists à la Michael Feathers defining what a unit test should be in their eyes. Answers are listed separately below.

Discussion: 1) Saying that other related data such as databases and configuration files is not relevant for unit testing is complete and utter bullshit. It should be tested to the maximum possible extent. Take for example a system that is centered around a tree structure of data stored in a database. The system may not even have a meaning without that data. This makes that data a first class citizen of the system, which implies that it should be tested along with the rest of the code. If this means running the tests in a deployed environment, so be it.

I make the sanity check of the configuration and startup state a part of the system to the maximum possible extent. This may mean database access, configuration file finding and parsing.

This does not of course imply that you shouldn't write a test that tests a single class where applicable, on the contrary.

2) This is a big problem for unit test purity. When you build, for example, an engine, you have rigid quality assurance of the parts that compose the engine. But the main testing is done with the engine assembled together. You may have a lot of places where you can hook in and check the status of the engine, but the workings of the engine are not tested in parts. This holds true for engine development and it holds true for software production. The tests should measure functionality, not classes. The class perspective is an artificial view imposed upon us by the currently existing languages. Today classes, tomorrow something else.

3) This means that some unit testing standards idiots that I have happened to stumble across measure their testing coverage by how many of the classes that exist have a corresponding test class. This is, of course, an irrelevant figure. It would be like measuring the quality of a real-world process by the number of documents describing it, not the output it produces.

The really useful measurement is of course how large a portion of the codebase is being exercised and found healthy. This is done much better with a code coverage tool used in conjunction with your tests. An ideal setup would also measure how much of any other related data and logic was exercised when the tests run.

If focus had been more on coverage tools and less on unit tests, we would have a much healthier system development process today.

1. Unit tests treat compiled code as the only first class testable unit.
2. Unit tests do not hold a solution for composed structures.
3. The unit test "class as king" perspective is rigid and actually irrelevant.

The "class as king" perspective comes from OOP design principles. The whole point of an object is that its interface describes the desired functionality of your system. So to test the functionality of your system, your testing would revolve around testing the operations offered by your object. One of JUnit's authors was the GoF's Erich Gamma. So of course its testing philosophy is going to be built around OOP principles.

Now I'm not saying that your app must be OO or there is something wrong with it. However, if it is not OO then it might be quite awkward to use a testing framework that expects it to be OO, like JUnit.

This is in response to:

>>>> 1) Saying that other related data such as databases and configuration files is not relevant for unit testing is complete and utter bullshit. It should be tested to the maximum possible extent. Take for example a system that is centered around a tree structure of data stored in a database. The system may not even have a meaning without that data. This makes that data a first class citizen of the system, which implies that it should be tested along with the rest of the code. If this means running the tests in a deployed environment, so be it. <

I thought I should let you know that I started reading your post with great interest but stopped after the above paragraph. To imply that a unit not accessing the database means the unit will not test data is silly. Clearly the unit needs to do all the work that it's doing now, but the data comes from somewhere else -- maybe an XML file. So you're not testing whether the SQL statements are returning the right data, but you can make the assumption that SQL works, and if you doubt it, you can have another test that verifies that the SQL statements are returning the right type of data from the SQL Server, outside the context of the business logic.
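The point above — the unit does the same work, but the data comes from somewhere other than the database — can be sketched like this (all names are invented for illustration, and a plain main stands in for a JUnit harness):

```java
// Hypothetical sketch: the business logic depends on a small interface,
// so a test can supply tree data from memory (or from a parsed XML file)
// instead of going through SQL.
import java.util.*;

interface NodeSource {
    List<String> childrenOf(String nodeId); // in production, backed by SQL
}

class TreeService {
    private final NodeSource source;
    TreeService(NodeSource source) { this.source = source; }

    // Business logic under test: count all descendants of a node.
    int countDescendants(String nodeId) {
        int count = 0;
        Deque<String> stack = new ArrayDeque<>();
        stack.push(nodeId);
        while (!stack.isEmpty()) {
            for (String child : source.childrenOf(stack.pop())) {
                count++;
                stack.push(child);
            }
        }
        return count;
    }
}

public class TreeServiceTest {
    public static void main(String[] args) {
        // Stub: a fixed in-memory tree, no database connection required.
        Map<String, List<String>> tree = new HashMap<>();
        tree.put("root", Arrays.asList("a", "b"));
        tree.put("a", Arrays.asList("c"));
        NodeSource stub = id -> tree.getOrDefault(id, Collections.emptyList());

        int n = new TreeService(stub).countDescendants("root");
        if (n != 3) throw new AssertionError("expected 3, got " + n);
        System.out.println("ok");
    }
}
```

A separate test, run less often, can still verify that the real SQL-backed NodeSource returns the right rows.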

The main advantages in my opinion are 1) speed, and 2) isolation, so that if the code isn't written yet, you can have two people work together at the same time.

I will use the term "build-time test" to sidestep the naming issue: "What should we call a unit test?"

So, a build-time test should test consistency of the source code, not consistency between source code and external resources. If the build fails, then you have to fix the source code. It's not a good idea to have to alter a database, or change the Windows Registry, to make the build succeed. The purpose of the build procedure is to make something from the source code. If there is no problem in the source code, then the build should not fail.

There may be other automated tests your software should pass before you make a release, but not on each build.

I also agree that if a test is not suitable for build-time then we should not call it a unit test.

You can run it, easily and quickly (say less than 1 min), with repeatable results in your normal dev environment.

Whether or not it touches a database or filesystem is irrelevant. The problem with excluding such tests is that in many systems the *majority* of error prone coding is the interfacing with libraries, the OS etc and other things outside your actual code - the actual "business logic" is well contained and often trivial (as a proportion of the whole system). Also you can miss out on detecting bugs in some of the more subtle stuff like interactions between your business logic and, say, the db isolation level or transaction behaviour, which can be real killers.

Unlike Paul, I ensure my unit tests typically execute in 0.1 seconds rather than < 60 seconds. Given that I'll need to run many thousands of unit tests across my codebase before I commit to version control (many times a day), anything that takes more than a second is going to slow me down significantly... many times a day.

This assumption sits nicely with the definition of a pure unit test (i.e., "unit isolation" versus "unit integration") and allows me to push longer running integration tests to a different part of my build process where they can be run on a different schedule.
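One way to push long-running tests to a different part of the build, as described above, is simply to split them into separate build targets. A hedged sketch using Ant (all target, property, and fileset names are invented; this assumes a naming convention that distinguishes the two kinds of test):

```xml
<!-- Fast suite: runs on every build. -->
<target name="unit-test">
  <junit haltonfailure="true">
    <classpath refid="test.classpath"/>
    <batchtest todir="${reports.dir}">
      <fileset dir="${test.src}" includes="**/*UnitTest.java"/>
    </batchtest>
  </junit>
</target>

<!-- Slow suite: runs on its own schedule, after deployment. -->
<target name="integration-test" depends="deploy">
  <junit haltonfailure="true">
    <classpath refid="test.classpath"/>
    <batchtest todir="${reports.dir}">
      <fileset dir="${test.src}" includes="**/*IntegrationTest.java"/>
    </batchtest>
  </junit>
</target>
```

TestNG's test groups achieve the same separation via annotations rather than file naming.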

Like most people who have posted here, I think that automated testing is the most important thing, but I do feel that people need to have a good grasp of the different layers of automated testing and their respective place in a testing architecture.

The line I drew was somewhat arbitrary. On my current project for example we have a few dozen tests which exercise our Hibernate persistence layer and they run in a few seconds cumulatively. To me that is well within the bounds of unit testing.

On a previous project I have had tests that ran directly against session beans in a container and might have taken 20 seconds to run the suite, which is probably pushing it a bit. I still classify these as "unit" tests though.

Though, I don't see why he says they're becoming obsolete. I think Michael's definition covers a subset of unit tests - logic only tests, which are not only isolated from other classes, but from the environment as well.

I disagree with Erik, and others who say these tests are worthless. It's great to have near instant feedback that all your logic is in line with known requirements. It's not the only valuable kind of test - probably not even the most valuable. But it's good to group these tests separately, so you can run them quickly and disconnected from the environment if needed.

I do not say that single class tests are worthless, they are quite valuable. Quote from previous post:

"This does not of course imply that you shouldn't write a test that tests a single class where applicable, on the contrary."

What I do say is that they are overrated as the be-all and end-all solution. If you read Cedric's reply again you will see that what he's saying is that that type of rigid definition is becoming obsolete and that he argues for a more open test structure, where tests are grouped by functionality instead of classes.


Erik, sorry if I misrepresented your position...You make some good points.

Although Cedric Beust is saying that Michael Feathers' rules are wrong, they both come to the same conclusions:

* (automated) tests are good
* not all tests are equal (there are fast tests, slow tests, tests testing one single class, integration tests, ...)
* ... and so they have to be treated differently

And whether you structure your tests in different (TestNG) test groups or in different (JUnit) test suites is pretty much the same thing (Cedric is right that a JUnit test case cannot belong to two packages, but it can certainly be in two test suites).

But another important issue has not been discussed here so far (David Saff pointed this out in Cedric's weblog): test cases accessing a database, the network or the file system may indicate that the "unit" you are testing is doing two (or more) things at the same time. (So if your "unit" is a single class, you might already be violating the Single Responsibility Principle.)

This would be the case if you needed a database connection to test if your "AccountService" calculates the new balance correctly. There are two different things to test here:

* is the balance calculated right
* is it written to the database correctly

So there should also be two different tests (using mock objects wherever necessary). And if this cannot be achieved easily, then probably it's just your application's design that's bad ...

So I don't think that "Unit Tests Are Disappearing" - mainly because I would still call it a "unit test", even if I were using TestNG ;-)
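The two-test split described above can be sketched as follows ("AccountService" is from the discussion; everything else is invented, with a hand-rolled mock rather than a mock-object library, and plain assertions standing in for a JUnit harness):

```java
// Hypothetical sketch: calculation and persistence are separate concerns,
// so each gets its own test and neither needs a database connection.
import java.math.BigDecimal;

interface BalanceStore {
    void save(String accountId, BigDecimal balance); // real impl talks to the DB
}

class AccountService {
    private final BalanceStore store;
    AccountService(BalanceStore store) { this.store = store; }

    // Pure logic: testable with no database at all.
    BigDecimal newBalance(BigDecimal current, BigDecimal amount) {
        return current.add(amount);
    }

    void deposit(String accountId, BigDecimal current, BigDecimal amount) {
        store.save(accountId, newBalance(current, amount));
    }
}

public class AccountServiceTest {
    // Mock that just records what would have been written.
    static class RecordingStore implements BalanceStore {
        BigDecimal saved;
        public void save(String id, BigDecimal balance) { this.saved = balance; }
    }

    public static void main(String[] args) {
        RecordingStore mock = new RecordingStore();
        AccountService service = new AccountService(mock);

        // Test 1: is the balance calculated right?
        if (!service.newBalance(new BigDecimal("100"), new BigDecimal("25"))
                .equals(new BigDecimal("125"))) throw new AssertionError();

        // Test 2: is the result handed to the store correctly?
        service.deposit("acct-1", new BigDecimal("100"), new BigDecimal("25"));
        if (!mock.saved.equals(new BigDecimal("125"))) throw new AssertionError();
        System.out.println("ok");
    }
}
```

Whether the real BalanceStore writes rows correctly is then a separate, slower test against an actual database.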

I think Cedric's point is that there is no well-defined "unit" that applies individually, and that it may not make sense to test every snippet of code in the smallest possible granularity.

On that note, some people believe that "more unit tests == better!", but I'm in the camp that believes that if your testing code size equals or exceeds your primary code, then you've got a serious problem. Tests _are_ code, and like any code you want to be smart about how you write it and how you prioritize your work. Whether your tests meet some consultant's idea of "unit" is largely immaterial. What matters is your environment, your context, and your requirements.

So for example, where you say "So there should also be two different tests..." you are making a huge number of assumptions, any one of which may be wrong. There are probably excellent reasons why someone would want to test a full-through account scenario (including the domain object and database access) in one automated test. Whether you call this a "unit" or an "integration" test is completely beside the point - if you can run it really fast and really often, then you should. There may in fact be individual tests for the components, but there are still very valid reasons to test the whole too. And if you can do so in a few seconds then you probably should.

Feathers' rules are wrong for the simple reason that they over-obsess on an arbitrary and artificial definition of "unit", and people who give advice like that are doing a disservice to the community by focusing people on super fine grained testing and never offering a hint of advice on (much more important IMHO) automated integration and higher level tests. Unit tests are good, but they are not the whole story. In reality, not even the most important part of the testing story. Often this style of consultant will tell you that a system without unit tests "has never been tested". I have a different view - a system without automated higher-level tests has never been tested. In the end these low-level, 100% decoupled and mocked automated tests tell you nothing about your actual system.

The point I am trying to make is not that nobody should write any automated integration tests and run them on a regular basis (even at every build). Of course you should also automate your integration tests instead of executing them manually (or having no integration tests at all)!

And I don't want to start off (or join) a discussion, whether you call it a unit test if you test a single class, a whole component, or whatever. Or if you call every test a unit test, just because you are using JUnit to run it. I couldn't care any less about these kinds of things.

The point is that if your code base forces you to connect to the database just to check if two numbers have been added correctly, this should ring your alarm bells. If there is no easy way to replace the database access code with some mock object storing the result of the calculation in a member variable, this indicates that the code might not be as reusable and maintainable in the future as one would like it to be.

1. Unit Tests should be Fast
2. Unit Tests should be strictly repeatable
3. Unit Tests should not affect other developers

#1 Says if they aren't fast, you will get tired of running them. That's not to say you shouldn't have long-running tests. You should.

#2 Says the test shouldn't modify any sort of system state that cannot be automatically restored when the test completes, and it shouldn't depend on anything that will change between executions of the test. This means if you can quickly make a copy of a "reference" database, modify it in your test, and then delete the copy (or keep the copy isolated), you can access it in your unit test.

#3 No modifying or accessing public resources, because you don't know who is changing them, or, if you are changing them, who you are affecting.
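Rule #2's copy-and-restore idea can be sketched as follows, assuming the "reference database" is a plain file (names invented; a plain main stands in for a test harness):

```java
// Hypothetical sketch of a strictly repeatable test: copy the reference
// data, mutate only the copy, and clean up, so every run sees the same
// starting state.
import java.nio.file.*;

public class RepeatableFileTest {
    public static void main(String[] args) throws Exception {
        // Stand-in reference file (in a real project this would be a
        // checked-in fixture).
        Path reference = Files.createTempFile("reference", ".db");
        Files.write(reference, "seed-data".getBytes());

        // The test works on an isolated copy, never the reference itself.
        Path copy = Files.createTempFile("test-copy", ".db");
        Files.copy(reference, copy, StandardCopyOption.REPLACE_EXISTING);
        Files.write(copy, "mutated".getBytes());

        // The reference is untouched, so the next run is identical.
        if (!new String(Files.readAllBytes(reference)).equals("seed-data"))
            throw new AssertionError();

        Files.delete(copy);
        Files.delete(reference);
        System.out.println("ok");
    }
}
```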

Mr. Feathers' article was right on point. One really needs to get back to the purpose of automated unit tests. I believe the purpose is, minimally, to provide a mechanism for timely developer feedback as to the health of a system as he/she is adding new functionality and/or refactoring old functionality.

I would share a similar experience to Michael's. I recently became aware of an application in our company that was vociferously complaining of outages of our test LDAP infrastructure. Upon closer investigation it turned out that their concern was primarily associated with the test LDAP server outages interfering with their junit test runs. One analyst's words were "it takes 3-4 hours to run our junits, when something in the infrastructure hiccups, we have to start the run all over....". hmmmm.... As the story unfolds I also learned that the team was actually structuring their work efforts such that the team coordinated their code commits once or twice a day, because if they didn't, conflicting code commits might "break the build" and result in another 3 hour build execution. hmmm.... again.

It is in my opinion really a matter of developer feedback from the perspectives of reliability (will the test execute the same way every time regardless of infrastructure health) and performance (when I commit code will I get feedback sometime today perhaps?). It's not so much that automated tests depending on infrastructure (db, ldap etc...) are bad, it's that these kinds of tests can and do interfere with the above stated goals of reliability and performance.

I'm guessing that the team I mention above started with a test suite that executed in a timely fashion even though it utilized external infrastructure. This scenario evolved slowly; kind of like boiling a frog, you don't know you're in trouble until it's too late.

I suggest the following. Adhere to Michael's rules for the suite of tests whose purpose is to provide quick feedback during development/refactoring etc... Perhaps set a unit test execution elapsed time goal to which you attempt to adhere. At the point you feel the need (and I do believe it is a legitimate one) for functional/integrated tests that utilize external infrastructure - dbs, ldap, web services, app server container etc... - structure these tests still as automated ones, but such that they execute on a less frequent schedule, perhaps actually deploying the app to an app server container and driving tests via jwebunit or whatever...

Another point to ponder is the effect on infrastructure costs in a large company encouraging everyone to write junits that execute every 20 minutes and touch ldap servers, app servers, db servers etc... All that test infrastructure isn't free as you well know and perhaps with a correct segregation of thin unit tests and thick functional tests test infrastructure can be used more cost effectively.
