Why does automated testing keep failing at my company?

You can use a carrot, or a stick, or get a new company policy.

This Q&A is part of a weekly series of posts highlighting common questions encountered by technophiles and answered by users at Stack Exchange, a free, community-powered network of 90+ Q&A sites.

Mag20 wants to implement automated testing at his company. Problem is, he's tried several times before and has failed every time. "Everyone gets excited for the first month or two," he writes. "Then, several months in, people simply stop doing it." But now seems like the right time to try bringing automated testing back to the workplace: Mag20's team of 20 experienced developers is about to embark on a big new project.

How can he finally introduce automated testing at his company?

Be okay with messing up

The hardest part of doing unit testing is getting the discipline to write tests first and early. Most developers are used to just diving into code. Testing also slows down the development process early on, while you are figuring out how to write a test for the code. As you get better at testing, though, this speeds up. And because of the tests, the initial quality of the code starts off higher.

When starting out, try just to write tests. Don't worry so much about mocking/stubbing things in the beginning. Keep the tests simple; see the sketch below. Tests are code and can and should be refactored. Along those lines, if something is hard to test, that can be a sign of a design problem. TDD does drive you toward using many of the design patterns (in my experience, particularly the Factory pattern).
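For illustration, a first test really can be this small. Here is a minimal sketch using Python's unittest (the apply_discount function and its behavior are hypothetical, invented for this example; they are not from the answer above):

    import unittest

    def apply_discount(price, percent):
        # Return the price reduced by the given percentage.
        return price * (1 - percent / 100.0)

    class DiscountTest(unittest.TestCase):
        def test_ten_percent_off(self):
            self.assertAlmostEqual(apply_discount(200.0, 10), 180.0)

        def test_zero_percent_changes_nothing(self):
            self.assertAlmostEqual(apply_discount(50.0, 0), 50.0)

    if __name__ == "__main__":
        unittest.main()

No mocks, no stubs, nothing clever: just code exercising code.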

Make sure that the tests get a level of visibility. Integrate them in the release process and ask about them during code review. Any bugs found should get a test. These things are where TDD shines.

One thing to keep in mind when you are writing tests: you are not trying to specify anything about the implementation of the code, only its behavior. When you write code, you test it all the time anyway, executing it with debug statements and so on. Writing tests formalizes this and gives you a record of the tests that you have. That way you can check your functionality confidently, without accidentally skipping a test case that you remembered halfway through the development process.
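To make the behavior-versus-implementation point concrete, consider this hedged sketch (normalize_name is hypothetical): the test pins down only what comes out, so the internals can be rewritten freely.

    import unittest

    def normalize_name(name):
        # One possible implementation: split, capitalize, rejoin.
        # The test below never looks at these internals.
        return " ".join(part.capitalize() for part in name.split())

    class NormalizeNameTest(unittest.TestCase):
        def test_visible_behavior_only(self):
            # Asserts the observable result, not how it was produced.
            self.assertEqual(normalize_name("  ada   lovelace "), "Ada Lovelace")

    if __name__ == "__main__":
        unittest.main()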

Who believes in testing?

Most unit tests are questionable in value, since the vast majority of tests seem to be too simple.

It is much harder to write good testable code than just working code. There's a large percentage of the developer community that believes in "just get it to work," versus code/design quality in itself. And an even larger percentage who don't even know what quality code is.

It can take much longer to write the unit test code than the actual code itself.

Figuring out how to adequately test the more complicated code (i.e. the stuff you really are interested in thoroughly testing) is beyond many developers' capabilities.

Maintaining unit tests takes too much time. Small changes can have big ripple effects. The main goal of automated unit tests is to find out if changes broke the code. However, 99 percent of the time what ends up breaking are the tests and not the code.

With all the above problems, there still isn't a better way to be able to make code changes and have some level of confidence that something didn't unexpectedly break than by automating your tests.

Some of the above can be alleviated to some degree by not going by the textbook of unit testing.

Many types of designs/applications are better tested by automating tests at the module/package level. In my experience, most coding errors occur not because the code in a class was written incorrectly, but because the coder didn't understand how their class was supposed to work with other classes. I have seen a lot of bang for the buck in this type of testing (see the sketch below). But once again, these tests are harder to write than unit (class-level) tests.
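As a hedged sketch of what a module-level test can look like (Cart and PriceList are invented for the example), note that nothing is stubbed; the test exercises the seam where two real classes meet:

    import unittest

    class PriceList:
        def __init__(self, prices):
            self._prices = prices

        def price_of(self, sku):
            return self._prices[sku]

    class Cart:
        def __init__(self, price_list):
            self._price_list = price_list
            self._skus = []

        def add(self, sku):
            self._skus.append(sku)

        def total(self):
            return sum(self._price_list.price_of(s) for s in self._skus)

    class CartModuleTest(unittest.TestCase):
        def test_cart_and_price_list_together(self):
            # No stubs: the point is to catch misunderstandings about
            # how the two classes are supposed to collaborate.
            cart = Cart(PriceList({"apple": 30, "pear": 45}))
            cart.add("apple")
            cart.add("pear")
            self.assertEqual(cart.total(), 75)

    if __name__ == "__main__":
        unittest.main()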

It really boils down to whether the developers believe in the process or not. If they do, then they'll write good unit tests, find errors early and be proponents. If they don't, then their unit tests will be by and large useless and won't find any errors and their theory of unit tests being useless will be proven true (in their minds).

Bottom line is that I've never seen the full-blown automated unit testing approach work for more than a couple of months myself, but the idea of automated unit tests still persists, although we are selective about what really needs testing. This approach tends to have far fewer critics and is more accepted by all the developers rather than just a few.

Econ 101

One thing that I haven't seen clearly addressed in the answers above is that unit testing is essentially a public good and a private cost. I've written a blog post about it.

What it comes down to is that while a suite of tests benefits the whole team, the cost of writing a test falls, most of the time, on the individual developer doing it.

In short, unless writing the test is enforced somehow—and the answers above list a number of different ways to do that—there's no reason for an individual developer to do this.

In one company that I've worked at, writing unit tests was a required part of delivering a feature. New code was not accepted unless a unit test was part of the commit or new feature; there were brief code reviews for every "task" that a developer was given. It may be worthwhile to implement a similar policy at your workplace.
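One hypothetical way to automate such a policy (a sketch only, assuming git and a tests/ directory convention, neither of which comes from the answer above) is a check-in gate that rejects changes touching source files without touching any tests:

    import subprocess
    import sys

    def changed_files(base="origin/main"):
        # List files changed relative to the main branch.
        out = subprocess.check_output(
            ["git", "diff", "--name-only", base], text=True)
        return [line for line in out.splitlines() if line]

    def main():
        files = changed_files()
        touches_source = any(
            f.endswith(".py") and not f.startswith("tests/") for f in files)
        touches_tests = any(f.startswith("tests/") for f in files)
        if touches_source and not touches_tests:
            sys.exit("Rejected: source changed but no tests were added or updated.")

    if __name__ == "__main__":
        main()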

Find the original post here. See more Q&A like this at Programmers, a site for conceptual programming questions at Stack Exchange. And of course, feel free to ask your own.

77 Reader Comments

Is it just me, or do these 'stack exchange' articles seem to be predominantly about trouble adopting a testing strategy at a company? Not all, but... I'm pretty sure I've seen almost this same topic 3 or so times before.

" The main goal of automated unit tests is to find out if changes broke the code. However, 99 percent of the time what ends up breaking are the tests and not the code."

Unit tests are supposed to test the public interface, which should rarely change. During the system's design phase, most of your units and how they will work should be defined. If your public interfaces keep changing and require unit tests to be changed, then you need a better design phase.

Additional features should mostly be modular and require little to no changes to the rest of the system.

In my experience, it's definitely a time / cost issue. This is especially so in the world of consulting where you have the perfect storm of lies, greed, idiocy, and politics all conspiring against quality and security. In a nutshell, it goes like this:

1. Client has the Big and Important project, but lacks the in-house expertise.

2. Client engages several consulting firms, with each firm putting its P.T. Barnum out there to fellate the client and exploit their weakness (stupidity, greed, politics) with absolute bullshit, impossible promises.

3. Later, AFTER the deal is sealed, some MBA dink at the firm will approach one of the software architects and say, "What is your estimate to do X, Y, and Z? Keep in mind this is an (important client / new client) and we need to make this 100% perfect!"

4. Architect assesses the situation, and says it will take X time with N number of developers skilled in Y.

5. MBA guy tells the architect we already signed and sealed on doing it with (less than 10% of everything the architect needs to have a chance in Hell of success), so YOU are responsible for delivering.

6. Team works 80 hour weeks, with no time or budget for testing or QA, and shoves a bug-ridden, insecure product out the door because, hey, it's "working".

7. Client rants and screams, settlements are sometimes made, and they part ways with the consulting firm. The vaunted partnership ends.

8. Client repeats step 1, several million or more poorer. MBA guy gets a bonus; developers quit and hope to find a better job, leaving behind an undocumented nightmare for the next unfortunate developer.

It fails because automated testing is bloody expensive to start, and more importantly, expensive to maintain because it requires an expert to modify code every time the code changes (a paper change to a test procedure is vastly cheaper).

It also has a bad habit of finding thousands of non-conformances that are technically bugs, but don't actually impact final user experience. (Kind of like using lint for the first time)

Do they add reliability? Unquestionably yes. But the question is not "do they improve reliability?", it's "do they improve reliability in a way that increases revenue to make up for the increased costs?"

And the answer to that is often no.

Experience has taught me that except for edge cases (super high quality boutique software or no-quality *total* software catastrophe), customers are often unwilling to pay a lot more for quality code (and quality code doesn't come for free), and as long as you produce reasonable code (and pay the appropriate lip service to high quality code), they won't dump you. As a software tester who takes each bug that gets past me personally, the idea that management knew what was better for the company all this time can be a bit hard on my hubris :-).

You have to remember what it was like BEFORE agile process: 1/3 of your time writing analysis documents, 1/3 writing code, 1/3 testing and debugging. And the quality was STILL terrible. And it STILL took twice as long as the original estimate. Or software development teams where the ratio of testers to software developers was 2 to 1, or even 3 to 1. And the quality was STILL terrible.

To say nothing of the reality of maintaining legacy code, where developers produce 150 lines of production-ready code a month, instead of 2,000 lines of production code a month, and everyone's terrified of doing the right thing because that team of 15 testers that were present to run the original regression test suites have long ago moved on, so there's no way that any reasonable fix can be tested.

The new reality: 1/2 your time writing code, 1/2 your time writing (and maintaining) unit tests. That's still 33 percent more efficient (two hours of total effort per hour of production code instead of three), and the improvement in quality is enormous.

If you're working in environments where the unit tests aren't being written, make a case for the relative cost of doing old-school quality control. There's a quid pro quo there. If you don't pay the pro quo, you get what you deserve. And the pro quo is either 100% development overhead for automated unit tests, or 200% development overhead for heavy process. Total failure is another option, I suppose.

Management of expectations is important. Everybody needs to understand that per-developer development velocity is going to drop by 50%. That's the development cost of doing unit testing properly. But that drop in per-developer velocity is more than compensated for by elimination of analysis and test overhead.

In my experience, given ample time, all the developers I've worked with would much rather do the right thing. Software developers are, by nature, craftsmen, and take pride in their work. If developers are cutting corners, that's a symptom of unreasonable expectations being forced on them somewhere along the management chain. So fix that. First. You may also need to convey that message to your developers. Unit-tested code takes longer. Developers need time to adjust to that reality, and scheduling may be totally out of control during the transition. So prepare your developers for that reality, and help them through the transition -- not by making them do twice as much in half the time, but by learning to include unit-test development time in scheduling estimates.

How do you MAKE people write appropriate test code? You need to impose an immediate cost on developers for checking in code without appropriate test coverage. Gated check-ins are definitely a good solution -- code review, and review of unit tests; code doesn't make it into production builds until another developer on the team has approved the check-in, and/or failed code reviews get rolled back. Auditing and rolling back change-sets that don't have appropriate unit tests might be another way to intervene in teams that are really out of control. That's the stick approach. The carrot-management approach is to deny short-term reward for code without unit tests: no task moves to the complete state until unit tests have been checked in, reviewed, and run in a production build.

Automated measurement of code coverage during unit tests also provides a good technique for monitoring bad check-ins. It's difficult to get 100% coverage, but rolling back commits that have less than 80% or 90% code coverage in unit tests would be a reasonable intervention.
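A minimal sketch of such a coverage gate, assuming Python, the third-party coverage.py package, and tests discoverable under a tests/ directory (all assumptions for this example, not part of the comment above):

    import sys
    import unittest

    import coverage  # third-party: pip install coverage

    THRESHOLD = 80.0  # whatever bar the team has agreed on

    cov = coverage.Coverage()
    cov.start()
    suite = unittest.TestLoader().discover("tests")
    unittest.TextTestRunner().run(suite)
    cov.stop()

    # report() prints a summary and returns the total coverage percentage.
    if cov.report() < THRESHOLD:
        sys.exit("Coverage below the agreed threshold; rolling back the check-in.")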

For what it's worth, in the .NET world, if code contracts are used, it's actually quite feasible to get 100% coverage when code contract analysis is part of the automated test procedure. Automated test tools used to verify code contracts can efficiently exercise defensive code that's awkward to cover in manually written unit tests. Happily, inline code contracts also end up inline with the code under test, so they tend to be easier to maintain as the code evolves.

Automation is not for every shop... (and I am an automated test consultant!)

In my experience, automated testing is only justified if:

- the software is very big, with many critical functionalities that do NOT change all the time and must be tested every time a new release, version migration, or patch is planned; if this is the case, it is much easier to justify the maintenance cost of the automated test scripts;

- the automated tests should be able (or the system under test should provide a way) to generate their own data for testing the application; otherwise you will need someone to generate data for the robots, and every time you put humans behind the robots you add extra complexity to the automation, which is not good;

- automated tests should start small enough to win the confidence of the development and test community that their results are reliable and that management decisions can be made based on them; automation cannot be sold as a silver bullet.

There are many more reasons, but these are the ones that come to mind now.

Another very good reason: the application has very complex integrated tests (jumping from inventory to accounting to sales, etc.) and a manual test would involve a lot of humans to do the task. Automating these tasks greatly improves the speed of the tests by eliminating the time spent between the human interactions (a person finishes a test and sends an e-mail to another person saying they can continue; this handoff can take anywhere from 5 minutes to 2 hours or more, whereas robots do not need to wait to continue the test).

Many of the other replies contain insightful and useful bits of data. However, many of the replies also contain plenty of satire and humor too.

The one aspect missing from this article and all the replies is any suggestion of, or at the very least any questioning of, what having a test-focused developer working on the project would do.

At the very least, use another developer on the team for the testing process.

By forcing a design contract between two people, it forces a design review and collaboration process. Code reviews are so common, yet somehow design reviews don't come up as often. Note, I do not consider presentations or reviews with partners, customers, or clients to meet this goal; their motivations are going to be quite high-level and specific only to their goals. Software development is complicated, yet also nuanced. Another expert developer knowledgeable and experienced in the field is required for a proper review of the design contracts being committed to. Too often I have seen instances where developers were charged with their own testing, and assumptions and shortcuts were taken such that, while not necessarily dishonest, the resulting tests were not written with the customer's or client's best interests in mind.

Even better, by having a role dedicated to testing these designs, their motivations are allowed to stay focused on keeping quality high, keeping schedules on track, keeping automation overhead low, and most importantly keeping the project appealing to the end user.

I wish that the test development role (no, I don't mean just QA, I really mean developers 100% focused on automated, reliable, and high quality testing) were more prevalent in the new days of agile software development. Adding this second role allows for better parallel testing as well without needing to double schedules to have proper test code.

While not directly related, this also is a great opportunity to point out that having a third and again separate design role provides a set of checks and balances in the development process (call this architect, call this designer, call this program manager... all can serve the role). This role then provides an independent design contract to both dev and test equally. Each role can then simultaneously start development against said contract and increase collaboration (and cross-checking) as the schedule progresses. It's also virtually guaranteed that one or more of the parties will have some disagreements when interpreting the intention of the overall goal. This model allows for ties to be broken and provides for a structured review/discussion when these interpretations are questioned.

One further benefit is that developers focused on automation and testing can further develop reliable automation frameworks (gates, if you will, as mentioned above) in between projects (or during them), thus freeing "developer time" to remain focused on solving complex code design/efficiency problems.

I really wish that there was more discussion of such a role and process in today's software development industry. I have met several developers who can write honest and reliable tests, but rarely have I met a developer who truly understands exactly HOW end users will use the product after it's built.


People may stone me for it, but I think test-driven development is stupid. It wastes so much time, especially in the beginning, when code should change often: the stupid unit tests break all the time and double the work of refactoring. Also, code gets ugly and unreadable because you need to make it testable, and you get the old Hammerfactoryfactoryfactory problem.

So it makes code worse, it makes coding slower, and it adds a powerful incentive against refactoring (even though it theoretically should make refactoring easier).

Now, no opinion is without exception: I love and would mandate excellent unit tests for relatively stable APIs. If you have architected your software with a set of good, stable APIs between major components, write extensive unit tests against them. Those interfaces shouldn't change too often anyway, so you get all the advantages with no downsides. But don't make APIs for the sake of the stupid unit tests; make stable APIs because they make sense.

Unit tests are great if used sensibly, but as usual they are a technique that helps coders. But TDD, as it is sometimes propagated, is like the tail wagging the dog.

At the very least, use another developer on the team for the testing process.

Can't be overstated. A proper test role will dig in, start asking all of the hard questions, and bring to bear a mindset of both verifying that the golden paths for the end user work every time and blowing away every assumption a developer has made. They often seem a better advocate for the customer than the customer themselves.

People may stone me for it, but I think test-driven development is stupid. It wastes so much time, especially in the beginning, when code should change often: the stupid unit tests break all the time and double the work of refactoring. Also, code gets ugly and unreadable because you need to make it testable, and you get the old Hammerfactoryfactoryfactory problem.

The problem there is the mindset that everything is a factory. Designing for testability does not mandate that everything becomes a factory.

Designing for testability makes you think about the dependencies. Some dependencies are an implementation detail (e.g. using a csv parser in a settings class) and do not need mocking/stubbing via a factory or some other mechanism. Other dependencies need thinking about and may be solved in different ways.

For example, if the code calls FooFactory::CreateFoo() to get an instance of Foo, maybe instead pass an IFoo to the constructor. Then the code is not dependent on FooFactory, and you can pass different IFoo implementations in the same test executable.
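A hedged sketch of that constructor-injection idea, transposed to Python (Foo, FakeFoo, and Consumer are invented stand-ins for the IFoo example above):

    import unittest

    class Foo:
        # The interface the code under test depends on (the IFoo role).
        def value(self):
            raise NotImplementedError

    class FakeFoo(Foo):
        # A test-controlled implementation; no factory involved.
        def value(self):
            return 7

    class Consumer:
        # Instead of calling a factory internally, take the dependency
        # in the constructor; tests can pass in any Foo they like.
        def __init__(self, foo):
            self._foo = foo

        def doubled(self):
            return self._foo.value() * 2

    class ConsumerTest(unittest.TestCase):
        def test_with_fake_dependency(self):
            self.assertEqual(Consumer(FakeFoo()).doubled(), 14)

    if __name__ == "__main__":
        unittest.main()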

I don't buy the wasting time early on argument. I have written a tree class using a rapid code-build-run cycle with the code that was run being the tests. I built the class and test code in tandem. Here, I included tests for memory leaks (by testing constructor and destructor calls). This helped me uncover a bug when moving from new/delete to std::allocator.
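A rough analog of that constructor/destructor counting in Python (the commenter's original was C++; here a class-level counter stands in for new/delete tracking, which is an assumption of this sketch):

    import gc
    import unittest

    class Node:
        live = 0  # constructor calls minus destructor calls

        def __init__(self):
            Node.live += 1

        def __del__(self):
            Node.live -= 1

    class LeakTest(unittest.TestCase):
        def test_all_nodes_are_released(self):
            nodes = [Node() for _ in range(100)]
            del nodes
            gc.collect()  # make collection deterministic for the assert
            self.assertEqual(Node.live, 0)

    if __name__ == "__main__":
        unittest.main()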

JPan wrote:

So it makes code worse, it makes coding slower, and it adds a powerful incentive against refactoring (even though it theoretically should make refactoring easier).

Not so. I have refactored mimetype detection code under test from using magic-style checks to using the shared-mime-info database. With the tests in place, I found several bugs during the refactoring that would have been harder and more time-consuming to verify without them.

I have also refactored a document processing model from a SAX-like event push model to a reader-based event pull model, with the document processing under test. For this, I have a test program that takes the document input and produces a series of events as output. I then compare the output the program produces to the expected output. Doing it this way makes it easier to add new tests.

JPan wrote:

Now, no opinion is without exception: I love and would mandate excellent unit tests for relatively stable APIs. If you have architected your software with a set of good, stable APIs between major components, write extensive unit tests against them. Those interfaces shouldn't change too often anyway, so you get all the advantages with no downsides. But don't make APIs for the sake of the stupid unit tests; make stable APIs because they make sense.

Unit tests are great if used sensibly, but as usual they are a technique that helps coders. But TDD, as it is sometimes propagated, is like the tail wagging the dog.

The problem with writing tests after the fact is (a) getting the time to do so (managers will want you to work on the next piece of functionality) and (b) remembering how the program works. Also, if your testing uncovers bugs, it may be harder to justify fixing them at that point.

Yes, writing tests takes time and they require effort to maintain. However, they will pay off in the long run.

Consider an application that writes settings to the Program Files folder in Windows. That program will fail on Vista due to the permission change. If you have automated tests for this, it is easy to detect the issue and track down the problem. If you have manual testing, the process of investigating and reproducing the problem is a lot harder and takes up more time in the long run.
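As a hedged sketch of the kind of test that catches such a problem early (settings_path and its layout rules are hypothetical, invented for this example), the test simply pins the settings location to a user-writable directory:

    import os
    import unittest

    def settings_path():
        # Hypothetical function under test: settings must live under
        # the user profile, never under Program Files.
        base = os.environ.get("APPDATA", os.path.expanduser("~"))
        return os.path.join(base, "MyApp", "settings.ini")

    class SettingsLocationTest(unittest.TestCase):
        def test_settings_directory_is_user_writable(self):
            directory = os.path.dirname(settings_path())
            os.makedirs(directory, exist_ok=True)
            self.assertTrue(os.access(directory, os.W_OK))

        def test_settings_are_not_under_program_files(self):
            self.assertNotIn("Program Files", settings_path())

    if __name__ == "__main__":
        unittest.main()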

Also, consider problems that result from things like compiling in debug vs release, or compiler optimisation bugs/differences, or switching compiler, or running on a different platform.

If you do not have tests in place, you cannot justify modifying any code produced as you don't know if any changes will break the code. Therefore, you end up being conservative with changes and adding new code without looking at the bigger architectural picture. Soon, you will end up with a lot of different ways of doing similar tasks and the code becomes harder to maintain and productivity slows.

The problem is that the TDD concept is so extremely alluring. It sounds neat, quality-oriented, sophisticated. But like all utopias, TDD heaven always seems to be just around the next corner. I don't know how many times I have heard or read developers say "I am very positive about TDD, but..." And I'm sad to say that I have personally never seen TDD work well for large-scale software projects.

(Big disclaimer: I am no veteran dev, and have barely ~10 years of professional experience as a developer.)

This is my experience:

Sometimes it's like the SW development team gets stuck on the "initial threshold", where costs are high and the benefits always seem near. But somehow, the cost never goes down and the benefits never really come. New testing frameworks, webcasts/webinars/courses, changed frameworks, new build servers, etc., like a stuttering car that never gets up to speed.

Other times the team gets past the threshold, but churns out large amounts of trivial tests that come with no benefit, just added maintenance overhead. The first months everyone cheers as code coverage goes up and up; then after a while the mood sours as merges and refactorings somehow become more error-prone, and small nuisances pile up, like having to fix the commit of the other guy who just left for the weekend because he never ran the unit tests locally before committing and the errors were discovered on the build server. Small nuisances + small benefits = everyone loses interest and TDD slowly dies.

And yet other times a TDD thrust leads to ambitious rewrites of old legacy code in order to make it testable, instead of actually writing the tests. But then you're basically off chasing that software "castle in the sky" - never a good idea to begin with, and eventually management intervenes.

So the basic problem is that most large software projects live in this dirty, TDD-unfriendly reality where you have large chunks of semi-legacy code, where the architecture is not TDD-adapted, where you or your team members make mistakes in the code injections/interfaces/data models/architecture, and where the management focuses more on sales than on beautiful code.

So - I am in principle very positive about TDD, but does it work? Really? Please convince me that I'm wrong. I want to believe.

In my experience, it's definitely a time / cost issue. This is especially so in the world of consulting where you have the perfect storm of lies, greed, idiocy, and politics all conspiring against quality and security. In a nutshell, it goes like this:

...

Does that sound about right? :-)

You speak to my life as a software engineer for the last 4 years. *cries*

If the tests are painful to maintain, that's highlighting that the application is poorly designed and is also going to be painful to maintain. I'd also extend that to: listen to your tools. If you run lint (or any other code quality tool) and get hundreds of warnings, it's a sign.

I've worked on some horrendous legacy systems where inheritance was favored over composition (14 levels of inheritance in views), they had >1000 line methods, and a class called Global (which everything was tied to). Shotgun surgery and yak shaving were practiced every single day. On that system, changes that should take 10 minutes (using TDD) would often take days and sometimes extended to weeks.

In that sort of system, doing TDD sucks. But only because the SOLID principles weren't applied. When the SOLID principles are followed, TDD becomes significantly easier. Ironically, starting with TDD helps to ensure the SOLID principles are applied. Even more so when you use the tests to DESIGN your classes (which is why it's called test driven design).

If you're working on a legacy system, TDD isn't going to work without a LOT of refactoring. That can be very painful. But when using it on new systems, it can lead to very flexible systems that are easy to maintain. And maintenance is the majority of the cost of any system, not writing it.

I think that unit tests, and by extension TDD, are very useful, but I do think that the proper granularity is a tricky thing to nail down. Too much granularity (lots of small functions, unit tests for EVERYTHING) and you get mired in process, but too little and it gets easier for bugs to slip through.

Also, I think that the dev environment plays a role: unit testing for .NET/Java is a different beast than unit testing for JavaScript (my area of expertise). I think that dynamic languages benefit a lot more from unit tests than statically typed languages, but at the same time they are harder to write for dynamic languages. Dynamic languages may not require types to be specified, but they most definitely still exist, and more often than not some method/algorithm really only works with one or two types, not all possible types. After all, you can't compute the square root of "Hello World", even though it's perfectly acceptable to write "Math.sqrt('Hello World')" and even receive an answer.
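A hedged Python analog of that point (mean and its contract are invented for this example): in a dynamic language, the type expectations still exist, so it's worth pinning them down in tests.

    import unittest

    def mean(values):
        # Nothing stops a caller passing strings in a dynamic language,
        # so the contract is worth spelling out and testing.
        if not values:
            raise ValueError("mean() of an empty sequence")
        return sum(values) / len(values)

    class MeanTypeTest(unittest.TestCase):
        def test_numbers_work(self):
            self.assertAlmostEqual(mean([1, 2, 3]), 2.0)

        def test_strings_are_rejected(self):
            # sum() raises TypeError when it hits a string element.
            with self.assertRaises(TypeError):
                mean(["Hello", "World"])

    if __name__ == "__main__":
        unittest.main()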

We use a lot of unit tests where I work, and we also require that all code check-ins must be peer-reviewed first. Part of this is due to the aforementioned challenges with dynamic languages, and part of it is due to the fact that our product is an API (at least the part I work on). Anytime you produce a library, unit-testing is obvious since that interface is the product. End-user applications aren't so obvious IMO.

The argument that TDD wastes time comes from not actually doing TDD, just writing code and separately writing tests. In TDD, testing and coding are not mutually exclusive. It's test-DRIVEN-development, not test-included-development.

Tests drive code design. Any garbage about designing the code properly beforehand is ridiculous as well. Nobody could possibly know the design before the code is written. There will always need to be changes. Code needs to be designed in a flexible way so that you can change it easily. When you write tests to drive this design, you make more flexible code. If it's too hard to write a test for a piece of code, then that piece of code is most likely designed poorly. Invert your dependencies and think about what would make that method/class/module easier to test. When it's easy to test, it's easy to change and that is the design you're looking for.

The easiest way to do this is to ALWAYS PAIR. When you ping-pong pair (one writes a test, the other makes it pass, you both refactor, then switch places) you end up writing better code. Hold each other accountable and lead by example. Call people out when they start cowboy coding (it's not heroic or cool, it's irresponsible and unprofessional).

MORAL OF THE STORY: I've worked on a system for the last 9 months that has been test-driven since the beginning. It is very flexible and a pleasure to work in because of this. Follow TDD and SOLID correctly (yeah, acronyms are lame) and you will have a good system.

TDD also cannot work in a hostile management environment. It cannot work if developers are not being given the proper incentives. It takes a while to build a habit -- both on the developer and manager side.

The developers have to write tests, and the managers have to reward them for doing it.

A guaranteed way to make TDD fail is for half your developers to willfully not participate, have management scream at the ones who do try to faithfully do TDD for not farting out the features as fast as they used to, and have management praise the ones who didn't do TDD.

I've been a developer since long before unit testing was invented, and definitely before the idea of "test driven development" started out as a fad and became religion. Let me share a different perspective with you.

When *I* was in school, there was this thing we did called "design". Instead of just hacking code together by the seat of our pants, we took out some paper and pencils and designed what our code modules were going to do. We'd start with pseudocode, maybe do some flowcharting, and get a really clear idea of what needed to be done and how it needed to be done before we fired up our IDE. We would also do any research we needed to do about libraries and techniques before we started coding. Before a single line of code was written, we'd have a rough idea what our interfaces were. This approach has so many benefits: your code ends up being well planned, organized, well thought out, easy to maintain... And your notebook becomes part of the documentation.

One thing I never do is write unit tests before I code. It's my philosophy that until you've finished the work, you don't even really know what you should be testing or why. It's only once you completely understand the problem domain that you can actually write tests that are meaningful.

If you ask me, the problem with "Test Driven Development" is that it approaches the programming process ass-backwards.

Picture a programmer who, for example, has to write a module that receives a text string of unknown length, and returns an audio stream ready to be sent back to a user over the internet. I've written these. They're not that hard. I won't get into the details, but you can't write something like that on your first try with TDD. Oh, sure, you can write a couple of tests. Test that the String is not null; test that the String is of positive length. Test that the String contains only characters allowed via a regexp...

But THEN what? Let's assume that you have no idea whatsoever how to proceed yet, having never written any code like this. So what's your next test? Hmm? How can you write a test when you don't even know what code you need? When you don't know what libraries you'll be using? When you don't know what end-user issues you'll run into, or network issues, or browser issues?

YOU CAN'T. You're going to end up writing the code first and wedging a bunch of tests in later, if you do it at all. Because TDD is unnatural; it's backwards, it runs against the instinctual way people have been solving problems for thousands of years.

The problem is that TDD, and XP, and a lot of these other fads, have turned into RELIGION, not technology. People aren't following them because they make them more productive, or because they actually help the programming process... People are following them because of consensus, like a herd of sheep following some alpha hipster sheep. And nobody's questioning whether any of this stuff is a good idea. Nobody QUESTIONS anything anymore. They just meekly submit.

The first thing that matters is context. To say a particular methodology doesn't work, there has to be some context to that statement. Generally, if you have a pretty good idea of what you're building (i.e. most web applications backed by a database that integrate well-known services or frameworks), then you really don't need to do a lot of analysis or design. You can start writing tests that demonstrate the expected behavior. In many cases the requirements are known at a broad level but details are missing. Detailed analysis and design may not be very beneficial, in part because the designs are either trivial or largely repetitive and based on a handful of repeated patterns. For this class of applications, especially ones that use inversion-of-control frameworks such as Spring, testing before you write code (TDD) can be very effective. But I would never assume it will fix all problems under all circumstances.

For example, in some cases you need to do design before code because you need to demonstrate that the design was reviewed to meet certain safety standards. In other cases, you can't test parallel code (because you can't simulate every possible instruction interleaving over multiple processors), and therefore testing is not as useful. In some cases you may need to do other work before coding, such as analysis to understand the problem better because it has novel or unique aspects that are outside the experience of your team. (For example, you might prototype, sand-table, or mock the behavior.) In other cases you would need to write such cumbersome, detailed, exhaustive tests that other verification methods (such as automated proofs) are actually easier and more reliable. In still other cases the output is completely visual and requires a human being to interact with the product (games or simulations) in order to determine success or failure. While these are not common situations, anyone who tells you that testing (TDD/XP/pair programming) will always produce a better outcome, in all circumstances, with all teams, is wrong.

Assuming your context is suitable for test-driven development, then you should write tests before code and change your tests before changing code in maintenance. This won't fix every problem. For example, you may write a test that is incorrect, passes, but the related code fails in production. Or, if you don't manage the test data (fixtures) correctly, you can wind up with tests that fail intermittently. Depending on the complexity of your application, you may need to set up environments for integration testing, which will require additional resources and expense (i.e. licensing additional copies of software). You may need to invest time, money, and effort to set up services such as Jenkins, dedicated build servers, virtual machines to run Selenium, etc. The experience of most people is that they actually wind up working more quickly when they write the tests and then implement. Developers also work more quickly when they have a feeling of confidence that their commits haven't broken anything else.

If you're working with certain languages, specifically dynamic languages like Ruby, JavaScript, Python, etc., then there is a real need for testing because the interpreter will not catch many errors that are caught by a compiler with strict typing. It may be necessary, for example, to test that a value returned from a call is not null and can be added to a number (in some reasonable way), whereas Java would catch the invalid return type during compilation. Ada, for example, can use multiple run time checks to enforce an even higher level of safety and even more is caught by the compiler.

Generally, all methodologies show some level of success. For example, big-design-up-front proponents were able to demonstrate that projects that followed the rules were developed in less time with fewer bugs. In most cases, whatever your chosen approach, if you don't follow the rules (or your context is unsuitable), it will not work well. Following the rules is a function of setting incentives (such as making the first question in the code review "where are the tests?"). It is also buy-in from management to avoid cutting corners. ("We'll write the test later, after delivery" ALMOST NEVER HAPPENS.) It is also education. Do you make sure every new developer knows your test frameworks and has a working test environment? So either follow the rules or find something you can follow. It may be that your team achieves unheard-of productivity when they can see the software they will build in terms of UML taped to a wall, and that forcing them to write tests before code will not click with them. However, you're generally better off going in whole hog (so to speak) than halfway.

get a really clear idea of what needed to be done and how it needed to be done before we fired up our IDE.

One thing I never do is write unit tests before I code. It's my philosophy that until you've finished the work, you don't even really know what you should be testing or why. It's only once you completely understand the problem domain that you can actually write tests that are meaningful.

I feel like you contradict yourself in these 2 statements. How do you have a really clear idea of what you want to do before firing up your IDE, without having understood your problem domain? If you don't even know your problem domain before having completed your design, your design is bound to fail.

@Arcadium: I'm not exactly contradicting myself; before you start coding, you should have a pretty good idea of what you're going to be doing. Everything should be laid out for you; you've got a roadmap of sorts. But once you actually start coding, THEN you're going to start seeing issues you'll need to deal with. During the design phase, you're doing research, but you don't know exactly, precisely, what you need to write, see?

It's like they say in the military: "No plan survives contact with the enemy".

Design first, then code, then test. Do it that way, and your tests will be much better than the ones someone tries to write before they think about code...

The process moves from the general to the specific. Knowing what to test for and why is very specific.

@fullwedgewhale: [Edit: I just re-read your post and I agree with it almost entirely. The previous version of this post didn't make sense to me after re-reading your post, so I dropped it, and I'll just say you make a lot of sense, and I agree with you. It's late at night and I'm getting bleary-eyed. ]

By the way, not to pick a fight here, but the idea of "pair programming" is absolutely crazy if you ask me. Look at it from this old timer's perspective:

Normally, you have two programmers, at two separate workstations, both working on their own modules. If each programmer is capable of writing, say, 2500 lines of code a day, at the end of the day you've got 5,000 lines of code and two separate modules done.

With pair programming, you've got those two programmers sitting at ONE workstation, with one programming while the other bickers at him like a wife in the passenger seat of a car. "Not that way!" "Not that way either!" "No, the other way!" "Stop and ask for directions!" Every time the first programmer writes something, he's got to justify it to the wife. Not only are you not going to get 2500 lines of code out of both guys, you're not going to get it out of the FIRST guy! You'll be lucky if you get half that. So right off the bat, you've gone from two modules and 5,000 lines of code per day, to less than one module and around 1500 lines of code. That's HORRIBLE.

Alas, poor productivity! I knew him, Horatio.

This ties in with my first comment in this thread: People treat these concepts like religion, and never question them. But that's crazy. Any programming methodology is just an idea, and you SHOULD question it. You know... "If you see the Buddha on the road, kill him".

Humans make poor machines, just as machines make poor humans. Know, for the task in hand, which you need and apply the correct choice of human or machine. Do not waste your time trying to coax humans into being mechanical: human being is not a mechanical form of being, and being a machine, whilst interesting for a short while, especially given an incentive, ultimately becomes boring and distractions will take precedence in the human's mind. Humans just work this way, so deal with it.

Let's clarify something: automated tests do not imply unit tests and unit tests do not imply TDD. You can have any of these without the others.

An automated test is any test that is run automatically by a machine. This can be anything from a unit test to a full-blown GUI automation test of the complete system.

The key benefit of having automated tests in place is to detect regressions early on. By running them on each build, on each commit, you can immediately spot which change broke some behaviour (application crashing on startup, incorrect calculation, etc.). Without them, you need the QA/test department to go through all the functionality of the program to ensure that nothing has broken; this usually occurs after specific milestone builds (e.g. once every week or two) that can include a large number of changes. Fixing issues that late is more costly than fixing them at the point where the bug occurs.

The term unit test has several meanings. To some, it is testing a small unit, but then what is a small unit when you are testing a function that depends on a large number of classes/other methods? To others, it is a test that has no dependencies on the environment (file system, registry, network, etc.) so failures are only due to the test code or the code under test.

I personally prefer the term component tests as what you are testing is the component and all the components it depends on to work. For example, if you have an Account.calculate_balance method, your Account class may depend on a Money and Transaction class. It does not make sense to stub/mock the Money and Transaction classes as you are then changing the behaviour of the code under test and the test is meaningless.
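A hedged Python sketch of that component-test idea (the Money, Transaction, and Account classes below are invented to match the names in the comment): the collaborators are used as-is, because stubbing them would change the behaviour under test.

    import unittest

    class Money:
        def __init__(self, cents):
            self.cents = cents

    class Transaction:
        def __init__(self, amount):
            self.amount = amount  # a Money

    class Account:
        def __init__(self, transactions):
            self.transactions = transactions

        def calculate_balance(self):
            return Money(sum(t.amount.cents for t in self.transactions))

    class AccountComponentTest(unittest.TestCase):
        def test_balance_uses_real_collaborators(self):
            account = Account([Transaction(Money(250)), Transaction(Money(-100))])
            self.assertEqual(account.calculate_balance().cents, 150)

    if __name__ == "__main__":
        unittest.main()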

Where you need to provide mocks/stubs is anywhere you pass interfaces to the class. For example, if you have an Account.load method that takes an IStream COM object you can control the data passed to it, and you can check error handling by having the different methods returning specific test-controlled error codes. Likewise if you have a music application that has a plugin architecture to read metadata (artist, title, genre, ...) from the file you can control what the plugin returns by setting a test plugin.
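By contrast, the plugin boundary is exactly where a stub earns its keep. A sketch (MetadataPlugin, StubPlugin, and Library are hypothetical stand-ins for the music-application example above):

    import unittest

    class MetadataPlugin:
        # Boundary interface: real plugins parse actual audio files.
        def read(self, path):
            raise NotImplementedError

    class StubPlugin(MetadataPlugin):
        # Test-controlled plugin: fixed metadata, no file I/O.
        def read(self, path):
            return {"artist": "Test Artist", "title": "Test Title"}

    class Library:
        def __init__(self, plugin):
            self._plugin = plugin

        def display_name(self, path):
            meta = self._plugin.read(path)
            return "%s - %s" % (meta["artist"], meta["title"])

    class LibraryTest(unittest.TestCase):
        def test_display_name_comes_from_plugin_metadata(self):
            lib = Library(StubPlugin())
            self.assertEqual(lib.display_name("song.ogg"),
                             "Test Artist - Test Title")

    if __name__ == "__main__":
        unittest.main()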

Test Driven Development (TDD) is about writing the tests and code in tandem. The idea is that the tests are the first user of the code. This does not negate the need for design; it complements it -- you are writing tests to cover each part of the design and then writing the code to meet that design requirement (e.g. adding the same value to a set only keeps one of that value). There is another testing methodology called Behaviour Driven Development (BDD) that focuses on the requirements and use cases, describing those in a way that they can be verified. BDD ended up in the same place as TDD even though it approached it from a different angle.

Whatever testing methodology you use is largely down to what works best for you. For some things, you will want to write the code first and not create the tests until you have something in a usable state. Here, you will likely use manual testing (by the developer) to ensure it is working. Then add the tests. For example, if you are creating a JavaScript engine or something similarly complex, it does not make sense to have tests until you have the code in place. Here, you can focus on the minimal functionality (e.g. processing "function test() {}").

The important thing is to have automated tests in place that cover as much of the code as possible, as early as possible.

It is perfectly valid to write the code using a simple working algorithm, use that to generate test data, check the test data is correct manually then add that into an automated test. This then allows you to optimize the algorithm as needed, or try different algorithms (e.g. different sorting algorithms) or different data structures (e.g. std::list vs std::vector or std::map vs std::unordered_map vs std::unordered_map with a custom hash algorithm) to see which one performs best for your setup.
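A minimal sketch of that reference-implementation technique (simple_sort and fast_sort are invented examples; in practice fast_sort would be your optimized algorithm):

    import random
    import unittest

    def simple_sort(values):
        # Obviously-correct reference: selection sort. Slow but trustworthy.
        out = list(values)
        for i in range(len(out)):
            j = min(range(i, len(out)), key=out.__getitem__)
            out[i], out[j] = out[j], out[i]
        return out

    def fast_sort(values):
        # Stand-in for the optimized implementation under test.
        return sorted(values)

    class SortEquivalenceTest(unittest.TestCase):
        def test_fast_sort_matches_reference(self):
            random.seed(1)  # fixed seed keeps the test reproducible
            for _ in range(50):
                data = [random.randint(-100, 100) for _ in range(20)]
                self.assertEqual(fast_sort(data), simple_sort(data))

    if __name__ == "__main__":
        unittest.main()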

Also, your automated tests should cover negative testing -- that is, testing for unexpected input. What if you pass a null pointer to an interface method? What if you pass an invalid range (e.g. 8π to a sine function)?
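A short hedged sketch of such negative tests (checked_sqrt is an invented example function, not from the comment above):

    import math
    import unittest

    def checked_sqrt(x):
        # Reject unexpected input explicitly instead of misbehaving later.
        if not isinstance(x, (int, float)):
            raise TypeError("checked_sqrt expects a number")
        if x < 0:
            raise ValueError("checked_sqrt expects a non-negative number")
        return math.sqrt(x)

    class NegativeTests(unittest.TestCase):
        def test_rejects_non_numbers(self):
            with self.assertRaises(TypeError):
                checked_sqrt(None)

        def test_rejects_negative_values(self):
            with self.assertRaises(ValueError):
                checked_sqrt(-1.0)

    if __name__ == "__main__":
        unittest.main()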

The first months everyone cheers as code coverage goes up and up; then after a while the mood sours as merges and refactorings somehow become more error-prone, and small nuisances pile up, like having to fix the commit of the other guy who just left for the weekend because he never ran the unit tests locally before committing and the errors were discovered on the build server.

Sometimes it's like the SW development team gets stuck on the "initial threshold", where costs are high and the benefits always seem near. But somehow, the cost never goes down and the benefits never really come. New testing frameworks, webcasts/webinars/courses, changed frameworks, new build servers, etc., like a stuttering car that never gets up to speed.

The problem here is not testing, but chasing every new framework/technology to come out.

Failing to police these will only lead to trouble -- mixing NUnit and MbUnit test projects; CppUnit, Google.Test and Boost.Test tests; having Spring.NET and Unity for different parts of the system; ....

The best approach would be to have a small team investigating and learning new technologies and frameworks related to the technologies, platform, and problem domain the company's applications use. This team would regularly report progress. It would provide rationales for any migration decision (e.g. needing to support a new platform), along with the associated risks and a migration strategy. This would then be reviewed by the manager and architect leads for each affected group. A decision would then be made, and implemented if the migration plan is accepted.

thecosimist wrote:

Other times the team gets past the threshold, but churns out large amounts of trivial tests that come with no benefit, just added maintenance overhead. The first months everyone cheers as code coverage goes up and up; then after a while the mood sours as merges and refactorings somehow become more error-prone, and small nuisances pile up, like having to fix the commit of the other guy who just left for the weekend because he never ran the unit tests locally before committing and the errors were discovered on the build server. Small nuisances + small benefits = everyone loses interest and TDD slowly dies.

Depends on your definition of trivial tests with no benefits. Even a simple min(a, b) function can be implemented incorrectly if you get the comparison sign wrong.
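For instance, a tiny hedged sketch (my_min is invented for this example) showing how even that trivial bug is caught:

    import unittest

    def my_min(a, b):
        # One flipped comparison sign would turn this into max();
        # the second assertion below catches exactly that mistake.
        return a if a <= b else b

    class MinTest(unittest.TestCase):
        def test_both_argument_orders(self):
            self.assertEqual(my_min(1, 2), 1)
            self.assertEqual(my_min(2, 1), 1)

    if __name__ == "__main__":
        unittest.main()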

Merging issues are not a result of TDD, but your branching/merging strategy. You will have the same issues when implementing a major new piece of functionality that affects multiple areas, or upgrading the toolchain.

If refactorings are too error-prone, you are not doing them correctly. See http://reecedunn.co.uk/cainteoir/tts/20 ... art-1.html -- basically, each refactoring should be isomorphic (can be applied in reverse -- rewrites don't count), small, self-contained, and complete. Yes, errors do happen. With proper refactoring, it should be easy to spot the mistake. Even better, the mistakes should be picked up when running the tests (which, being run on every build, point you to the faulty commit) or during code reviews (you do have code reviews).

Also, the unit tests should be part of the build process so the developer will run them when they build the code locally.

And your example about the dev leaving for the weekend also applies to committing a change that does not build (e.g. not checking in a new file), or where the application fails to run. These are not automated testing issues; they are policy/workflow issues.

thecosimist wrote:

And yet other times a TDD thrust leads to ambitious rewrites of old legacy code in order to make it testable, instead of actually writing the tests. But then you're basically off chasing that software "castle in the sky" - never a good idea to begin with, and eventually management intervenes.

So be pragmatic about it. Introduce tests when working on functionality in that area, or fixing a bug in that code, or for new code. Don't expect a 5-10 year codebase to automatically become testable overnight.

I find TDD to be a flawed concept to begin with. Your coding should be interface-driven, not test-driven, and interfaces come from your use cases.

I think TDD took hold and appeared to magically solve problems because people just would not write simpler code. And messier code means messier bugs. By requiring the developers to write tests, it forces them to write smaller and self-contained pieces of code. That automatically leads to more manageable, well-factored, and loosely coupled code, which leads to better quality.

The only value in forcing TDD is that it forces you to write code as per the guidelines in the design patterns. The quality of code increases not because of presence of tests, but because it inherently got written better.

The tests themselves actually add to the complexity and size of the code, easily making your coding effort 2x or 3x. And any code - be it tests or main feature - that gets written needs to be debugged and maintained. Thus any code is a liability and the goal should be to write as little code as possible. Rather than relying on own-written tests, there are better ways to achieve 'level-of-confidence'; such as building software via layers of abstractions.

Another point: writing meaningful test code is almost always far from obvious, even to the best of programmers. Your tests are only as meaningful as your understanding of the problem. While your code may have a clear-cut boundary (aka an interface to work with), the test cases are pretty much open-ended, limited only by how far you are willing to prod. When mandated, most developers will just roll in something to satisfy the requirement, thereby adding to a false sense of safety.

So instead of TDD, the mantra should be to build abstraction-based code: anything and everything that can be abstracted should be abstracted into a library, and over time, layers of such abstractions should get built.

When *I* was in school, there was this thing we did called "design". Instead of just hacking code together by the seat of our pants, we took out some paper and pencils and designed what our code modules were going to do. We'd start with pseudocode, maybe do some flowcharting, and get a really clear idea of what needed to be done and how it needed to be done before we fired up our IDE. We would also do any research we needed to do about libraries and techniques before we started coding. Before a single line of code was written, we'd have a rough idea what our interfaces were. This approach has so many benefits: your code ends up being well planned, organized, well thought out, easy to maintain... And your notebook becomes part of the documentation.

I took a short, tiny course in C once (when studying electronics), and this was what the teachers said we should do if we ever found ourselves in an actual project and not just programming LED dice. We ended up doing flowcharts anyway though, even if the code was tiny, because the teacher really wanted us to take to it. This was in 2009. So I had kinda assumed this is how it's done, but judging from your comments (and others here, of course), it probably isn't.

Sometimes it's like the SW development team gets stuck on the "initial threshold", where costs are high and the benefits always seem near. But somehow, the cost never goes down and the benefits never really come. New testing frameworks, webcasts/webinars/courses, changed frameworks, new build servers, etc., like a stuttering car that never gets up to speed.

The problem here is not testing, but chasing every new framework/technology to come out.

Failing to police these will only lead to trouble: mixing NUnit and MbUnit test projects; CppUnit, Google.Test and Boost.Test tests; having both Spring.NET and Unity for different parts of the system; and so on.

The best approach would be to have a small team investigating and learning new technologies and frameworks relevant to the technologies, platform and problem domain the company's applications use. This team would report progress regularly and provide a rationale for any migration decision (e.g. needing to support a new platform), along with the associated risks and a migration strategy. This would then be reviewed by the manager and architect leads for each affected group, and a decision made and implemented if the migration plan is accepted.

Agreed. But my point was not that chasing new frameworks is somehow inherent to TDD. It was more of an observation that this in fact often seems to happen. There are many frameworks out there, picking and introducing the right one comes with a cost, and even the best developers often (rightly or wrongly) backtrack and switch to a different framework. That is not unique to TDD, of course, but it IS an additional cost. And with additional costs - developers' time is pretty expensive - there's always someone who wants a quantifiable return on that cost.

msclrhd wrote:

Even a simple min(a, b) function can be implemented incorrectly if you get the comparison sign wrong.
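(For concreteness, here is a hypothetical Java/JUnit sketch of that flipped-sign bug and the small test that would catch it immediately. The class and method names are invented for illustration.)

    import static org.junit.Assert.assertEquals;
    import org.junit.Test;

    public class MinTest {
        // Hypothetical min() with the comparison sign flipped -- the bug described above.
        static int min(int a, int b) {
            return a > b ? a : b;   // BUG: returns the larger value; should be a < b ? a : b
        }

        @Test
        public void returnsTheSmallerValue() {
            assertEquals(2, min(2, 5));    // fails with the flipped sign (returns 5)
            assertEquals(2, min(5, 2));
            assertEquals(-1, min(-1, 0));  // fails with the flipped sign (returns 0)
        }
    }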

Merging issues are not a result of TDD but of your branching/merging strategy. You will have the same issues when implementing a major new piece of functionality that affects multiple areas, or when upgrading the toolchain.

If refactorings are too error-prone, you are not doing them correctly. See http://reecedunn.co.uk/cainteoir/tts/20 ... art-1.html -- basically, each refactoring should be isomorphic (applicable in reverse -- rewrites don't count), small, self-contained and complete. Yes, errors do happen, but with proper refactoring it should be easy to spot the mistake. Even better, mistakes should be picked up when running the tests (which, being run on every build, point you to the faulty commit) or during code reviews (you do have code reviews, right?).
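(As a hypothetical illustration of such a small, reversible step - an extract-method refactoring in Java, with invented names. Inlining the extracted method again reproduces the original code exactly, which is what makes the step easy to review and easy to back out of.)

    class OrderMath {
        // Before: the validation logic sits inline.
        double total(double price, int qty) {
            if (price < 0 || qty < 0) throw new IllegalArgumentException("negative input");
            return price * qty;
        }
    }

    class OrderMathRefactored {
        // After one small, reversible step: the same check, extracted into a named method.
        double total(double price, int qty) {
            requireNonNegative(price, qty);
            return price * qty;
        }

        private void requireNonNegative(double price, int qty) {
            if (price < 0 || qty < 0) throw new IllegalArgumentException("negative input");
        }
    }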

Also, the unit tests should be part of the build process so the developer will run them when they build the code locally.

I disagree on this one. I'm very skeptical that writing and maintaining unit tests for min(a, b) is a good idea. That, in my experience, is exactly the kind of thing that can kill off automated testing or TDD attempts. If you don't immediately discover that the comparison sign is backwards, then you are not performing even the most basic execution tests on your feature code. Adding a unit test for such stuff is wrong. If you are really serious about testing all min(a, b) methods, you will accumulate a growing amount of trivial code that adds no value to the product, along with increasing build/test times. After a while, when it starts to take maybe 5 minutes or more, some developers will start skipping the tests locally before committing. Again: my experience, not what I think is the right thing to do.

So sure, there are all kinds of follow-up questions to this. Five minutes to build and run tests? Dammit, you're doing it wrong, you have to modularize! OK, fair enough, so then some guys start splitting up the legacy code into modules, which takes more time and probably leads to more ripple effects. A new build automation tool. A different build server. A new project architecture.

And here is the problem: all of these changes are GOOD in themselves (probably), so no one can come up with a good argument for why, e.g., one should keep using Ant or keep the old build server. But that's the thing: all of this takes a lot of time and a lot of money - in my experience, often an uncontrollably growing cost. When you or management fail to see a return on all these costs (apart from assurances that the code is so much cleaner), people slowly lose interest, or management abruptly cuts off the TDD effort.

msclrhd wrote:

And your example about the dev leaving for the weekend also applies to committing a change that does not build (e.g. not checking in a new file), or where the application fails to run. These are not automated testing issues; they are policy/workflow issues.

I know, I agree. But my point is that it seems to happen anyway, and that it is an added cost. The dev may be in a hurry and may (irrationally!) not want to perform the whole build-test cycle, even where that cycle is a small two-minute addition to each iteration. Ideally, of course, the dev should never be in a hurry, and if they are in a hurry they should not commit stuff. But still it happens...

msclrhd wrote:

thecosimist wrote:

And yet other times a TDD thrust leads to ambitious rewrites of old legacy code in order to make it testable, instead of actually writing the tests. But then you're basically off chasing that software "castle in the sky" - never a good idea to begin with, and eventually management intervenes.

So be pragmatic about it. Introduce tests when working on functionality in that area, or fixing a bug in that code, or for new code. Don't expect a 5-10 year codebase to automatically become testable overnight.

I agree, very good point. That's the approach where I think automated tests have given the most return on investment: carefully selecting problem areas in the software - areas that are bug-prone, unstable, or change often. That's also why code coverage is so misleading. Pick the right area and be pragmatic about it.

And as other posters have noted, TDD cannot be a good substitute for good architectural design, good interface and data model design and a good specification.

Finally: I have yet to see an automated test catch a single relevant bug. It just never happens. The real bugs - the ones that actually irritate users and need to be fixed - typically outsmart both the common business logic AND the automated test code.

@msclrhd While I agree that nothing in "test first" or "test driven" should preclude design, many people who buy into test-first no longer see value in design. @KevinL's comment is the "straw man" position on test-first. Where I work, the dividing line seems to be about age 30. Below 30, the mantra is test, then code; I've actually seen some developers roll their eyes when design comes up as a possible activity. Above 30, it's analyze, design when needed, then tests and code. I say "design when needed" because some parts of Spring produce trivial designs (a controller that calls a service that calls a DAO), and drawing the same 3 or 6 boxes (depending on whether you are strict about writing an interface before the implementation) has little value. But that's specific to the context in which I work.

I do disagree that tests can just be written after the fact, though; they should be written in tandem with the code. Generally, once the work is done, developers rarely do a good job of writing tests to cover their code. I think the worst testing is done as an afterthought, because someone is checking a code coverage tool like Sonar to see that the code is "covered." Sloppy tests get produced when you're just trying to move the coverage needle, and they often fail to actually test anything: they pass as "green" but may only check that a value is returned, when they should be testing the range of that value or (for dynamic languages) that the value is of the correct type. For example, a test does not get written for the branch the developer marks with "// this should *never* happen" - the branch that winds up happening and failing anyway. That test doesn't get written because covering an empty branch doesn't move the coverage needle, but the test was still important.
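(A hypothetical Java/JUnit sketch of that difference - the discount function and all names here are invented for illustration. The first test moves the coverage needle and passes as "green" while verifying almost nothing; the second pins down the expected value and the valid range.)

    import static org.junit.Assert.*;
    import org.junit.Test;

    public class DiscountTest {
        // Hypothetical unit under test: a loyalty discount capped at 30%.
        static double discountFor(int loyaltyYears) {
            return Math.min(loyaltyYears * 2.5, 30.0);
        }

        // Coverage-chasing test: covers the line, but only checks that *a* value comes back.
        @Test
        public void sloppy() {
            assertNotNull(Double.valueOf(discountFor(4)));
        }

        // Meaningful test: checks the expected value and the boundaries of the range.
        @Test
        public void discountIsCappedAtThirtyPercent() {
            assertEquals(10.0, discountFor(4), 1e-9);
            assertTrue(discountFor(100) <= 30.0);
            assertTrue(discountFor(0) >= 0.0);
        }
    }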

" The main goal of automated unit tests is to find out if changes broke the code. However, 99 percent of the time what ends up breaking are the tests and not the code."

Unit tests are supposed to test the public interface, which should rarely change. During the system's design phase, most of your units and how they will work should be defined. If your public interfaces keep changing and forcing unit tests to change, then you need a better design phase.
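(A hypothetical Java sketch of what that looks like, with invented names: the test binds only to the public interface, so the implementation behind it can change without rippling into the test suite.)

    import static org.junit.Assert.assertEquals;
    import org.junit.Test;

    // Hypothetical public interface -- this is what the unit tests bind to.
    interface PriceCalculator {
        double totalFor(int units);
    }

    // The implementation can change freely (caching, lookup tables, ...)
    // without touching the tests, as long as the contract holds.
    class PerUnitPriceCalculator implements PriceCalculator {
        public double totalFor(int units) { return units * 9.99; }
    }

    public class PriceCalculatorTest {
        @Test
        public void chargesPerUnit() {
            PriceCalculator calc = new PerUnitPriceCalculator();
            assertEquals(19.98, calc.totalFor(2), 1e-9);
        }
    }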

Additional features should mostly be modular and require little to no change to the rest of the system.

Design it well in the beginning and you'll save time.

Can I get a "hell yeah!"?

This was my first thought as well: architecture and design > automated unit testing, if for no other reason than that proper architecture/design makes writing tests easy.

BTW, for those who think code can be too trivial to test: I work with someone in QA who said they once had a one-statement program that failed testing. It was assembler and was supposed to execute just one op code. However, it did not set the appropriate register values prior to returning. So it became a two-statement program once fixed.