Automated Acceptance Testing

I'm working on some slides for the "Practical DevOps" workshop I'll be giving this week, namely for the testing portion. It's mostly based on Testing Done Right, and one area I'm going into a bit deeper is testing automation. One point I'm making is that code that has 100% test coverage could be 100% worthless without acceptance tests - and you sure as shit can't automate those.

Against my better judgment, I made the mistake of googling "automated accepting tests", only to find the absolute absurdity that is FitNesse.

Automated Acceptance Tests: Building the Right Code

FitNesse automated acceptance tests are power tools for fixing a broken requirements process. Skillfully applied, such tests make it possible to avoid the problems of ProjectDeathByRequirements. (Note: if you have not yet done so, you should probably first get a quick intro on FitNesse tests at the TwoMinuteExample.)

I know that FitNesse has been around for a while now (and hopefully no one actually uses it for "automated acceptance testing"), but I had to take a moment to appreciate yet another example of a poor solution to a misunderstood problem created by the "best and brightest" in our industry.

At some point, usually late in the project, the team discovers that among other problems, they are finding one or more of the following problems with the features being delivered:

They are not exactly what the customers/analysts/product managers think they asked for.

They are not exactly what those folks wanted or needed.

They are not useable by the system's eventual users.

...with the solution being to automate (acceptance) tests to ensure that the right thing has been developed and works as it should.

Surely if you can define the inputs and expected outputs for Fitnesse then you can define them in the Business Requirements documentation. I can't see how Fitnesse can possibly fix the three points listed above, especially the first two which stem from a disconnect between what was asked for/needed by the users and what was understood to be required (and therefore delivered) by the developers.

Ah, but that's the beauty of it all.

As you can see in the [url=http://fitnesse.org/FitNesse.UserGuide.TwoMinuteExample]Two Minute Example[/url], Fitnesse [i]automatically finds erroneous requirements for you[/i]!

How? By testing the requirements against the product, that's how!

@TwoMinuteExample said:

What about red? A cell turns red when we get back a different value than what we expected. We also see two values: the expected value and the actual value. Above we expected 33 back when we divided 100 by 4, but we got back 25. Ah, a flaw in our test table. That happens!
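For anyone who hasn't clicked through, the mechanics of that example are trivial to sketch. This is plain Python standing in for FitNesse's fixture machinery (the names are mine, not FitNesse's actual API): each table row supplies inputs plus an expected output, and a cell goes "red" when the actual value differs.

```python
# Plain-Python sketch of the TwoMinuteExample's column-fixture idea:
# each row gives inputs plus an expected output; a mismatch is a "red" cell.
# (Illustration only; this is not FitNesse's actual fixture API.)

def divide(numerator, denominator):
    return numerator / denominator

# numerator | denominator | quotient?
table = [
    (10, 2, 5),
    (100, 4, 33),  # the doc's own deliberate mistake: expected 33, actual 25
]

def run_table(rows):
    results = []
    for numerator, denominator, expected in rows:
        actual = divide(numerator, denominator)
        colour = "green" if actual == expected else "red"
        results.append((colour, expected, actual))
    return results

for colour, expected, actual in run_table(table):
    print(colour, "- expected:", expected, "actual:", actual)
```

The second row reproduces the doc's own 100 ÷ 4 = 33 mistake, which is the entirety of the "automatically finds erroneous requirements" trick.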

Against my better judgment, I made the mistake of googling "automated accepting tests", only to find the absolute absurdity that is FitNesse.

I wouldn't call it an absolute absurdity. It appears to be trying to be an integration testing suite, which is a reasonable thing. I think we all agree that marketing it as acceptance testing is a WTF. The idea of allowing someone other than developers to be able to specify tests definitely has some value. I'm not interested enough to dig through to figure out how well it does integration testing, however.

The idea of allowing someone other than developers to be able to specify tests definitely has some value.

Having an independent programmer working on it makes more sense, especially if they're the sort of person who likes to actually work with users to understand what they really want. Having it be a programmer (even if their job title says “test engineer” or something) means that they will write tests that actually make sense, and having them actually work with users means that the tests check something meaningful from the perspective of the overall project.

Alas, most programmers are not fond of actually understanding users and their requirements. That's the ultimate fount of so many WTFs…

FitNesse is an open source project. The code base is not owned by any company. A lot of information is shared by the FitNesse community. It's extremely adaptable and is used in areas ranging from Web/GUI tests to testing electronic components.

I'm not interested enough to dig through to figure out how well it does integration testing, however.

Don't bother. It's just a way of handling integration testing that's far more annoying than normal.

For these fancy test tables to work, you have to write a lot of crap to support them. So, essentially, you do all your test setup and teardown work in a normal test framework, then chuck in a whole bunch of test helper methods, then you can make the table out of those.

Basically, you end up with something that is much more annoying to write and maintain than regular integration tests, but is too close to the actual implementation for it to ever be useful or interesting for the people writing requirements.

It's like someone took a concept like Cucumber and misunderstood it. Cucumber works because business people can, most of the time, read and understand the tests that describe feature acceptance. Letting them try to write them is a horrible exercise.
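To make the glue-code complaint concrete, here's a rough sketch (plain Python; the fixture and helper names are invented for illustration) of what you end up writing before a single table row can run:

```python
# Sketch of the complaint above: to support a "simple" table you first write
# ordinary test plumbing (setup, teardown, helpers), and the table rows are
# only a thin layer over those helpers. All names here are made up.

class OrderFixture:
    def setup(self):
        self.db = {}  # stand-in for real environment setup

    def teardown(self):
        self.db.clear()

    # helper methods the table columns map onto
    def set_quantity(self, n):
        self.quantity = n

    def set_unit_price(self, p):
        self.price = p

    def total(self):
        return self.quantity * self.price

def run_rows(rows):
    fixture = OrderFixture()
    results = []
    for quantity, price, expected in rows:
        fixture.setup()
        fixture.set_quantity(quantity)
        fixture.set_unit_price(price)
        results.append(fixture.total() == expected)
        fixture.teardown()
    return results

# the "table": quantity | unit price | total?
print(run_rows([(2, 3, 6), (4, 5, 20)]))
```

Everything above the "table" is exactly the setup/teardown/helper work you'd write for ordinary integration tests anyway; the table is just a second syntax on top.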

Fitnesse was around before Cucumber, IIRC. Both Cucumber and Fitnesse address the same issue -- provide a way to run your program, or a portion of your program, with inputs and assertions that are written using the terms, conditions, and rules of the problem domain, in plain English, rather than code. This is in the hope that the tests/specifications are human readable, and can be discussed by both the BA and the developer without having to actually look at code.

Fitnesse, in my opinion, failed at this. The table-driven tests are not human readable (at least not easily) whereas the Cucumber specs look a lot more like the specs that people actually write in their design documents.

But in any case, the utility of the automated acceptance test has been demonstrated over and over.
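The plain-English matching both tools aim at can be sketched in a few lines. This is a toy, not Cucumber's or Fitnesse's actual API: step lines are matched against regex patterns and dispatched to small step functions.

```python
import re

# Toy sketch of Cucumber-style step matching: plain-English lines are matched
# against regex patterns and dispatched to small step functions.
# (Hypothetical; this is not Cucumber's real API.)

STEPS = []

def step(pattern):
    def register(fn):
        STEPS.append((re.compile(pattern), fn))
        return fn
    return register

@step(r"I divide (\d+) by (\d+)")
def do_divide(ctx, a, b):
    ctx["result"] = int(a) / int(b)

@step(r"the result should be (\d+)")
def check_result(ctx, expected):
    assert ctx["result"] == int(expected), ctx["result"]

def run(scenario):
    ctx = {}
    for line in scenario:
        for pattern, fn in STEPS:
            match = pattern.search(line)
            if match:
                fn(ctx, *match.groups())
                break

run([
    "When I divide 100 by 4",
    "Then the result should be 25",
])
```

The scenario text is readable by a BA; the step functions are still written by a programmer, which is the point made above about who should actually author the tests.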

Fitnesse was around before Cucumber, IIRC. Both Cucumber and Fitnesse address the same issue -- provide a way to run your program, or a portion of your program, with inputs and assertions that are written using the terms, conditions, and rules of the problem domain, in plain English, rather than code.

Cucumber was just a random example of something someone might do to help business and tech people understand each other better in acceptance testing.

Well, someone not suffering from blunt trauma or alcohol poisoning. I wasn't suggesting that it was a ripoff of Cucumber.

This is in the hope that the tests/specifications are human readable, and can be discussed by both the BA and the developer without having to actually look at code.

I'm not sure what you're suggesting here.

Fitnesse, in my opinion, failed at this. The table-driven tests are not human readable (at least not easily) whereas the Cucumber specs look a lot more like the specs that people actually write in their design documents.

But in any case, the utility of the automated acceptance test has been demonstrated over and over.

Fitnesse is a failure, not because the tables are hard to read, but because they do exactly fuck all of any use to anyone. Cucumber works because it's just a way to make the functional tests you'd write anyway fit more of a natural language pattern.

It's basically a tool for getting in the way of development, like JIRA or Maven. It's not really designed for a valuable purpose, rather it's more fodder for the sort of obstructive, drooling perverts who get unnaturally excited by spreadsheets.

So I just looked up the definition of acceptance testing from Alex's article:

Acceptance Testing – formal or informal testing to ensure that functional requirements as implemented are valid and meet the business need

What's wrong with automating that? I'm not saying that manual testing can be replaced entirely (neither is Fitnesse), or that Fitnesse is useful for that task (although that wasn't the point), but if I can formally define parts of my requirements and needs, I can automate the testing. Which means, if a developer breaks something, he gets feedback immediately, and not only after someone bothered to click through some old features. So that's good, right?

It's basically a tool for getting in the way of development, like JIRA or Maven. It's not really designed for a valuable purpose, rather it's more fodder for the sort of obstructive, drooling perverts who get unnaturally excited by spreadsheets.

Wow -- written by someone who clearly has an opinion. I'm curious -- gaffer, is this because you haven't actually used cucumber; or because you have used it and it didn't live up to expectations? I'm sure there is a story there somewhere. Please elucidate. Thanks.

What's wrong with automating that? I'm not saying that manual testing can be replaced entirely (neither is Fitnesse), or that Fitnesse is useful for that task (although that wasn't the point), but if I can formally define parts of my requirements and needs, I can automate the testing. Which means, if a developer breaks something, he gets feedback immediately, and not only after someone bothered to click through some old features. So that's good, right?

Yep, this is entirely possible, just within realistic parameters.

I used Cucumber as an example because it's approaching the problem in a reasonable way. That way is essentially to have people who know what the fuck they're doing write the tests in a way that's more accessible to clients, rather than the Fitnesse approach, which seems to include a rather large "magic happens here" step.

Wow -- written by someone who clearly has an opinion. I'm curious -- gaffer, is this because you haven't actually used cucumber; or because you have used it and it didn't live up to expectations? I'm sure there is a story there somewhere. Please elucidate. Thanks.

More written by someone who wasn't being too careful with the flow of the post.

I've used Cucumber, and it's been great. There's often a small stumble at the beginning when you realise it's not quite as magic as some people make it look in the Rails screencast type videos, but it's really nice to use once you get a handle on what you need to do and what's done for you.

What's wrong with automating that? I'm not saying that manual testing can be replaced entirely (neither is Fitnesse), or that Fitnesse is useful for that task (although that wasn't the point), but if I can formally define parts of my requirements and needs, I can automate the testing.

The key here is that acceptance testing is against the actual business need. And that is not a formal specification; it's the actual work that should be done. Whatever you write down in a specification, it still remains to be checked that it properly reflects what the users need to do. And that is acceptance testing, and it can only be done in trial operation.

The key here is that acceptance testing is against the actual business need. And that is not a formal specification; it's the actual work that should be done. Whatever you write down in a specification, it still remains to be checked that it properly reflects what the users need to do. And that is acceptance testing, and it can only be done in trial operation.

Emphasis mine. I agree with the sentiment, Bulb, but I think that properly written requirements should define what the system should do.

(This is how we do it where I work; I'm not defining how it happens everywhere...) The Functional Requirements should define what needs to be tested during System Acceptance Testing (SAT) and the Business Requirements should define what needs to be tested during User Acceptance Testing (UAT) and should be signed off by the business as being a full and accurate description of what the system should be able to do*.

If the Business Requirements are well written then they will provide a complete list of things that the users do that need to be tested. The test scripts can then be written and cross-referenced against the Business Requirements to ensure everything that needs to be tested is actually being tested. UAT does need to be done by the people who will be using the system, although I have been to places where this is not the case.

I've worked as Test Manager on a couple of projects where a talented BA has mapped requirements from users who know what they are doing, and it makes formulating the UAT phase a breeze. The tests don't always pass, but at least you have confidence that it's doing what it should (or at least trying to).

*Documenting this properly is obviously the part that takes skill and effort and, as a result, is not always done well.
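The cross-referencing described above is easy to automate as a traceability check. A minimal sketch, with made-up requirement IDs and test names:

```python
# Sketch of a requirements-to-test traceability check: tag each test script
# with the requirement IDs it covers, then assert every requirement is hit.
# Requirement IDs and test names here are invented for illustration.

requirements = {"BR-001", "BR-002", "BR-003"}

test_scripts = {
    "test_login": {"BR-001"},
    "test_create_record": {"BR-002", "BR-003"},
}

def uncovered(requirements, test_scripts):
    covered = set()
    for ids in test_scripts.values():
        covered |= ids
    return requirements - covered

missing = uncovered(requirements, test_scripts)
if missing:
    print("Requirements with no test:", sorted(missing))
else:
    print("Every requirement is covered by at least one test.")
```

This only proves each requirement has *a* test, not that the test is any good -- which is exactly why the documenting still takes skill and effort.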

Seems the actual point is being missed. I use FitNesse (though I prefer a similar tool from a different vendor) on a regular basis and it has dramatically reduced the number of defects in certain areas.

The key element is that many approaches have a separation between the "requirements document", the "acceptance test", and the automation. When ALL THREE come from a single source, there cannot be conflicts between the three.

For a good read on the subject (that is purely conceptual, and deliberately does not involve specific tooling) read "Specification By Example".

Anything that's named "Specification By Example" frightens the shit out of me. I've been on the receiving end of too many "by example" specs that have gone horribly wrong. My most recent was an XML interchange format where the example was misinterpreted by the other team as an example text file with funny formatting, causing them to (poorly) implement an XML parser by hand instead of simply telling us they didn't support XML.

Of course any technique can be abused. But I would much rather have one authoritative source (a machine-processable data source, or MPDS) that can generate all of the artifacts (documents, automated tests, acceptance criteria, et al.) than have to manually "parse" various documents, then again parse the results of the first pass (much like the children's game of telephone, where the end result is rarely the original starting condition).

One source for specifications is great. That source being a bunch of happy-path examples with no boundary conditions is a disaster. I wish every BA was an engineer and knew about degrees-of-freedom and the concepts of over-constrained and under-constrained systems.

I agree with your last part, and consider that an abuse of the technique. If you read the book, there are large sections dedicated to such things as ensuring boundary (and exclusion) cases are all covered in the "examples". In short, one needs to cover the "sad" and "angry" paths as well.

The good news is that looking at (properly calculated) cyclomatic complexity and code coverage (along with some mental logic) makes it fairly easy to discover what additional input conditions are needed. The appropriate person (usually the product owner working with the customer) can define what the expected outputs are. This new information then automatically flows through all of the artifacts.
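A minimal sketch of the complexity half of that claim, using Python's ast module to count decision points (the usual decisions-plus-one approximation, not a full control-flow analysis; the shipping_cost function is a made-up example):

```python
import ast

# Naive cyclomatic-complexity estimate: count decision points in the source
# and add one (the common approximation, not a real control-flow graph).
# shipping_cost is an invented example with three paths.

SOURCE = """
def shipping_cost(weight, express):
    if weight <= 0:
        raise ValueError("bad weight")
    if express:
        return 10 + weight * 2
    return 5 + weight
"""

def naive_complexity(source):
    tree = ast.parse(source)
    decisions = sum(isinstance(node, (ast.If, ast.While, ast.For, ast.BoolOp))
                    for node in ast.walk(tree))
    return decisions + 1

# Complexity 3 suggests at least three example rows:
# bad weight, express shipping, and the normal case.
print("estimated complexity:", naive_complexity(SOURCE))
```

Coverage tooling then tells you which of those paths your current examples actually exercise; the product owner supplies the expected outputs for the missing ones.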

Isn't the point of Fitnesse to let users/stakeholders run the integration/acceptance tests to verify that the program works as expected, without needing to know how to run tests from the IDE?

Apparently? It's yet another case of problem shifting. "If only we could get users/stakeholders to write their specifications in a highly formal language, we could then write a tool to parse and interpret that language to perform computer functions."

For a good read on the subject (that is purely conceptual, and deliberately does not involve specific tooling) read "Specification By Example".

Then file it under "ideas that sound good on paper if you don't think about it too much, but help consultants rack up tons of billables and ensure the entire application infrastructure falls apart after a few short years, but long enough so that you hire the same set of consultants to implement a new stupid idea."

You wouldn't think that category would be so plush with books/ideas, but it's actually the 3rd most popular. It's surpassed by "stupid shit invented by hipster developers to solve problems they don't actually understand in an effort to cure boredom at their day job because they're a bunch of fucking divas who refuse to solve actual business problems."

Ye all may scoff. Over the past 18 months of using this approach on our projects, "Tree Swing" http://static.monolithic.com/pres/tree_swing/treeswing.jpg defects are down by over 60%, and both requirements gathering/documenting and acceptance testing costs are each down by over 25%. Two clients who have adopted this approach are seeing trends with similar slopes (they have been doing it for a shorter period of time, so the aggregate benefit has not reached maximum).

Apparently? It's yet another case of problem shifting. "If only we could get users/stakeholders to write their specifications in a highly formal language, we could then write a tool to parse and interpret that language to perform computer functions."

Hmm, what would we call it? Perhaps "programming"?

Fitnesse is meant for testing, not creating the app. The idea is you let the nontechnical users express what the app should do (the requirements) and then run it against the app's API to see pass/fail, so they know if it's doing what the requirements need it to do. It doesn't let them program, it lets them say "When I enter 2 for the first number, and 2 for the second number, and press the "Add" button, the result should be 4" and verify that the app does that without needing any special tools or having to know some kind of scripting/coding to provide those instructions.

Can we just agree, as an industry, that the only way to PROPERLY acceptance test a piece of software is to plop it down in front of one of the customer's employees and ask, "hey, can this software do everything you need? Is it acceptably fast and stable?" Let's just make that the fucking definition. And no, it can't be "unit tested".

Goddamned this whole thing sounds like a scheme from some basement-dweller to avoid talking to another human being for more than 5 minutes. "It's time for the acceptance test, here's Sheri the records employee who will be using the--" "A GIRL! I don't know how to talk to a GIRL! Here have her fill out this 50,000 page form of unit tests first. A GIRL!"

"This software is beyond worthless, why does it overlay a picture of a giant orange... dildo?... when I'm trying to type in something?"

"Actually, that's a carrot, and it was in the requirements."

"Caret, you fuckwit. You know, that upside-down V symbol? Not the vegetable."

"Well... it's already passed acceptance testing, so enjoy!"

Alex,
If a project is to be rolled out quickly, get the business user and the developer in a single room and don't let them leave until:
a) one of them has killed the other, or
b) both of them have agreed on the requirements and the project is complete.