BDD is slightly different from other test methodologies in that it’s designed to be used in cross-functional teams. In this post I will briefly touch on these differences, and then proceed to explain how you would change your approach to writing test code in accordance with the BDD philosophy with the help of an example.

The target audience of this blog post is test engineers first and product managers second. Note that I use these terms as roles rather than job descriptions; a test engineer is anyone writing test code, and a product manager is anyone thinking up features for the software. You could be both of them at once.

Specifications

In an idealized software development life cycle, where different phases of software development are clearly separated from each other, you would start with gathering requirements for your software product. Those requirements would be compiled into a functional specification that describes how the software should behave. The functional specification gets transformed into a technical specification that adds technical detail, such as what software components should exist, and how they should interact.

But some of that detail is derived directly from the functional specification. I will be using the example of a website that exposes its functionality to users only after login. What content the website contains does not matter for the purposes of this example; we will just be focusing on the login functionality.

In the functional specification you might have a sentence such as:

Users shall log in to the website with their email address and password.

That’s a fairly high-level description of the login feature, and it might be followed by a more detailed description of how the login form should be presented, and what users should see after successful or failed login attempts.

As a developer, whether you are to implement the website code or test code, that little detail raises more questions than it answers. How many characters should passwords have? Do we verify that users actually specify an email address, or do we just assume they do? Questions like that are answered in the technical specification:

Data entered into the email field of the login form is verified to be a valid email address according to RFC 5322 section 3.4.1 and related.

Data entered into the password field of the login form is verified to adhere to the following criteria: The minimum acceptable password length is 8 characters. Passwords must contain at least one punctuation mark, one upper-case letter, one lower-case letter and one digit.

Of course, some of those details may already be given in the functional specification. But it’s highly unlikely that you would see a reference to RFC 5322 in the functional specification — that’s the amount of detail you’d see in technical specs.
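As a rough illustration, the two technical requirements above could be checked in Ruby along these lines. This is a sketch under stated assumptions rather than a reference implementation: the method names are made up for this post, `URI::MailTo::EMAIL_REGEXP` from Ruby's standard library follows the HTML5 definition of an email address and only approximates RFC 5322, and "punctuation mark" is taken to mean the POSIX punctuation class.

```ruby
require 'uri'

# Approximation only: this pattern follows the HTML5 definition of an email
# address, which is narrower than RFC 5322 section 3.4.1.
def plausible_email?(text)
  URI::MailTo::EMAIL_REGEXP.match?(text)
end

# Checks the password criteria from the technical specification:
# at least 8 characters, with at least one punctuation mark, one upper-case
# letter, one lower-case letter and one digit.
# Assumption: "punctuation mark" means the POSIX [[:punct:]] class.
def acceptable_password?(password)
  password.length >= 8 &&
    password.match?(/[[:punct:]]/) &&
    password.match?(/[[:upper:]]/) &&
    password.match?(/[[:lower:]]/) &&
    password.match?(/[[:digit:]]/)
end
```

With these definitions, `acceptable_password?('Foo!bar1')` is true, while `acceptable_password?('Foo')` is false because the password is too short.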

Example 1: Simple Test

Whether you write test code or implementation code, you’ll start with some form of technical specs. It’s true that agile development practices seem to discourage the use of technical specifications, but to an extent that is an illusion: the technical specs exist, but not necessarily in a formalized document. They might exist simply in the form of test code.

So let’s write a simple test that would verify the login feature above. I’ll not provide exhaustive test code here, just enough to give you an idea of the things that might happen. For this test code I am going to use Ruby and Selenium WebDriver. Ruby is the language in which much of cucumber is implemented. Selenium offers a convenient way to drive browsers, whether from within cucumber or from plain Ruby code. Using both should be useful if you want to experiment with the later cucumber examples.

With that said, our test code should do the following:

Request the login page

Add an email to the email field

Add a password to the password field

Fire off the login button

Look for occurrences of success or failure messages on the resulting page
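The steps above might be sketched as follows. This is a minimal sketch, not a definitive implementation: the login URL, the element ids `email`, `password`, `login` and `success`, and the method names are all assumptions made for illustration. The `driver` argument is expected to be a Selenium WebDriver instance, for example one created with `Selenium::WebDriver.for(:firefox)` after `require 'selenium-webdriver'`. A real test would likely also need to wait for the result page to load.

```ruby
# Hypothetical URL of the login page; adjust to your own site.
LOGIN_URL = 'https://mywebsite.com/login'

# Performs one login attempt and reports whether the success element showed up.
# The element ids used here are assumptions, not taken from a real page.
def login_succeeds?(driver, email, password)
  driver.get(LOGIN_URL)                                    # request the login page
  driver.find_element(id: 'email').send_keys(email)        # fill in the email field
  driver.find_element(id: 'password').send_keys(password)  # fill in the password field
  driver.find_element(id: 'login').click                   # press the login button
  driver.find_elements(id: 'success').any?                 # look for the success message
end

# Raises if the observed outcome differs from the expected one.
def test_login(driver, email, password, expect_success)
  actual = login_succeeds?(driver, email, password)
  return true if actual == expect_success

  raise "login as #{email} with #{password}: expected #{expect_success}, got #{actual}"
end

# Runs the same test with several inputs; only the first should succeed.
def run_login_tests(driver)
  test_login(driver, 'test@mywebsite.com', 'Foo!bar1', true)
  test_login(driver, 'test@mywebsite.com', 'Foo!bar2', false)
  test_login(driver, 'test@mywebsite.com', 'Foo', false)
end
```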

The above code snippet already applies some reusability principles by wrapping test code into a function, and executing the same test with several different inputs, some of which you would expect to fail.

The snippet should also illustrate that none but a developer would really understand the test code. To developers, the last three lines document that the first test case is expected to succeed, whereas the other two are expected to fail.

To put that into the context of the test code example above, BDD tries to bridge the gap between test code and prose specifications by introducing a formal language in which test cases and expected outcomes are to be written. The compromise is that when writing specifications, a little formalism is required, without going into full-blown technical specifications.

Example 2: Gherkin

In cucumber, that language is called gherkin (what else?). If I were to write a gherkin specification for the login feature, that might end up looking something like this:

Feature: Login
  As a website user
  In order to access the website content
  I need to log in to the website

  Scenario: valid credentials
    Given I am on the login page
    When I provide the email address "test@mywebsite.com"
    And I provide the password "Foo!bar1"
    Then I should be successfully logged in

  Scenario Outline: invalid credentials
    Given I am on the login page
    When I provide the email address <email>
    And I provide the password <password>
    Then I should not be logged in

    Examples:
      | email              | password |
      | test@mywebsite.com | Foo!bar2 |
      | test@mywebsite.com | Foo      |

Now I don’t want this post to turn into a cucumber tutorial; there are plenty out there already. But there is one really important bit to take from the above example: cucumber allows you to hook up the above to some executable Ruby code.

If you look at a step such as:

When I provide the email address <email>

it should become obvious that this corresponds very closely to the line of test code in Example 1 that enters an email address into the login form.

In other words, the aspect of the test code that documents what is to be tested has been made human-readable. The aspects that determine how that test is to be performed remain test code.
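To make that split concrete, here is a toy sketch of the mechanism in plain Ruby. It is emphatically not cucumber's implementation or API: `StepRegistry`, `define` and `run` are hypothetical names invented for this illustration, and the element id `email` is an assumption. It only shows the idea that a human-readable step is matched against a pattern, and the captured values are handed to ordinary test code.

```ruby
# Toy stand-in for a BDD runner (hypothetical, not cucumber's API):
# it maps patterns for step text to blocks of test code.
class StepRegistry
  def initialize
    @steps = []
  end

  # Registers a block of test code for every step matching the pattern.
  def define(pattern, &block)
    @steps << [pattern, block]
  end

  # Finds the first matching pattern and calls its block with the driver
  # and the values captured from the step text.
  def run(step_text, driver)
    @steps.each do |pattern, block|
      match = pattern.match(step_text)
      return block.call(driver, *match.captures) if match
    end
    raise ArgumentError, "no definition for step: #{step_text}"
  end
end

STEPS = StepRegistry.new

# The "what" lives in the pattern; the "how" lives in the block.
STEPS.define(/\AI provide the email address "([^"]*)"\z/) do |driver, email|
  driver.find_element(id: 'email').send_keys(email)
end
```

Calling `STEPS.run('I provide the email address "test@mywebsite.com"', driver)` with a Selenium driver would then type the address into the email field. In cucumber itself, step definitions are written with `Given`, `When` and `Then` blocks instead.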

Cucumber for Cross-Functional Teams

The real power in making test scenarios human-readable is that it becomes reasonable to assume that a person in a product management role can write specifications in the way developers need, and that people in an engineering role can take these specifications and start writing test and implementation code.

The need to translate from functional specifications to technical specifications to code is, to a degree, done away with. It’s replaced by a “compromise” language; one that requires just enough formality to work for semi-technical specifications, but has just enough leeway to allow non-technical people to communicate in the language they understand.

In addition, cucumber provides just enough “magic” to make these human-readable test scenarios executable, thereby avoiding the eternal problem of documentation lagging behind code, or vice versa. More specifically, by coupling documentation (feature and scenario descriptions) and code very tightly, the code is made to fail if the documentation and code diverge.

It should be noted that the idea of coupling documentation and code more tightly is by no means new. Donald Knuth introduced the idea of literate programming in the early 1980s, albeit without imposing any formalism on the documentation language. The result was that the documentation was just noise to the computer, to be ignored, which prevented the computer from ensuring that documentation and code do not diverge.

Example 3: Too Much Engineering

Because gherkin can be seen as a compromise language, there is room for pitfalls when describing test scenarios. With an engineering mentality, the above gherkin example could be shortened considerably like so:

Feature: Login
  As a website user
  In order to access the website content
  I need to log in to the website

  Scenario Outline: various credentials
    Given I am on the login page
    When I provide the email address <email>
    And I provide the password <password>
    Then the element with id "success" <should_appear>

    Examples:
      | email              | password | should_appear |
      | test@mywebsite.com | Foo!bar1 | true          |
      | test@mywebsite.com | Foo!bar2 | false         |
      | test@mywebsite.com | Foo      | false         |

In terms of brevity, this clearly trumps the previous example. However, it introduces a level of technical detail into the specifications that really should not appear there. By phrasing the last step not as success or failure, but in terms of whether or not a page element is visible, you would make the following mistakes:

You specify implementation details in the form of the element id. Now you cannot change the id of the element without also changing the specification.

You bind yourself to testing a web page. It might be that the customer requirements change; that instead of a website they want a desktop application. By phrasing success and failure in terms of web page elements appearing or disappearing, you will need to revisit the specification if requirements change in this manner.

You also introduce implementation language-specific elements into the documentation. You might decide to switch the language used for writing test code from Ruby to Python or vice versa, where boolean data types might be defined differently.

In other words, you not only require more technical knowledge from the product managers, you also make it harder to change implementation details. Neither is desirable.
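One way to keep such details out of the specification is to confine them to a single place in the test code. The sketch below is an assumption-laden illustration, not part of cucumber or Selenium: the `LoginResult` class and the element id `success` are hypothetical. A step such as "Then I should be successfully logged in" could call `logged_in?`, and a renamed element would then only require changing one constant.

```ruby
# Hypothetical helper that hides how "logged in" is detected on a web page.
class LoginResult
  # Assumed id of the element shown after a successful login.
  SUCCESS_ELEMENT_ID = 'success'

  def initialize(driver)
    @driver = driver
  end

  # True when the success element is present on the current page.
  def logged_in?
    @driver.find_elements(id: SUCCESS_ELEMENT_ID).any?
  end
end
```

If the product later became a desktop application, only this helper would need a different implementation; the wording of the scenario could stay the same.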

Example 4: Too Little Structure

On the other end of the spectrum is the pitfall that your test scenarios might contain too little structure to be implemented easily in test code. Consider this gherkin definition:

Feature: Login
  As a website user
  In order to access the website content
  I need to log in to the website

  Scenario Outline: invalid credentials
    When I try to log in as <email> with <password> that should fail

    Examples:
      | email              | password |
      | test@mywebsite.com | Foo!bar2 |
      | test@mywebsite.com | Foo      |

It’s not as if the above could not be implemented in test code. Of course it can.

But there is so little structure in this description that it leaves questions open. Should a login form exist on every page, or is there a specific login page? Should both email and password be provided in the same form, or is this an operation with several steps? What is failure, exactly?

It doesn’t take a genius to fill in the details that would answer those questions. Either the (test) engineer or the product manager could. But in this case, the login example is flawed because it is just too simple to reflect real test cases very well.

Writing software is a fairly complex task, and requires very structured thinking. While engineers tend to be good at structured thinking, mostly because they practice it a lot, they might be a little too good at it… and could easily make assumptions about how your software should work that aren’t shared by the intended users of the software.

There is real value in letting a non-technical person describe how software should work in non-technical terms, but with just enough semi-technical detail that engineers can’t run off and build rocket engines when a bicycle would have sufficed. So break down your test scenarios into smaller steps that engineers have to follow. Doing so also exposes how much thought has to be put into designing software behaviour, and can catch mistakes in that area early.

Conclusion

Behaviour-driven development in general, and cucumber in particular, are excellent tools, any agile software philosophies aside. They help balance how much the behaviour of software must be formalized before development can begin, thereby avoiding both waterfall models of software development, and the complete lack of any formal specification that all too eager agile proponents might arrive at.

They can also serve as an aid to communication. Gherkin is a language that all members of a cross-functional software development team can learn and master in a short amount of time.

Formalizing specifications in the amount of detail required by gherkin also gives both test engineers and implementation engineers just enough information to get started, allowing for some parallelization of these tasks.

I would recommend approaching your spec from the perspective of user flow and writing acceptance tests instead of a simple verbal spec. Having been on both the design/product management and the development side of the spectrum, I’ve become a huge fan …