Basil Vandegriend: Professional Software Development
http://www.basilv.com/psd
Thu, 07 Apr 2016 02:02:00 +0000

Running Java Unit and Integration Tests Separately
Sun, 08 Nov 2015 15:02:19 +0000
http://www.basilv.com/psd/blog/2015/running-java-unit-and-integration-tests-separately

Eclipse and Maven are not designed from the ground up to run automated integration tests separately from unit tests. This is a problem because integration tests typically take longer to run, so when coding, especially when doing test-driven development, you need to frequently run just the unit test suite from Eclipse.

Maven's default convention places all tests in a single test source directory (src/test) and runs them as a single group. Maven does support integration tests through the Failsafe plugin, using the convention that such tests are named "**/*IT.java" or "**/IT*.java" and are located in the test source directory along with the unit tests.

This causes a problem within Eclipse, whose primitive JUnit test runner only allows running multiple tests based on a single project, package, or source folder. So there is no way to select unit tests and exclude integration tests within Eclipse based on Maven's conventions.

There are several options for resolving this difficulty, all of which involve changing the configuration of the tests. I recently set up a new Java project where I reevaluated these alternatives and discovered a new approach that I am quite happy with compared to the other options. The basic issue I have with changing test configuration is that it imposes extra work on the creation of each integration test.

One common solution is to move all integration tests into a separate source folder (e.g. src/it). Eclipse can then run the desired set of tests based on the source folder. The problem is that you now need three directory trees for each tested package (one for src/main, one for src/test, and one for src/it), and you cannot easily get an overview of the tests for a single class/package due to the tests being split across two source folders.

An alternative solution is to mark each integration test class with a JUnit category and then configure both the unit and integration run configurations in Eclipse to filter by this category. The problem I have with this is that it duplicates information (each integration test class name already indicates this), and you need to remember to do it for each new integration test.

The new approach I found involves using the JUnit Toolbox library to define separate unit and integration test suite classes that use a wildcard naming pattern to automatically filter integration tests by their name, just like Maven. Eclipse test runners can then be configured to run these individual test suite classes. The integration tests themselves do not need to be changed. An added benefit of this approach is that you can optionally choose to execute the tests in parallel, resulting in a faster runtime for the overall test suite (and potentially 'free' testing for concurrency issues). In my simple project I observed the unit test suite runtime cut in half after switching to the parallel test runner.

Create a test suite class to run only unit tests. The code below is configured to run the tests in parallel: switch the ParallelSuite class to the WildcardPatternSuite class to run sequentially instead.
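A sketch of such a suite class, assuming JUnit 4 and the JUnitToolbox library are on the classpath (the class name AllUnitTests and the pattern strings are illustrative and match Maven's naming convention):

```java
import org.junit.runner.RunWith;

import com.googlecode.junittoolbox.ParallelSuite;
import com.googlecode.junittoolbox.SuiteClasses;

// Runs every *Test class in parallel while excluding *IT integration
// tests. Swap ParallelSuite for WildcardPatternSuite to run the tests
// sequentially instead.
@RunWith(ParallelSuite.class)
@SuiteClasses({"**/*Test.class", "!**/*IT.class"})
public class AllUnitTests {
}
```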

Create an Eclipse JUnit run configuration that runs as a single test the AllUnitTests class created in the prior step.

A similar test class can be created for running just the integration tests, if desired.

]]>http://www.basilv.com/psd/blog/2015/running-java-unit-and-integration-tests-separately/feed0Most Disturbing Codehttp://www.basilv.com/psd/blog/2011/most-disturbing-code
http://www.basilv.com/psd/blog/2011/most-disturbing-code#commentsMon, 27 Jun 2011 13:32:07 +0000http://www.basilv.com/psd/?p=662One question I often ask when giving job interviews is "What do you find most disturbing when reviewing code?" The answers I receive are especially interesting when compared to the interviewee's results doing an actual code review: it is rare for them to identify the problems they consider the most disturbing. This lack of congruence between what people say they do and what they actually do is not unusual - it is a common problem, for example, when using market focus groups.

This chain of thought then prompted me to ask myself this same question. What do I find most disturbing when reviewing code? My instinctive reaction was to answer "defects", but upon a little reflection I realized this was not true - expecting to find defects is a fundamental principle of quality. So it usually does not bother me to find a few during a review.

There are times when I am very disappointed when reviewing code - what am I finding at those times? Here are some specific occurrences:

Code riddled with defects reflecting a fundamental lack of understanding about the requirements.

Code very difficult to understand due to poor names and overly complicated logic that seemed repetitive or unnecessary.

A large amount of non-GUI code written without any supporting tests.

Code with inconsistent formatting and style.

What is the common theme here? After further reflection, I realized that the common element underlying these situations that I find most disturbing is a lack of professionalism / craftsmanship. This can manifest in a number of ways, as indicated by the above list. The key criterion is whether a developer is helping achieve or is hindering our mission as software developers, based on the code they produce. My evaluation of what is most disturbing is at its essence based on my core values and beliefs concerning software development.

Streaming Data to Reduce Memory Usage
Thu, 05 May 2011 13:01:52 +0000
http://www.basilv.com/psd/blog/2011/streaming-data-to-reduce-memory-usage

I recently performed a series of optimizations to reduce an application's memory usage. After completing several of these I noticed a common theme to many of my optimizations that I could explicitly apply to help identify further opportunities for improvement. As a recurring solution, this qualifies as a design pattern, which I refer to as Streaming Data.

Context

This pattern applies when you need to process a significant volume of data but the processing can be done incrementally on small subsets of the data. A typical example is loading a list of entities and then iterating through the list to process each one. While the results (output) of processing can be combined across all the entities, it is important that the input to the processing only requires a small subset of all the data, and not the entire list of entities. A code example illustrating this problem context is shown below:
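A minimal sketch of this problem context; the loadAll() and process() methods are hypothetical stand-ins for a real repository and real processing logic:

```java
import java.util.ArrayList;
import java.util.List;

public class BulkLoadExample {

    // Hypothetical repository call: loads every entity into memory at once.
    static List<String> loadAll() {
        List<String> entities = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            entities.add("entity-" + i);
        }
        return entities;
    }

    // Stand-in for per-entity processing.
    static void process(String entity) {
        System.out.println("processed " + entity);
    }

    public static void main(String[] args) {
        List<String> entities = loadAll(); // entire list held in memory
        for (String entity : entities) {
            process(entity); // yet processing needs only one entity at a time
        }
    }
}
```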

Solution

Reducing the memory usage in the above example is based on the observation that loading the entire list of objects to process can consume a large amount of memory and is unnecessary since we only use one object at a time. So the solution is to stream - incrementally retrieve - these objects instead of loading them all at once. For the consumer of this data, the only change required is to first obtain a reference to the stream, such as an Iterable that incrementally fetches data. Updating our prior code example results in the following:
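A sketch of the streaming revision; streamAll() is a hypothetical replacement for a bulk-loading repository call, returning a lazy Iterable instead of a fully-populated list:

```java
import java.util.Iterator;

public class StreamingExample {

    // Hypothetical repository call: returns a lazy Iterable that fetches
    // one entity at a time instead of materializing the whole list.
    static Iterable<String> streamAll() {
        return () -> new Iterator<String>() {
            private int next = 0;

            @Override
            public boolean hasNext() {
                return next < 5;
            }

            @Override
            public String next() {
                return "entity-" + next++; // fetched on demand
            }
        };
    }

    // Stand-in for per-entity processing.
    static void process(String entity) {
        System.out.println("processed " + entity);
    }

    public static void main(String[] args) {
        // Only the current entity is held in memory at any point.
        for (String entity : streamAll()) {
            process(entity);
        }
    }
}
```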

Examples

The mechanism to use for streaming objects will depend on the source of the data and may require significant changes compared to a bulk load. Here are some specific examples.

Parsing XML

Parsing XML files using JAXB is a convenient approach for converting the entire file into a tree of Java objects, but it populates the entire tree at once. To instead stream such data, use the SAX parser provided as part of the JAXP API. The SAX parser is event-based, which means that it iterates over the elements (and attributes) of your XML and for each item invokes callbacks you define.
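A small runnable sketch of the SAX approach using only the JDK's JAXP classes; the document structure and the parseOrderIds() helper are illustrative:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class SaxStreamingExample {

    // Parses the XML with SAX: each <order> element triggers a callback,
    // so the full document tree is never built in memory.
    static List<String> parseOrderIds(String xml) {
        List<String> ids = new ArrayList<>();
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                    new DefaultHandler() {
                        @Override
                        public void startElement(String uri, String localName,
                                String qName, Attributes attributes) {
                            if ("order".equals(qName)) {
                                ids.add(attributes.getValue("id"));
                            }
                        }
                    });
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return ids;
    }

    public static void main(String[] args) {
        String xml = "<orders><order id=\"1\"/><order id=\"2\"/></orders>";
        System.out.println(parseOrderIds(xml)); // prints [1, 2]
    }
}
```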

Querying Databases using Hibernate

When using Hibernate to query for a collection of entities it is convenient to simply ask Hibernate for the entire collection. A typical example of doing this using the query by criteria API within Hibernate is below:
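A sketch of such a bulk query, assuming the classic Hibernate 3.x Criteria API; Order is a hypothetical mapped entity and "status" one of its properties:

```java
import java.util.List;

import org.hibernate.Session;
import org.hibernate.criterion.Restrictions;

public class BulkQueryExample {

    // Loads every matching Order entity into memory in a single call.
    @SuppressWarnings("unchecked")
    static List<Order> loadOpenOrders(Session session) {
        return session.createCriteria(Order.class)
                .add(Restrictions.eq("status", "OPEN"))
                .list();
    }
}
```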

When the criteria matches a large volume of data, however, this approach will consume a large amount of memory. Instead, use the scroll method on Criteria to return a ScrollableResults instance that can be used to iterate through the results. If you prefer not to expose the rest of the application to Hibernate classes, you can wrap the ScrollableResults in a special implementation of Iterator (which I leave as an exercise to the reader). The revision of the above example using streaming looks like the following:
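A sketch of the scroll-based revision, again assuming the Hibernate 3.x API and a hypothetical Order entity:

```java
import org.hibernate.ScrollMode;
import org.hibernate.ScrollableResults;
import org.hibernate.Session;
import org.hibernate.criterion.Restrictions;

public class ScrollQueryExample {

    static void processOpenOrders(Session session) {
        // scroll() returns a database cursor rather than a full list.
        ScrollableResults results = session.createCriteria(Order.class)
                .add(Restrictions.eq("status", "OPEN"))
                .scroll(ScrollMode.FORWARD_ONLY);
        try {
            while (results.next()) {
                Order order = (Order) results.get(0); // one row at a time
                process(order);
                session.evict(order); // keep the session from accumulating entities
            }
        } finally {
            results.close();
        }
    }

    // Stand-in for per-entity processing.
    static void process(Order order) {
    }
}
```

Evicting each processed entity from the session is an extra precaution I have added here: the Hibernate session otherwise caches every loaded entity, which would undo the memory savings.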

This scroll approach only works when all the data can be processed within the same database transaction, since the Hibernate session must remain open for the ScrollableResults to be able to continue fetching data. If this is not suitable, then another option is to load the data using multiple queries that each return a subset of the data. One common example of this is displaying search results to a user. Rather than showing all the results (which may number in the hundreds or thousands), show one page at a time and let the user step through the various pages of results. Due to the frequency with which this occurs I refer to this solution as paging. Implementing this in Hibernate using the query by criteria API is fairly simple:

Start by creating your criteria object and defining its restrictions as you normally would.

Apply an ordering to the criteria. It is best if this ordering is consistent, by which I mean that database updates or inserts between queries will not cause invalid or unexpected results to be returned. Consistency matters because each query for a page executes in a separate database transaction, which provides no guarantees of transactional isolation for the group of queries as a whole. In some contexts consistency is not required. When it is, I prefer to sort by an auto-incrementing surrogate primary key in order to achieve the highest level of consistency.

Apply restrictions to retrieve only the specific page. This is done using the methods setFirstResult and setMaxResults on the Criteria object.
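The steps above can be sketched as follows, assuming the Hibernate 3.x Criteria API and a hypothetical Order entity with an auto-incrementing "id" property (org.hibernate.criterion.Order is fully qualified to avoid clashing with the entity name):

```java
import java.util.List;

import org.hibernate.Session;
import org.hibernate.criterion.Restrictions;

public class PagedQueryExample {

    // Returns one page of results; pageNumber starts at zero.
    @SuppressWarnings("unchecked")
    static List<Order> loadPage(Session session, int pageNumber, int pageSize) {
        return session.createCriteria(Order.class)
                .add(Restrictions.eq("status", "OPEN"))
                // consistent ordering by the surrogate primary key
                .addOrder(org.hibernate.criterion.Order.asc("id"))
                .setFirstResult(pageNumber * pageSize)
                .setMaxResults(pageSize)
                .list();
    }
}
```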

Consequences

One potential consequence of streaming data is a reduction in performance because data is loaded piece by piece rather than in bulk. To mitigate this, the solution is to use what I call loading sets: define subsets of the total data volume that are small enough not to impact memory usage but large enough to minimize the performance impact. Then load the data one set at a time. The consuming API does not need to change: it can still iterate over each loaded set, fetching the next set once the current one is exhausted.
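The loading-set idea can be sketched in plain Java as an Iterator that buffers one batch at a time; fetchBatch() is a stand-in for a real paged query, here backed by a fixed ten-item data set:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

public class LoadingSetIterator implements Iterator<Integer> {

    private final int batchSize;
    private final Deque<Integer> buffer = new ArrayDeque<>();
    private int offset = 0;
    private boolean exhausted = false;

    public LoadingSetIterator(int batchSize) {
        this.batchSize = batchSize;
    }

    // Stand-in for a paged query: returns up to batchSize items starting
    // at offset, drawn from a fixed data set of the integers 0..9.
    private List<Integer> fetchBatch(int offset, int batchSize) {
        List<Integer> batch = new ArrayList<>();
        for (int i = offset; i < Math.min(offset + batchSize, 10); i++) {
            batch.add(i);
        }
        return batch;
    }

    @Override
    public boolean hasNext() {
        if (buffer.isEmpty() && !exhausted) {
            // Current set exhausted: fetch the next loading set.
            List<Integer> batch = fetchBatch(offset, batchSize);
            offset += batch.size();
            exhausted = batch.size() < batchSize;
            buffer.addAll(batch);
        }
        return !buffer.isEmpty();
    }

    @Override
    public Integer next() {
        return buffer.removeFirst();
    }
}
```

The consumer simply iterates as before; only the batch-fetching logic inside hasNext() knows that data arrives in sets.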

Exploring Mental Processes behind Developing Software
Wed, 14 Jul 2010 14:00:35 +0000
http://www.basilv.com/psd/blog/2010/exploring-mental-processes-behind-developing-software

How do you go about designing and coding software? More specifically, what is your mental process for accomplishing this? Becoming more aware of the approach you use allows you to deliberately control and improve it. Mental thought processes are, however, very intangible and difficult to put into words. In the software development literature much has been written about how to go about design and coding, ranging from a naive object-oriented design approach (find the nouns in the requirements) to test-driven development (TDD). These approaches, however, deal with tangible actions to be performed rather than the thinking that must necessarily occur.

So I decided it would be an interesting challenge to write about my mental processes during design and coding, in the hope that it proves beneficial for you. I use a recent development task I worked on as a case study for examining my thinking. The task involved designing and coding a framework for generating a set of reports for an application. In my analysis I identify a number of mental 'stages' that I used, which together loosely comprise my mental process map. I first present a narrative of my thinking during this case study with references to these stages identified in bold, and then provide some concluding thoughts.

Case Study Narrative

I start by uploading the problem domain into my working memory – reviewing all relevant requirements and any upfront or preexisting design. I am not trying to gain comprehensive knowledge at this point – there are many specific, incidental details that are not relevant to the big picture that I can safely ignore. I instead focus on the significant elements that will feed into my initial design work:

Domain Model: What concepts or data does the system need to manipulate or store?

Process Model: What operations or events occur? How do they make use of the data?

Constraints: What are the main constraints with respect to what I need to build? What do I need to consider or watch out for?

The initial upload raises a number of questions, points requiring clarification, and ideas for improvement (of the requirements and existing design). I consult with the business analyst and business team, using face-to-face conversation if possible, to gain clarity. Simultaneously I am thinking about the key concepts that I will use in the solution to address the required functionality. I use these concepts to assemble an initial domain model and process model – mostly in my head initially, perhaps supported by some sketches or doodles on a few pieces of paper.

As I iterate between understanding and clarifying the requirements and developing the concepts and models, I begin to balance competing requirements through a series of trade-offs. The most frequent trade-off is between minimizing design complexity and development cost versus fully providing the requested functionality. Having face-to-face meetings with the business helps me identify soft versus hard constraints and requirements. Soft ones can potentially be discarded through negotiation while hard ones are mandatory and must be addressed.

At this point I feel comfortable about certain parts of the design and feel fairly confident it will work, while for other parts I am still left with an uncomfortable feeling that further thinking does not help alleviate. This pushes me into an exploration mode in which I start writing code for the pieces I am uncertain about. I do not try to write complete, production-quality code. I instead do what I call 'design-level' coding where I define interfaces or classes with important method signatures, but with no real implementations for the methods – perhaps just some pseudo-code. At this point I find it hard to do TDD on non-trivial methods as method signatures and even the classes and interfaces can change dramatically. I leave lots of to-do comments in the code about questions or issues regarding functionality or design elements that are not relevant for the big picture I am working on. What I am looking for are significant gaps in my solution that may require additional clarification of requirements, or further refinement of the concepts and models I came up with earlier.

Throughout the entire process and especially during these initial stages there are times when I need a mental shift. This is usually when I am undecided how to resolve a particular design issue or when I feel mentally fatigued. I use a number of different strategies. One is to simply change location – get out of my cubicle and walk around. Another is to change activities to work on something unrelated and mentally less taxing in order to recharge. For thorny design issues I find that sleeping on them is a great way of letting the subconscious work on the problem and help arrive at a resolution.

At some point I feel that I have resolved all of the big uncertainties so I begin converting my separate chunks of design-level code into a unified set of working production-quality code. I call this consolidation. I usually begin by doing a sweep through the design-level code to resolve the outstanding to-dos. This helps identify any outstanding requirements clarifications that I need. I then switch to coding the functionality class by class and method by method using mostly strict TDD. Development feels slow at first because I need to write many utility methods or helper methods (for either the production code or for the test code), or refactor them into existence out of duplicate code that I introduce. But using TDD gives me that satisfying sense of progress as I slam out one fully-tested method after another.

Once the first draft of the code is written I polish it to ensure a high level of craftsmanship. This involves aspects such as renaming classes, methods and variables to ensure good readability, refactoring to eliminate duplication, and commenting when appropriate to ensure good maintainability. For more information on why and how to polish code see my article Why you should polish your code.

After reaching code-complete on the functionality I switch to feedback mode. I have two goals. The first goal is to pass the code through as many quality checks as I can to identify and eliminate defects. This includes asking for a peer code review, reviewing the results of static code analysis tools, verifying sufficient coverage by automated tests, adding automated integration tests, and performing manual functional testing. This list does not include automated unit testing because I have already done this concurrent with the coding. The second goal is to put the code to use to identify functional gaps, usability issues, and operational issues relating to non-functional attributes such as monitoring / logging, error handling, and performance. This second goal is especially relevant for infrastructure code, where putting the code to use generally means coding business functionality that exercises the infrastructure. Although I go into the feedback stage with usually ~95% code coverage from my automated unit tests, I do expect to discover and fix a few issues. As I progress through the stage, the code gradually stabilizes. At the end, it has reached feature-done status which means I consider it production-quality code ready for final testing.

Discussion

The process I have described may seem like it consists of discrete steps with a linear transition from start to finish, but it is anything but that. Each 'step' is fuzzy, blurring from one to the next. There are multiple transitions going back and forth between steps. Different portions of the functionality can simultaneously be in drastically different steps – I might be polishing one class while in the midst of consolidating a second and in exploration mode for a third. I like to characterize the actual process flow as chaordic - a blend of chaos and order based on balancing creativity with discipline.

The process may give an impression of a big-design-up-front approach, which is inaccurate for two reasons. First, I consider coding to be an act of design, so I am really designing throughout the entire process. Second, I do believe in doing an appropriate amount of thinking and analysis (what some call design) prior to starting coding. The amount needed depends on the size and complexity of the problem to be solved and my current understanding of it. For simple, straightforward problems I may only spend a few minutes doing this, but those few minutes will include upload, trade-off, and exploration activities prior to diving into the coding.

In conversations with experienced developers I have noticed some correlations between how they describe the way they develop and my process, but there are also differences. So I am interested to hear what you think of this process and how it may match or differ from yours.

Why Coding is not Enough
Mon, 28 Jun 2010 13:30:29 +0000
http://www.basilv.com/psd/blog/2010/why-coding-is-not-enough

If the goal of software development is to produce working software, then developers need to know more than just how to code - they need to know how to prevent or eliminate functional and non-functional defects.

Too many developers think their job is complete once a feature has been coded. Sometimes they think that it is the tester's job to find defects. Sometimes they think defects in released code are unavoidable and normal, so not worth worrying about. Sometimes they believe their code is perfect - it cannot possibly have defects. I encounter developers with these attitudes with unfortunate frequency. I also encounter development managers who are surprised to encounter such attitudes. A while back I talked to one manager who was shocked to learn that one group of developers under her were assuming their code worked if it compiled successfully - there were no reviews or any sort of testing being done. So I hope with this article to raise the awareness amongst developers that coding is simply not enough to produce working software, and to raise the awareness amongst development managers that they need to ensure the appropriate systems are in place to support this.

The reality is that even the most diligent developers inject defects into their code at a surprisingly high rate. Defect rates are often expressed as the number of defects per thousand lines of code (KLOC). Industry statistics on defect rates are rather hard to find and vary significantly, partly because the definition of a defect varies. Several studies have reported defect rates in the range of 10 to 100 defects per KLOC, as reported in the book Best Kept Secrets of Peer Code Review. This works out to one defect per 10 to 100 lines of code.

On my most recent project I decided to calculate the defect rate for a particularly error-prone feature. Counting only defects found by independent testers after code reviews and unit testing were done, and using a KLOC count excluding comments and blank lines, this feature had 20 defects for roughly 850 lines of code, which is a defect rate of 24 defects per KLOC, or roughly one defect for every 40 lines of code. This may seem reasonable, but remember that this is after multiple code reviews and automated unit testing have already found and eliminated a number of defects. (How many I do not know, as these kinds of defects are not tracked.) And there may be yet-to-be-found defects still lurking in this code. So the actual defect injection ratio is higher, perhaps much higher.

Defect rates have such a wide variance, even between developers working on the same code base, that it is unfortunately not a reliable metric for predicting defect counts. My main point in discussing them is to emphasize just how frequently defects are introduced.

Coding, therefore, is simply not enough. Every developer needs to have a personal system for preventing and eliminating defects, which should integrate into the system / processes used by the development team to produce high-quality working software. For ideas on how to assemble such a system check out my definition of done that identifies a number of defect elimination activities.

Using To Do Comments in Code
Tue, 16 Mar 2010 14:37:20 +0000
http://www.basilv.com/psd/blog/2010/using-to-do-comments-in-code

I am a big proponent of using to do comments – comments prefixed by a specific identifier such as "TODO" – in a code base to indicate outstanding tasks or issues with the code. I have encountered developers who are either unfamiliar with the practice or who do not follow it as deliberately as I do, so I thought I would explain my method.

The idea of to do comments really only makes sense with the proper tooling. The Eclipse integrated development environment calls them task tags, and supports providing any number of custom identifiers that when found in a comment will cause Eclipse to add that comment into its Task view. The default identifiers that Eclipse ships with include "FIXME" and "TODO". You can then browse the task view, sorting or filtering the tasks by various criteria, to see the outstanding work. Continuous integration servers such as Hudson running the Task Scanner plugin can produce statistics, reports, and graphs of the outstanding tasks in a code base.
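A minimal illustration of what such task-tag comments look like in practice; the class, method, and the specific to-dos are my own examples, while TODO and FIXME are Eclipse's default identifiers:

```java
public class TaskTagExample {

    // TODO: handle the case where the amounts array is empty
    // FIXME: this assumes all amounts are in cents; confirm with the team
    static int total(int[] amounts) {
        int sum = 0;
        for (int amount : amounts) {
            sum += amount;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(total(new int[] {100, 250})); // prints 350
    }
}
```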

Why use to do comments?

When doing my own coding, I use to do comments when ideas come to me regarding code I need to write or special cases I need to handle that are separate from what I am coding right now (usually in order to get the current test to pass, via test-driven development). Writing the idea down gets it out of my mind and out of my way. Using a to do tag reassures me that it will not be forgotten: part of my definition of done is to ensure that there are no to do comments remaining in the code.

When looking at code written by other team members, I want to be able to quickly tell what state the code is in – is it completely finished, or is it a work-in-progress, with scenarios or requirements left to be handled? The reason I care is that if I think the code is supposed to be finished, and see issues or outstanding work, then I will raise the issue(s) with the developer. Fairly often when I have done this in the past, the developer reassures me that they were already aware of the issue and would resolve it. That is usually when I recommend the use of to do tags as a communications mechanism to the rest of the team as to the status of the code, especially if someone else had to take over working on that code. And as often as not, unrecorded issues that developers say they are aware of and will resolve later end up being forgotten and left unresolved.

I hope I have convinced you of the value of using to do comments. Please leave a comment and let me know what you think about the practice.

Use Understood Methods Rule
Mon, 14 Dec 2009 19:19:56 +0000
http://www.basilv.com/psd/blog/2009/use-understood-methods-rule

Over the years I have refined the approach I use to write code. Recently I codified a key aspect of this approach as a practice I call the Use Understood Methods Rule. The basic formulation of the rule is quite simple: when coding a method, only invoke other methods whose behavior you clearly understand and are confident will work the way you want. This may sound overly simple or obvious, so let me elaborate further.

This rule is based on a bottom-up engineering philosophy: if you completely understand the methods you are invoking, then you should understand the behavior of the method you are coding and know that it will work. This applies recursively up the call stack to the top-level entry point of your application.

Key Requirements

My formulation of the rule above specifies two key requirements for using another method:

Behavior Understood: The behavior of a method can be defined in terms of its pre-conditions and post-conditions. Knowing the pre-conditions allows you to ensure you are correctly invoking the method, while knowing the post-conditions ensures that you will get the results you want.

Confident Will Work: This is arguably part of the prior requirement - understanding a method’s actual (rather than stated or expected) post-conditions - but it is so important I felt it should be explicitly stated. You need to verify that the method you are using will actually function correctly and not fail due to a defect. This verification is best done via automated tests, but there are scenarios I discuss below when another approach is needed.
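A small sketch of stating behavior as pre- and post-conditions, so a caller can satisfy the first requirement without reading the implementation (the example and its conditions are mine, not from the original article):

```java
public class Account {

    /**
     * Withdraws an amount from a balance.
     * Pre-conditions: amount > 0 and amount <= balance.
     * Post-conditions: the returned value equals balance - amount.
     */
    static int withdraw(int balance, int amount) {
        // Reject calls that violate the documented pre-conditions.
        if (amount <= 0 || amount > balance) {
            throw new IllegalArgumentException("pre-condition violated");
        }
        return balance - amount;
    }

    public static void main(String[] args) {
        System.out.println(withdraw(100, 30)); // prints 70
    }
}
```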

Applying the Rule

Applying the Use Understood Methods Rule involves, therefore, satisfying these two requirements. Exactly how I do this varies based on the context – specifically the nature of the method I intend to use. Below I describe the common scenarios I encounter and the approach I use to apply the rule to each.

Method with implementation in code base: Calling a method that has an existing implementation within the code base you are working on is the most common scenario. This can be a method on the same class, on a super class, on a collaborating class, or a static method. This also can be an abstract method defined on an interface or abstract base class for which the implementing class exists and is known. To understand the method’s behavior I refer to the documented pre- and post- conditions if available, otherwise I look at the source code for the method. The correctness of the method should already be verified through automated tests. If necessary I can use code coverage analysis results to confirm that this method has sufficient test coverage.

Method to be written concurrently in code base: Often when coding a method, I find I have to create a new method, either on the same class or on a separate class. This might be a simple extract-method refactoring of logic to simplify the existing method, or it might be new functionality required on another class. In this scenario I have no issues understanding the behavior of the new method as I am writing it at the same time. If the new method is non-trivial then I ensure it works by creating unit tests for it separate from my tests for the original method I am working on.

Method with unknown implementation in code base: This applies to abstract methods in interfaces and abstract base classes for which no implementation yet exists or for which the implementation cannot be known in the context of the method being written. This latter scenario is typical when there are multiple implementations being processed in a common fashion. In this case I insist on having documented pre- and post- conditions for the method being invoked.

If the method implementation does not yet exist then it is obviously not possible to verify now that it is correct, but it is possible to take steps to gain confidence that the implementation will work once it is written. One option is to ensure that the automated tests for the method you are writing sufficiently exercise the functionality of the not-yet-implemented method it invokes, confirming it will meet your needs. Another, more general option is to ensure that the team has processes in place, such as test driven development and code reviews, to ensure that code written in the future will indeed work.

Method in third-party code: When using third-party methods I usually rely on the documented API. When this is inadequate I look at the source code if it is available (this is a key benefit of open source libraries – the code is always available). In some cases the third-party software implements a specification (such as the various Java EE specs) and the specification can be used to understand what the third-party code is supposed to do.

Once third-party software has been selected for use within a project I generally assume it works. The verification of the quality of such software happens previously, in the selection process, when I do my due diligence to evaluate the quality of the software under consideration.

On rare occasions I write a unit test to understand and/or verify how some third-party functionality works. This is something I would probably benefit from doing more often.
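One lightweight way to do this is a "learning test": a small test whose only purpose is to pin down how a library method actually behaves. As a minimal sketch (using a JDK method to stand in for any third-party API, and plain assertions in place of a test framework):

```java
// A "learning test": written solely to document and verify how a library
// method behaves. Here the JDK's String.split stands in for a third-party
// API; the same idea applies to any external library.
public class SplitLearningTest {

    // Helper capturing the behavior under investigation.
    static int partsOf(String input, String separator) {
        return input.split(separator).length;
    }

    public static void main(String[] args) {
        // Surprise worth recording: trailing empty strings are discarded
        // by default, so "a,b,," splits into only 2 parts.
        if (partsOf("a,b,,", ",") != 2) throw new AssertionError();
        // A negative limit preserves the trailing empty strings.
        if ("a,b,,".split(",", -1).length != 4) throw new AssertionError();
        System.out.println("learning test passed");
    }
}
```

Besides answering the immediate question, the test remains in the suite and flags any behavior change when the library is upgraded.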

Given the prevalent use of automated tests to verify correct behavior, you might be wondering how the application of this rule is impacted by the use of test driven development (TDD). The short answer is there is no impact: this rule applies the same whether or not TDD is being used. I do, however, slightly revise how I do TDD in order to apply this rule for the scenario Method to be written concurrently in code base - I create a second failing test before creating the new method. For further details see my article describing this refinement to TDD.

The Broader Context

Following the Use Understood Methods Rule is necessary for creating high-quality code that you would trust your life to, but it is not sufficient. Correctness also depends on satisfying class and application-wide invariants (such as properly closing database connections or limiting the number of file handles consumed concurrently), understanding the behavior of containers and frameworks (such as the Spring application context or a Java EE container), and considering other quality attributes (such as security and scalability) which are emergent in nature and not easily analyzed by looking at methods independently. I consider my rule to be the foundation on which the code is written, with these more global concerns resting on top of it.

Test Driven Development – Benefits, Limitations, and Techniques
http://www.basilv.com/psd/blog/2009/test-driven-development-benefits-limitations-and-techniques
Tue, 01 Dec 2009 20:22:15 +0000

I wrote previously about the process I went through in adopting test driven development (TDD). In this article I discuss my experience with TDD: the benefits, the limitations, and the techniques I use when doing TDD.

Benefits

This section covers the benefits, as I see them, of doing TDD. This does not include the benefits of doing automated unit testing, which I am a big fan of and have been doing for years using a non-TDD approach (i.e. writing tests after writing production code).

Using TDD provides great code coverage, especially conditional code coverage. Strictly following Robert C. Martin’s three rules of TDD should result in 100% coverage for both statements and conditionals. I allow myself to deviate from these rules at times, but still obtain 90+% coverage.

Previously, when writing tests after coding, part of my process to ensure I was done with a feature was to check the code coverage results, identify gaps, and add the missing tests. I have found that this code coverage check is mostly a formality when doing TDD.

TDD helps avoid the tedium that I have experienced at times writing tests after coding. Often while doing TDD I am able to establish what I call the red-green rhythm: write a failing test (the unit test result bar goes red), then get it to pass (the bar changes to green). This rhythm makes it more enjoyable to write the tests, although the tedium is not completely eliminated. As a result, I find it takes less discipline to write the tests first than afterwards – I do not have to force myself to write a bunch of tests after getting some functionality in place.

When I started using TDD I was initially uncomfortable with writing the simplest possible code to get a failing test to pass (TDD rule #3) when I knew that the final production code would be different. As my experience using TDD increased, I began to see a number of benefits of following this rule:

Writing more code than the minimum necessary to pass the test runs the risk of having logic in the production code (statements or conditions) not covered by the tests.

At times, when the design for the method / class was a little fuzzy to me, writing the simplest possible code actually proved helpful, as I could then write another failing test which then clarified for me what the design would need to be.
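A minimal sketch of this progression, using illustrative names not taken from the article: a first failing test for a max method can be passed by simply returning the first argument; a second failing test then forces the general logic.

```java
// Sketch of TDD rule #3 in action: write only the code needed to pass
// the current failing test, and let a second failing test drive out the
// real implementation ("triangulation").
public class Triangulation {

    static int max(int a, int b) {
        // After the first test alone, "return a;" would have sufficed.
        // The second failing test forced the general form below.
        return a >= b ? a : b;
    }

    public static void main(String[] args) {
        if (max(2, 1) != 2) throw new AssertionError(); // first test: could pass trivially
        if (max(1, 3) != 3) throw new AssertionError(); // second test: forces the comparison
        System.out.println("green");
    }
}
```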

Using TDD, especially when strictly following the three rules, seems to eliminate the question and debate about how much to test. When introducing unit testing to developers (not TDD, just the use of automated unit tests), I get asked this question time and time again. I have a standard answer I use, but when doing strict TDD the question simply no longer applies.

Limitations

When adopting a new practice it is important to know the contexts in which the practice is less applicable. I have come across a number of situations in which TDD seemed less helpful. These were the situations when I was most likely to deviate from the three rules of TDD or abandon TDD entirely.

I prefer to have the design of the method / class I am working on fairly clear in my head before I start writing tests. Sometimes I do this on paper, but sometimes I do this by working with the actual code which would be a deviation from TDD. Since adopting TDD I have tried to do this design in the test code instead of in production code, with mixed results. Sometimes this worked fine, and sometimes it felt unnatural and less productive. I have the feeling that as I gain experience with TDD I will grow more comfortable with doing design within test code.

When I need to code a non-trivial algorithm I often extract logic into separate methods on the same class or need to invoke new methods on collaborating classes. Often I need to write these additional methods as the minimum needed to get my failing unit test to pass, which means that I am strictly following TDD. The issue is that my normal unit testing practice is to test methods individually as much as possible, especially methods on other classes, and the rules of TDD do not require me to do so. In essence, my original failing unit test ends up being an integration-style test for the overall algorithm, while I want to have individual unit tests for methods making up the algorithm. So the limitation of TDD is not that it cannot be applied – I am using it – but that it is not enough to ensure what I consider to be a sufficient level of testing. See the Techniques section below for how I address this limitation.

In situations where automated unit tests are not applicable, TDD obviously does not apply. Some people would insist that everything be unit tested, and I agree that it is a goal to aspire to, but in some circumstances I feel that unit testing is not pragmatic. Situations where I am unlikely to use automated unit tests and hence TDD include:

Prototyping or other exploratory-style work such as an architectural spike. However, when trying to understand the behavior of third-party libraries, I often do find it helpful to do this via unit tests.

User interfaces such as web pages or emails. Automated tests can be used to verify that the web page or email content is produced without failure, but the actual content and formatting is best checked by a human.
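A sketch of such a "produced without failure" check, with a hypothetical rendering method: the test asserts only that generation succeeds and yields plausible output, leaving the actual appearance of the content to a human reviewer.

```java
// Smoke test for generated content: verify it renders without failure
// and contains the expected data, but do not attempt to verify
// formatting or visual quality. The rendering method is illustrative.
public class EmailSmokeTest {

    static String renderWelcomeEmail(String userName) {
        if (userName == null || userName.isEmpty()) {
            throw new IllegalArgumentException("userName required");
        }
        return "Hello " + userName + ",\n\nWelcome aboard!\n";
    }

    public static void main(String[] args) {
        String body = renderWelcomeEmail("Alice");
        // Structural checks only; look and feel are reviewed manually.
        if (body == null || body.isEmpty()) throw new AssertionError();
        if (!body.contains("Alice")) throw new AssertionError();
        System.out.println("email rendered without failure");
    }
}
```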

Techniques

I have adopted several techniques for using TDD to fit my style of coding that go beyond the three rules. I prefer to think of them, however, as tips or tricks of the trade rather than firm rules.

I developed a technique to address the limitation of TDD discussed above regarding the creation of new methods on collaborating classes under a single failing unit test. When I go to create a new method on a different class, I recursively apply TDD. So before creating the new method, I create a new test for it on a different test class corresponding to this other class. This means that I now have two failing tests, not one, so I modify the TDD rules and just run the tests of this second class while working on this new method. Once I am done with the method, I can run the suite, confirm I have just the one original failing test, and resume working on the original method.
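A compressed sketch of this technique, with all class and method names invented for illustration: while a failing test drives Order.totalCents(), a new method is needed on a collaborating TaxCalculator class, so a second failing test is written for it and made to pass first.

```java
// Sketch of "recursive TDD": the original failing test drives
// Order.totalCents(); when a new collaborating method is needed,
// a second failing test is written for TaxCalculator.taxOn() and
// made green before resuming work on the original method.
public class RecursiveTddSketch {

    static class TaxCalculator {
        // Driven by its own unit test before Order.totalCents is finished.
        static long taxOn(long subtotalCents) {
            return subtotalCents * 5 / 100; // assume a flat 5% tax for the sketch
        }
    }

    static class Order {
        final long subtotalCents;

        Order(long subtotalCents) {
            this.subtotalCents = subtotalCents;
        }

        // The original method under test, completed once taxOn passes.
        long totalCents() {
            return subtotalCents + TaxCalculator.taxOn(subtotalCents);
        }
    }

    public static void main(String[] args) {
        // Second failing test, written and made green first:
        if (TaxCalculator.taxOn(1000) != 50) throw new AssertionError();
        // Original failing test, resumed afterwards:
        if (new Order(1000).totalCents() != 1050) throw new AssertionError();
        System.out.println("both tests green");
    }
}
```

The end result is individual unit tests for each method making up the algorithm, rather than a single integration-style test covering the whole.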

There are often times when, upon getting the current tests to pass, there are multiple scenarios to select from to write the next test. Which one to pick? My own preference is to choose a scenario that will fail given the current production code – do not choose a scenario that will automatically pass. The reason for this preference is that this helps maintain that rhythm of alternating between failing and passing tests. After all of these failure-inducing scenarios are tested, I do go back to add the scenarios I expect to pass. At this point, while I’m still technically following TDD, it feels like my old approach of writing the tests afterwards: the production code is finished, and I am testing the remaining scenarios to confirm it is correct.

Conclusion

Overall I have found test driven development to be a very effective process for producing high-quality code, and I plan to continue to use it. I highly recommend that every developer experiment with adopting TDD and evaluate the benefits and limitations for themselves.

Adopting Test Driven Development
http://www.basilv.com/psd/blog/2009/adopting-test-driven-development
Tue, 17 Nov 2009 23:22:28 +0000

I have always been keen on using automated unit tests since I first heard about them almost a decade ago. I have known about test driven development (TDD) for almost as long, but the practice of writing tests first before writing production code never really clicked for me when I first tried it years ago. Since then I have evolved my approach to writing tests, but still almost always write them after the production code.

Based on the strong recommendations of TDD's proponents, I decided that I needed to give test driven development another try.

Preparation

I resolved to not just haphazardly try TDD like I did before, but to adopt it as a development practice for a period of time as a continuous improvement experiment. I deliberately went with a more disciplined approach. Based on my knowledge of personal development and continuous improvement, I knew that the change would be difficult, especially at first. So I prepared for the change via the following steps:

I reflected on the difficulties I would face in adopting TDD. I expected to struggle with two issues. The first was the drop in productivity due to getting familiar with doing TDD – I would be spending a greater percentage of my time thinking about my process of coding as it related to TDD, rather than the code I was writing. The second was the natural tendency to revert back to my established pattern of behavior. This reflection ensured I would have more realistic expectations when I started using TDD.

I reviewed Robert C. Martin's three rules of TDD:

You are not allowed to write any production code unless it is to make a failing unit test pass.

You are not allowed to write any more of a unit test than is sufficient to fail; and compilation failures are failures.

You are not allowed to write any more production code than is sufficient to pass the one failing unit test.

I wrote a note to do TDD and stuck it to the front of my keyboard where it would always be visible to me. It served as both an affirmation and a reminder.

Having finished my preparation, it was time to actually start doing test-driven development.

Adoption

My initial start with TDD was easy: I started my next coding task by writing a test rather than production code. If only it stayed so simple :)

At first the shift in process was difficult as I had to consciously remember to write the test first, and then write only a portion of the production code necessary to get it to pass. My productivity felt a lot lower (I have no idea whether it was significantly worse or only a little). But I had expected this and used discipline to force myself to continue with TDD.

One of the hurdles I faced was how strictly to follow the three rules of TDD - particularly rule number three. I had always been uncomfortable with the idea of writing temporary or intermediate production code that would get the current test to pass, but that I knew was not the final form and that I would need to change. An example is implementing an algorithm to return a fixed value (say zero or null) rather than implement the actual logic. I decided to take a pragmatic approach and allow myself to deviate from the three rules on occasion, when following the rules seemed too onerous or difficult. I did not want blind adherence to the rules to cause me to completely give up on TDD. I expected that over time as my familiarity with TDD grew, I would be able to become stricter in adhering to the rules.

As expected I suffered setbacks along the way. I started a development task by writing most of a method of production code before realizing that I had no failing test. In another case I had a failing test but then churned out the entire production method, including all the special cases, well after that test would pass. I took these setbacks in stride – I considered them a normal outcome of adopting a new behavior, rather than personal failures, and simply returned to doing TDD once I became aware of my departure.

After about a week, doing TDD became less of a struggle. After about three weeks (the typical minimum duration to establish a new habit) TDD began to feel more natural. By this point I had clarified for myself the benefits and limitations of TDD and had integrated it into my development process. I have a lot more to say about this which I will save for a follow-up post. As a quick summary, I find TDD to be a valuable practice that I intend to continue to use.

Conclusion

If you have not tried TDD, I strongly recommend you experiment with adopting it as a development practice. Looking beyond just TDD, one of the points of this article is to encourage you to always be thinking about your capabilities as a software developer and continuously seek to improve.

My Definition of Done
http://www.basilv.com/psd/blog/2009/my-definition-of-done
Mon, 26 Oct 2009 14:50:11 +0000

I recently wrote about why you need a definition of done, and it only seems logical to follow this up by presenting what I use as a definition of done for developing software.

I use two guiding principles as the basis for constructing my definition.

Potentially releasable: Ideally the software can be released (or shipped) once it is done. I've seen many people, particularly in the context of Scrum, use the similar term "potentially shippable".

These principles are deliberately idealistic in order to set high expectations and motivate continuous improvement when I fall short of reaching them.

Different definitions of done can be created based on different levels or scopes. The two primary scopes are:

Done for a feature / user story.

Done for a release.

For this article I am using my definition of done for developing a feature (user story).

My definition of done is essentially a checklist with items grouped into categories. The lists of items and categories are not meant to dictate the process by which these items are done or the chronological order. For example, automated unit testing is listed in a separate category from coding but it is typically done at the same time or before-hand, if doing test-driven development.

Without further ado, here is my definition of done.

Coding

Code meets functional requirements.

Code meets non-functional requirements. Typical ones include:

Performance (capacity, scalability)

Usability

Security

Maintainability

Code is deployment-ready. This means environment-specific settings are extracted from the code base. A past article I wrote on designing for deployability provides more context on this.
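As a small illustration of extracting environment-specific settings (the property and variable names here are made up): the code looks up the database URL from a system property or environment variable, keeping only a development fallback in the code base.

```java
// Sketch of deployment-ready configuration: environment-specific values
// are supplied at deployment time rather than hard-coded. The property
// name "app.db.url" and variable "APP_DB_URL" are illustrative.
public class DeployableConfig {

    static String databaseUrl() {
        String fromProperty = System.getProperty("app.db.url");
        if (fromProperty != null) {
            return fromProperty; // e.g. set via -Dapp.db.url=... at startup
        }
        String fromEnv = System.getenv("APP_DB_URL");
        if (fromEnv != null) {
            return fromEnv; // e.g. set by the deployment environment
        }
        return "jdbc:h2:mem:dev"; // development-only fallback
    }

    public static void main(String[] args) {
        System.out.println("Using database: " + databaseUrl());
    }
}
```

The same code then deploys unchanged to development, test, and production; only the externally supplied setting differs.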

Code has been cleaned up. The goal is to ensure the code is easily readable and has a good design. In the past I have used the terms polishing code and refactoring to describe this. Robert C. Martin's book Clean Code: A Handbook of Agile Software Craftsmanship provides the best explanation of this that I have seen. Achieving this goes a long way towards meeting the maintainability requirement.

All errors and warnings found by static code analysis tools have either been corrected or have been suppressed with a comment indicating the reason for the suppression.

Testing

Automated unit tests are written. The tests should be high quality (e.g. not brittle).

Automated integration tests are written that verify interactions with external systems such as the application database or third-party application / web services.

Code coverage achieved by the automated tests is measured and sufficient coverage is achieved. I use Cobertura for measuring code coverage. I do not like using a numeric percentage target as the sole definition of sufficient coverage, because this can encourage people to write poor-quality tests that merely execute the code to meet the target rather than verify its correct behavior. My true definition of sufficient coverage is that the tests execute and verify all code that could reasonably be incorrect. Having said this, I generally aim for at least 80% line (statement) coverage overall, and often achieve 90%+ coverage for individual classes. I am still debating what a reasonable target is for branch (conditional) coverage. I currently aim for at least 50% overall, but I have the feeling that a target of 75% would be better.
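These thresholds can be enforced in a Maven build so it fails when coverage regresses. A rough sketch of a cobertura-maven-plugin configuration is shown below; the version number and exact threshold parameter names should be verified against the plugin's documentation.

```xml
<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>cobertura-maven-plugin</artifactId>
  <version>2.7</version>
  <configuration>
    <check>
      <!-- Overall targets: 80% line coverage, 50% branch coverage -->
      <totalLineRate>80</totalLineRate>
      <totalBranchRate>50</totalBranchRate>
      <haltOnFailure>true</haltOnFailure>
    </check>
  </configuration>
  <executions>
    <execution>
      <goals>
        <goal>check</goal>
      </goals>
    </execution>
  </executions>
</plugin>
```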

Functional testing by someone other than the developer has been done. Ideally this testing will be done by the customer, involve exercising the complete feature being coded in the way that users would use it, and be fully automated. More frequently I have seen this testing done manually (especially for user interfaces) by business analysts or testers who act as proxies for the customer. The key idea is to have someone other than the developer do testing to validate the assumptions and interpretations made (often implicitly) by the developer.

I use the vague term "functional testing" rather than the more common terms "system testing" or "user acceptance testing" because projects can differ dramatically in what is done for system or user acceptance testing. If acceptance testing is done in a waterfall fashion as a separate phase near the end of the project then it cannot be part of the definition of done for a feature (but it is still part of the definition of done for the release). So I use the term "functional testing" to indicate this potential differentiation. Ideally, based on lean principles, all testing including system and user acceptance testing should be done as part of the work on a feature and not artificially delayed till later.

Reviewed

Design / approach has been reviewed by the technical lead / architect.

Detailed peer review / inspection has been done. If pair programming is being used then the peer review is automatically done at the same time as the coding. Otherwise, the reviewer should focus on issues that are less likely to be found by the static code analysis or automated testing. This can include items such as security holes, concurrency issues, and correctly meeting requirements.

Issues identified by reviewers have been resolved to the reviewer's satisfaction.

Other

Required documentation has been updated. This may include online help, user manual, or operations manual.

Build and deployment scripts and related configuration files have been updated.

No known defects are outstanding unless the customer has agreed to defer them, in which case they should be logged.

No known tasks related to the feature are outstanding.

That concludes my definition of done. I would appreciate hearing about what you use for a definition of done. In particular, if there is anything you think should be added or removed from my definition please let me know via a comment below.