Exit Criteria, Software Quality, and Gut Feelings

Bug counts and trends don't cover all the quality aspects of a product. A good exit criteria list provides an orderly list of attributes that research and experience have shown to affect product quality, so you can monitor the product quality at any given time and forecast the expected status at release. That's how you improve your product.

A few months ago, at a social gathering, I happened to meet an old friend who is also a software tester. Soon enough we started to talk shop, and my friend shared his bleak situation.

“Tomorrow we have a project meeting to approve the release of our latest version,” he said. “I am going to recommend we don’t release it, but I know I will be ignored.”

“Why this recommendation?”

“Because I have a strong gut feeling that the last version we got—the one about to be released to our customers—is completely messed up. But the product manager just follows our exit criteria list, which we meet, and ‘gut feeling’ is not considered to be solid data.”

I inquired about the source of this gut feeling and got the whole story: Two weeks ago his team received a build for a last test cycle before the official release. Theoretically, everything was supposed to be perfect. But on the first day of testing, the testers found two critical bugs.

The development team rushed to fix them and released a new build. However, for two weeks, the cycle repeated: The testers would find one or two critical bugs and the developers would release a new build with fixes. This created the problem my friend had.

“By our exit criteria, we can’t release a version that has critical bugs,” he explained. “But the development team fixed every bug we found, so at least for this moment in time, there are no open critical bugs. In the last two weeks I got eight new builds. Do you think we got even close to testing the product properly? Based on the quality level of builds in the last two weeks, I can guarantee that the build we got today contains a good number of additional critical bugs. If I get a few more days to test it, we will find them, but release day is tomorrow, and on paper everything is fine.”

What can we do to deal with such situations? How do we translate our professional intuition, our gut feeling, into hard data?

Creating Comprehensive Exit Criteria

The ideal way to communicate these feelings is to be proactive. You need exit criteria that are a bit more sophisticated than a simple bug count and bug severity. One has to add trends: the rate of incoming new bugs, the rate of bugs being closed, and the predicted bug count based on the number of yet-to-be-executed tests.

Here are some examples for these criteria:

The number of critical bugs opened in the last test cycle is less than w

The count of new bugs found in the last few weeks is trending down at a rate of x bugs/week

Testing was executed for at least y days on the last version without any new critical bugs and not more than z new high-severity bugs
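
As a rough illustration, the three example criteria above could be checked automatically against data pulled from a bug database. The sketch below is only one possible reading of them: the class, function, and field names are hypothetical, the trend is approximated as the average week-over-week change in new bugs, and the sample numbers are invented.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass
class CycleStats:
    """Hypothetical snapshot of bug data; all field names are assumptions."""
    critical_opened_last_cycle: int   # critical bugs opened in the last test cycle
    weekly_new_bugs: Sequence[int]    # new bugs per week, oldest week first
    days_tested_last_build: int       # days of testing on the last version
    new_critical_last_build: int      # new critical bugs found on the last version
    new_high_last_build: int          # new high-severity bugs found on the last version


def weekly_trend(counts: Sequence[int]) -> float:
    """Average week-over-week change in new bugs; negative means trending down."""
    if len(counts) < 2:
        return 0.0
    return (counts[-1] - counts[0]) / (len(counts) - 1)


def check_exit_criteria(stats: CycleStats, w: int, x: float, y: int, z: int) -> Dict[str, bool]:
    """Evaluate the three example criteria against the limits w, x, y, z."""
    return {
        # Fewer than w critical bugs opened in the last test cycle
        "critical_in_last_cycle": stats.critical_opened_last_cycle < w,
        # New bugs falling by at least x bugs/week
        "downward_trend": weekly_trend(stats.weekly_new_bugs) <= -x,
        # At least y days of testing, no new critical bugs, at most z new high-severity bugs
        "stable_last_build": (
            stats.days_tested_last_build >= y
            and stats.new_critical_last_build == 0
            and stats.new_high_last_build <= z
        ),
    }


# Invented sample data: a build tested for only two days that still produced critical bugs.
stats = CycleStats(
    critical_opened_last_cycle=2,
    weekly_new_bugs=[40, 30, 20],
    days_tested_last_build=2,
    new_critical_last_build=1,
    new_high_last_build=0,
)
for name, passed in check_exit_criteria(stats, w=1, x=5, y=5, z=3).items():
    print(f"{name}: {'pass' if passed else 'FAIL'}")
```

In this sketch the overall bug trend looks healthy, yet the other two criteria fail. That is the kind of signal a plain open-bug count would miss.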

Note, however, that bug counts and trends do not cover all the quality aspects of a product. An exit criteria list that tracks only bug statistics does not guarantee much; you can meet all the bug-related criteria and still have a bad product. Your requirements, product manuals, support readiness, hardware quality, etc., may not be as good as they should be. Additionally, the limits you put in the exit criteria (e.g., maximum open bugs) may be too relaxed; it takes some experience to know how to set them for a specific product line.

A good exit criteria list provides an orderly list of attributes that research and experience have shown to affect product quality. Improving the results on these attributes is known to have a positive impact on product quality.

Monitoring Quality and Forecasting Status

A list of exit criteria can be used as a simple go/no-go decision tool when a release date arrives. However, it can be used much more effectively if it is referred to all along the product development timeline.

For example, let’s say you have a criterion of “No more than one hundred open bugs with medium or low severity.” Assume you also have the following information (you can get some of it from the bug database, and other pieces come from the development and test teams):

Current open bugs: 300

Time to release: 10 weeks

Projected new bug submissions in the next 10 weeks: 100

Number of engineers assigned to fix bugs: 2

Average time to fix a bug: 0.5 days (so in the next 10 weeks, two engineers can fix 200 bugs)

Put together, the projection is 300 + 100 - 200 = 200 open bugs at release, twice the limit of one hundred. This means that by the current estimation, which is done ten weeks ahead of release time (!), you can proactively raise a red flag and inform the development manager that they are at risk of not meeting a specific exit criterion. They need to assign more people to work on bug fixes.
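
The arithmetic behind this forecast is simple enough to script and rerun every week as the numbers change. The following is a minimal sketch that uses the figures from the example; the function names are hypothetical, and the five-day work week is an assumption (it is what the "200 bugs in 10 weeks" figure implies).

```python
import math


def projected_open_bugs(current_open, projected_new, engineers,
                        days_per_fix, weeks, workdays_per_week=5):
    """Open bugs expected at release, given the team's bug-fixing capacity.

    workdays_per_week=5 is an assumption, not a figure from the example.
    """
    fix_capacity = engineers * weeks * workdays_per_week / days_per_fix
    total = current_open + projected_new
    # The team cannot fix more bugs than exist
    return total - min(fix_capacity, total)


def engineers_needed(current_open, projected_new, limit,
                     days_per_fix, weeks, workdays_per_week=5):
    """Smallest number of engineers that brings the open count down to the limit."""
    must_fix = max(0, current_open + projected_new - limit)
    fixes_per_engineer = weeks * workdays_per_week / days_per_fix
    return math.ceil(must_fix / fixes_per_engineer)


# Figures from the example: 300 open, 100 expected, 2 engineers, 0.5 days per fix, 10 weeks
open_at_release = projected_open_bugs(300, 100, engineers=2, days_per_fix=0.5, weeks=10)
needed = engineers_needed(300, 100, limit=100, days_per_fix=0.5, weeks=10)
print(f"Projected open bugs at release: {open_at_release:.0f} (limit: 100)")
print(f"Engineers needed to meet the limit: {needed}")
```

Under these assumptions the projection is 200 open bugs against a limit of 100, and meeting the limit would take three engineers instead of two.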

While the example can be seen as fabricated (well … it is), it does convey an important message: Exit criteria are a useful tool to monitor the product quality at any given time and to forecast the expected status at release date. When used properly and consulted throughout the development time, these criteria can help avoid at least some of the firefighting we all experience just before a release date.

Many of the criteria in the list I propose are simple to track and do not rely as heavily on estimations as the given example does. If your exit criteria require that all the committed design documents are published, reviewed, and updated, the simple act of collecting the actual data will give an early warning when too many documents are not completed.

Getting Your Team On Board with Exit Criteria

You can’t just set the exit criteria or their limits without wide agreement. Any team that will need to deliver in order to meet the criteria must agree that the criteria are worthwhile and the limits are fair and achievable.

You also must set the criteria as early as possible in the project timeline. At that time all the stakeholders are optimistic, full of good intentions about quality, and not under pressure. They are in a mindset that allows discussing the criteria and limits in an objective manner. If you try to agree on criteria and limits a few weeks before release time, the limits that the development team will agree to will be significantly influenced by the current state of affairs rather than what good quality is.

Back to my friend.

Adding exit criteria ahead of time would have revealed a clear stop sign to the project managers, and my friend could have enjoyed the party. But what can be done in a case similar to the one described here, where such criteria were not established in advance?

I suggested to my friend that he “translate” his gut feeling to hard data and present it in a clear manner: “This is the current status: There are too many new critical bugs in the code and not enough time for proper testing. Based on these data, I conclude that the version quality is not fit for release.” These aren’t feelings or intuitions, but a professional opinion based on facts, experience, and in-depth knowledge of development processes.

I called a few weeks later to hear how it went. My friend had presented graphs:

The number of new critical bugs on a daily basis (one or two per day; no improvement trend)

The time the testers had to test each new release in the last two weeks (one or two days)

The percent of tests that were executed on the last version (about 25 percent of the planned tests)

This was enough to give everyone a bad gut feeling, and the release was delayed.

The moral of the story? It’s possible to translate gut feelings about quality to hard data. However, as a rule you want to avoid such a situation altogether. The early definition of exit criteria, combined with data, lets you monitor quality throughout the project’s lifetime and avoid potential last-minute disasters.

User Comments

The problem with coverage data is that it ignores the question "What if I go live?" Testers caught up in trying to prove, as your friend's team required, that you have less than x number of critical defects, ignore the perils of going live. We have gone live with critical defects, but always with a thorough analysis of which part of the business would be impacted, what risks we faced and whether it was well worth it given the business imperatives.

I don't think there is a contradiction. Possible exit criteria for you would be "Analysis of possible impact to the business for each of the critical bugs was done"; "A risk analysis of all critical bugs was done".

Exit criteria are there to make sure you go through the process your organization considers necessary before a milestone can be announced. The details will vary between organizations.

Interesting that a quality problem has been perceived as a test problem and not a whole team problem, and that the fix has to come from test not the whole team. There's a far bigger problem than exit criteria to be resolved here and putting in place 'improved' exit criteria, at whatever stage, just ain't gonna fix it.

I agree that if exit criteria (putting them in place; reviewing them) are considered to be solely Test's responsibility, it won't lead to a marked improvement in the product; it may provide better release control, but indeed you don't want to even get to this point.

As I recommend, defining the exit criteria is a cooperative effort that should take place early in the product life cycle, with all the involved stakeholders. Tracking them should be owned by a specific person (e.g. product manager) - because spread ownership does not generally work in my opinion. Taking measures to meet them is again a cooperative effort.

In the case of my friend, I agree that the fact the program manager only considered the "dry" numbers and did not read further into the larger picture hints at a bigger quality problem.

Planning for regression time is fine - but in this case, regression failed and failed again.... even if you do plan for regression time (as was the case here), there is a limit to how much time you'd assign to it. Just before release time, you assume things will be pretty much healthy.

(but all this is beside the point; the story is just a way to show how Exit Criteria should / could work for you).

Regression is important, but why does it always happen at the end of a cycle just weeks before release? There is not much time left to uncover and fix issues. Regression tests are in my opinion a subset of the regular tests and regression tests are most effective if they are fully automated and can run autonomously. That way regression can happen every week or even every day depending on how many resources are available for automation.

Things being healthy just before release time is also not a given. Feature development is pushed to the end, ignoring that time is needed to properly test and fix. The only way out of this is to eliminate set release dates. Setting release to a date is artificial time boxing. Sometimes things take a few days longer. It will also benefit the customer because if features are done they do not have to wait until release date to get the features. Continuous delivery in unison with continuous regression will generate more value and also take the pressure off teams. The business and customers will get features when they are done and working, not on a date that was selected without quality in mind. I think everyone is served better when we take that extra week to get it right the first time rather than constantly chase preventable issues in production.

About the author

Michael Stahl is a SW Validation Architect at Intel. In this role, he defines testing strategies and work methodologies for test teams, and sometimes even gets to test something himself - which he enjoys most. Michael has presented papers at SIGiST Israel, STARWest, EuroStar and other international conferences, and teaches SW Testing at the Hebrew University in Jerusalem. Before starting his career in testing in 2000, Michael worked at Intel’s manufacturing facility in Jerusalem, Israel, as a chip-level test engineer.

Michael is an executive board member of the Israeli Test Certification Board (ITCB), holds a full Advanced ISTQB Certification, and chairs ITCB’s advisory board. Some of Michael's presentations and papers are available on www.testprincipia.com.