A team is experiencing difficulty releasing software frequently (once a week). What follows is a typical release timeline:

During the iteration:

Developers work on backlog stories on short-lived feature branches (short lifetimes are enthusiastically enforced) based on the master branch.

Developers frequently pull their feature branches into the integration branch, which is continually and automatically built and tested (as far as the test coverage goes).

The testers can auto-deploy the integration branch to a staging environment; this happens multiple times per week, enabling continual runs of their test suites.

Every Monday:

there is a release planning meeting to determine which stories are "known good" (based on the testers' work), and hence will be in the release. If there is a known issue with a story, the source branch is pulled out of integration.

no new code (only bug fixes requested by the testers) may be pulled into integration on this Monday to ensure the testers have a stable codebase to cut a release from.

Every Tuesday:

The testers have tested the integration branch as far as the available time allows and there are no known bugs, so a release is cut and rolled out gradually to the production nodes.
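For concreteness, the flow above boils down to roughly the following git operations (a sketch only, driven from Python here; the branch and tag names are illustrative rather than our real ones):

    # Rough sketch of the weekly branch flow described above; branch and tag
    # names are illustrative. Assumes it runs inside a clone of the repository.
    import subprocess

    def git(*args):
        """Run a git command and fail loudly if it does not succeed."""
        subprocess.run(["git", *args], check=True)

    # During the iteration: a short-lived feature branch off master...
    git("checkout", "master")
    git("checkout", "-b", "feature/story-123")    # hypothetical story branch
    # ... commits happen here ...

    # ...pulled frequently into integration, which CI builds and tests.
    git("checkout", "integration")
    git("merge", "--no-ff", "feature/story-123")

    # Tuesday: cut the release from integration, deploy it, then fold it back
    # into master once the production roll-out has succeeded.
    git("checkout", "integration")
    git("tag", "release-week-33")                 # illustrative tag name
    git("checkout", "master")
    git("merge", "release-week-33")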

This sounds OK in principle, but we have found it incredibly difficult to achieve in practice. The team sees the following symptoms:

"subtle" bugs are found on production that were not identified on the staging environment.

last-minute hot-fixes continue into Tuesday.

problems in the production environment require roll-backs, which block continued development until a successful live deployment is achieved and the master branch can be updated (and hence branched from).

I think test coverage, code quality, the ability to regression-test quickly, last-minute changes, and environmental differences are all at play here. Can anyone offer advice on how best to achieve "continual" delivery?

3 Answers

"subtle" bugs are found on production that were not identified on the staging environment -- in one of the projects with such issues I've seen this was quite successfully addressed by tactic I'd call double-issues. I mean for bugs like that, guys created two tickets in issue tracker: one was assigned to developers to fix the code, another one to testers to design and establish regression test or change in staging environment that would prevent repeating it in future. That helped to keep staging close enough to prod.

Problems on the production environment require roll-backs -- if these are frequent, then your weekly releases are effectively fake; consider adjusting the frequency to a level that really works. By fake I mean that if, say, one of every two weekly releases gets rolled back, users see a new (working) release once every two weeks - and that is what counts, not the number of times you deploy.

Enthusiastically enforced feature branches -- does that mean that at some point you also tried working on a single branch and found it inferior? If so, skip the rest of this point. Otherwise, try working on a single branch (if needed, google for branching strategy "development branch" or branching strategy "unstable trunk" for details). Or, if you use Perforce, search the web for Microsoft's guidelines on branching and merging. Try - did I say that? Sorry, the appropriate word is test: that is, 1) plan when and how to measure whether a single branch works better than the one you have now, and 2) plan when and how you will switch back to feature branches if this test fails.

PS.

You can probably find more tricks like that by searching the web for something like software project risk management.

update

<copy from comments>

I perceive frequent hot-fixes to be a symptom of a broken test pipeline - is this not the case? Either way, they require repeated releases to get the hot fixes out, making more work for the ops team. In addition, hot fixes are usually coded under extreme time pressure, meaning they will likely be of lower quality than normal work.

</copy from comments>

Last-minute hot-fixes -- the concerns above look reasonable to me, as does your reference to a broken test pipeline. With this update, your earlier note that new code integration is blocked on Monday sounds like one more symptom of a broken (a more precise word would be contended) pipeline. By contention I mean the following: you use a single branch to concurrently serve two purposes, integration and release. As a release approaches, these two purposes start clashing, pushing for conflicting requirements: integration is best served by a continuously open branch (Merge Early And Often), while release stability benefits from the branch being sealed (isolated) for as long as possible. A-ha, it looks like the puzzle pieces are starting to fit...

Look, that Monday freeze now looks like a compromise made to serve conflicting purposes: developers suffer from the block on integrating new code, while testers suffer from the block being too brief; everyone is somewhat unhappy, but both purposes are served more or less.

You know, given the above I think your best bet would be to try releasing from a dedicated branch (other than integration). Whether this branch is long-lived like integration or short-lived like your feature branches (with the "feature" being, well, the release) is up to you; it just has to be separate.

Just think about it. Currently you find one day is not enough to comfortably stabilize a release, right? With the new branching strategy, you can simply fork two days before the release instead of one, no problem. If you find that even two days is not enough, try forking three days before, and so on. The point is, you can isolate the release branch as early as you want, because doing so no longer blocks merging new code into the integration branch. Note that in this model there is no need to freeze the integration branch at all - your developers can use it continuously: Monday, Tuesday, Friday, whatever.
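In git terms the model looks roughly like this (a sketch only; the branch names, the two-day lead and the commit id are placeholders):

    # Sketch of the dedicated release branch: fork it early and keep
    # integration open. Branch names and the fork lead time are illustrative.
    import subprocess

    def git(*args):
        subprocess.run(["git", *args], check=True)

    # Sunday (two days before the release): isolate the release candidate.
    git("checkout", "integration")
    git("checkout", "-b", "release/week-34")

    # Integration never freezes - developers keep merging feature work.
    git("checkout", "integration")
    git("merge", "--no-ff", "feature/story-456")  # hypothetical story branch

    # Only fixes requested by the testers land on the release branch.
    fix_sha = "0000000"                           # placeholder commit id
    git("checkout", "release/week-34")
    git("cherry-pick", fix_sha)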

The price you pay for this happiness is that hotfixes become more complicated. They would have to be merged into two branches instead of one (release + integration). This is what you should focus on when testing the new model. Track everything related to it - the extra effort spent merging to the second branch, the effort spent on the risk that someone forgets to merge to the second branch - everything.
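The double merge itself is mechanical - roughly like this (again a sketch, with a placeholder commit id):

    # Sketch of the hotfix cost in the two-branch model: the fix ships from
    # the release branch and must also be ported to integration.
    import subprocess

    def git(*args):
        subprocess.run(["git", *args], check=True)

    # The fix is committed on the release branch first - that is what ships.
    git("checkout", "release/week-34")
    # ... commit the hotfix here ...
    hotfix_sha = "0000000"                        # placeholder: sha of that commit

    # Then port the same fix to integration; forgetting this step is exactly
    # the risk worth tracking while you evaluate the model.
    git("checkout", "integration")
    git("cherry-pick", hotfix_sha)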

At the end of the test, aggregate what you tracked and see whether the extra effort is acceptable. If it is, you're done. Otherwise, switch back to your current model, analyze what went wrong, and start thinking about how else you can improve.

update2

<copy from comments>

My aim is to get stories tested and deliverable (behind or in front of a config wall) within an iteration; this can only be achieved if the testers are testing work performed in-iteration (and not stabilizing code from the previous iteration).

</copy from comments>

I see. Well, I don't have direct experience with that approach, but I have seen in-iteration testing done successfully in a project related to ours. Since our project followed the opposite approach, I also had the luxury of a side-by-side comparison of the two.

From my perspective, the out-of-iteration testing approach looked superior in that race. Yes, their project went fine and their testers detected bugs faster than ours, but somehow that didn't help. Our project went fine too, and somehow we could afford shorter iterations than them, we had fewer (much fewer) slipped releases, and there was less tension between developers and testers on our side.

BTW, despite faster detection on their side, we managed to have about the same average bug life span (life span being the time between introduction and fix, not between introduction and detection). We probably even had a slight edge here, since with shorter iterations and fewer slipped releases we could claim that, on average, our fixes reached users faster than theirs.

Summing up, I still believe that isolating the release codeline has the better chance of improving your team's productivity.

on further thought...

Isolating the release codeline has the better chance -- on re-reading, I feel this might give the impression that I am discouraging you from trying in-iteration testing. I'd like to make it perfectly clear that I am not.

In your case, the in-iteration testing approach looks safe to try (er... test), because you seem to have a clear understanding of how to achieve it (a smooth test pipeline) and what the major obstacles are. And after all, you always have the option of falling back to the alternative approach if you find it too hard to get that pipeline right.

BTW, regarding obstacles: additional ones worth tracking in that case are issues like failure to reproduce a bug on the dev side, and finding the bug late / verifying the fix late on the testers' side. These can stall your pipeline too, just as hotfixes do now.

Thanks for your insights. Regarding branching, we have tested a number of approaches (and indeed I have used several at different organizations in my career). We have settled on a clean master that represents the code on production, and an integration branch (based off master) that all developers pull into frequently (multiple times a day, ideally). The integration branch is built and tested continuously, with frequent automated staging deployments. I have tried a dirty mainline with great success before; our current approach seems more controlled. We use configuration walls for incomplete or unwanted functionality.
– Ben, Aug 16 '11 at 8:42


@maple_shaft Well, the first time I saw bugs opened in a tracker against the test suite was in 2002 or 2003, and it seemed to be quite an established practice in the team I joined back then. As for bugs targeting differences between prod and staging, those do seem novel to me, since the first time I saw it (and was really surprised) was less than two years ago.
– gnat, Aug 16 '11 at 11:24


@gnat It seems like common sense, which is why I wonder why I haven't heard of it before. Now that I think about it, though, it makes sense, because every QA group I ever worked with seemed perfectly happy to dish out bugs but became whiny two-year-olds whenever bugs were brought against them.
– maple_shaft♦, Aug 16 '11 at 11:28


@maple_shaft lol, agreed - this practice seems undeservedly rare. Did you know, by the way, that one can file bugs not only against testers but against doc/spec writers too? - Build 12 of the Dev Guide says "black" at page 34, line 5; should be "white". - Assigned to John Writer. - Fixed in build 67. - Fix verified in build 89 by Paul Tester.
– gnat, Aug 16 '11 at 11:53


My last response, as I don't want this to turn into a chat session, but at my last org I wrote a bug against a spec writer and the entire division recoiled in a WTF moment. I was promptly told that I had an "attitude problem", that I was not a "team player", and not to do it again.
– maple_shaft♦, Aug 16 '11 at 12:16

Without knowing the nature or number of the user stories, I have to say that a one-week release cycle seems extreme. The scenario you describe is intricately planned and involves a series of different branches, merge points, hand-offs, environments, and test suites, more or less creating a human system where a single mistake amid the complexity of the plan can cause a late release or bad quality. This can have a domino effect on subsequent releases.

IMHO the schedule is just too damn tight.

You can increase code coverage by writing more effective unit tests as well as environment-specific integration tests (see the sketch after these suggestions).

You can increase code quality by introducing pair programming and/or code review, although that eats up even more precious time.

Better estimation of user story points can also help by implicitly limiting the number of user stories that go into a single release, thereby reducing the amount of risk each release carries.
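On the environment-specific integration tests mentioned above, here is a sketch using pytest (the settings module and the expected values are made up) that pins a configuration value so that staging and production cannot silently drift apart:

    # Sketch of an environment-specific integration test (pytest). The settings
    # module and the expected values are made up for illustration.
    import os
    import pytest

    TARGET_ENV = os.environ.get("TARGET_ENV", "staging")

    @pytest.mark.skipif(TARGET_ENV not in ("staging", "production"),
                        reason="only meaningful against a deployed environment")
    def test_database_pool_size_matches_expectation():
        from myapp import settings               # hypothetical settings module
        # Differences like this are a classic source of "subtle" bugs that only
        # show up on production; pinning them keeps the environments honest.
        expected = {"staging": 20, "production": 20}[TARGET_ENV]
        assert settings.DATABASE_POOL_SIZE == expected

Running the same suite with TARGET_ENV switched between staging and production is one way to catch the "subtle" production-only bugs you mention before they reach users.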

On the whole, it sounds like you have good practices in place and you have a good system to handle your extreme release cycle. You seem to be on the right path.

@ZJR, please can you expand upon what you mean by extenuating in this context?
– Ben, Aug 16 '11 at 8:33

@Duncan Our iterations are two weeks, but we are trying single-week increments. This may or may not be possible (or may be a bad idea). The thinking is that a single-week increment will contain less new code and hence fewer problems.
– Ben, Aug 16 '11 at 8:36