I'm not sure whether this question is a duplicate. I couldn't find anything similar even though it seems to be a common issue, so if it is a duplicate, excuse my inefficiency at searching for keywords outside my knowledge.

I'm a software developer in a small team and I'm trying to help my Project Manager review our deploy process to make it better.

Our product is made up of multiple components with different update cycles; it is hard to manage, and features often have hidden side effects in other parts of the code. After features are developed, they are left untouched until tested, either by our QA in full regression tests that can last a week or more, or by our customer herself. Months can pass before a feature is ready to be deployed. We can have many features "on hold" at the same time, at different development or testing stages, and the biggest features can wait years of refinement before being deployed.

Of course, when features are deployed they are merged together, and the mixture can often be... explosive. This leads to bugs in the customer's production environment, which in turn raises suspicion about the quality of our code and incites calls for "more testing time", which actually means more waiting time and more features to be deployed next...

Continuous Deployment seems to be unfeasible because our customer (with good reason) wants to oversee deploys, and her deploy times are long. How can we reduce the risk of interference and side effects from multiple merges? I think this is too basic an issue not to have been addressed before.

What kind of tests do you have in place? Is everything manual? Unit tests? Integration tests that test the combinations of modules?
– RubberDuck Apr 17 '18 at 1:02

We have some unit tests, but the coverage is maybe between 20% and 40%. What is seen as "true" testing is the manual regression test, which as I said takes up to a week. We also have integration tests, but these are limited and often the mocks are not 100% "faithful" to the actual behaviour of the mocked component (yes, it is a messy situation)
– phagio Apr 17 '18 at 8:11


Fix that. Work around the time delay issue until you can build up your testing strategy. This is about trust right now. You need to get to a place where you're confident in the current version of the code before you have any hope of gaining your client's confidence. Build your test suite and eventually there won't be a need for that week-long manual testing session.
– RubberDuck Apr 17 '18 at 10:50

3 Answers

If I understand you correctly, you have only one team, and the only/main reason you operate this way is that QA takes so long and you don't want to "twiddle your thumbs" while waiting for their feedback.

If that is the case, then your first priority is investing heavily in automating as many of your tests as possible. You might not be able to get rid of the week-long manual testing. Some companies are quite anal about that (cough GMP cough). But your problem is not that there is this week-long voodoo ritual in your deploy pipeline. Your problem is that currently you get most of your bug feedback from it.

Especially with just one team working on a product you'll want continuous integration. You want to hear about your problems as quickly as possible. And that means automated tests. You want to stamp out a feature, polish it until everything works and then be able to forget about it.

This means you need to write automated tests for the bugs your testers find. What is killing you now is having to wait for those "explosions" 2 months down the line when two features are merged. If you had the ability to mix those two ingredients as soon as both are finished and get immediate results then you could start fixing immediately too.
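As a rough sketch of what "an automated test for a bug QA found" can look like: the snippet below uses pytest and purely hypothetical names (Order, apply_discount), since I don't know your stack; the point is that once the bug is pinned down by a test, the next merge can't silently reintroduce it.

```python
# test_discount_regression.py -- a minimal sketch, not your actual product code.
# Order and apply_discount are hypothetical stand-ins for whatever module the
# QA-reported bug lives in. Run with: pytest test_discount_regression.py

from dataclasses import dataclass


@dataclass
class Order:
    total: float
    is_loyal_customer: bool


def apply_discount(order: Order) -> float:
    """Return the payable amount; loyal customers get 10% off."""
    if order.is_loyal_customer:
        return round(order.total * 0.9, 2)
    return order.total


def test_loyal_customer_gets_ten_percent_off():
    # Regression test for a (hypothetical) QA-reported bug where two merged
    # features applied the discount twice. It now fails the moment that
    # behaviour comes back, instead of weeks later in manual regression.
    assert apply_discount(Order(total=100.0, is_loyal_customer=True)) == 90.0


def test_regular_customer_pays_full_price():
    assert apply_discount(Order(total=100.0, is_loyal_customer=False)) == 100.0
```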

Even if you have multiple teams, if you are able to merge and "pre-test" your feature with the most recent fully tested version, you should encounter far fewer situations where you need to revisit a feature after the "real testing" is done.

Once you can stamp out features that are essentially ironclad, it no longer matters whether QA needs another week to bless your new version or not.

I guess this also means strong domain knowledge expectations, but it seems to me this is probably the right way to go. Having a feature fully documented in unit and integration tests has its fair share of advantages, more so if the test suite is continually extended with tests for fixed bugs and edge cases. The downside is that the customer should be educated in the importance of automated testing, and it would take a long time to achieve even 80% code coverage... but it's really just a matter of priorities, probably.
– phagio Apr 18 '18 at 7:35

@phagio I don't understand. The automated tests should have nothing to do with the customer.
– Kempeth Apr 18 '18 at 9:53

@phagio, the customer doesn't have to be educated in the importance of automated testing, but in the fact that you are improving your quality processes and that feature delivery might take a bit more time. And it would be good to involve QA here as well (at least those people that write/specify the tests).
– Bart van Ingen Schenau Apr 18 '18 at 11:05

@Kempeth They have nothing to do with the customer, but they help in dealing with the delay in the models proposed by Eva (specifically model II), since the reaction time (prolonged by full regression tests) becomes much shorter and one can afford to leave branches unattended.
– phagio Apr 19 '18 at 10:07

@BartvanIngenSchenau Yes, this is more "business-y" than I can handle right now, because I guess customer management also involves understanding her priorities and giving her solid reasons why spending X time and money now will save the same amount and more in the long term.
– phagio Apr 19 '18 at 10:11

I see two ways out that may help. Let's say you have four features (A, B, C, D) planned, and the ETA for them is:

A May-01

B May-15

C May-25

D Jun-01

Each feature is developed in its own branch. Each branch starts from master. Production server has the code of your master branch.

I. Plan builds in advance

Your customer decides that she wants to have features A and C in the next release. So once C is ready, on May-25 you merge and deploy A and C to the build server and QA tests the build. After the build is tested, bugfixed and reviewed by the customer it goes live. A and C are now in the master branch. The production server has the same code as the build server had. So you won't have any surprises.
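To make model I concrete, here is a minimal sketch (assuming Git; the branch names and release name are hypothetical) of how the May-25 build could be assembled so that the build server and, later, production run exactly the same merged code:

```python
# assemble_release.py -- a sketch of "Plan builds in advance": merge the
# chosen feature branches (here A and C) into a release branch that the
# build server deploys. Branch names are hypothetical placeholders.

import subprocess


def git(*args: str) -> None:
    """Run a git command and fail loudly if it does not succeed."""
    subprocess.run(["git", *args], check=True)


def assemble_release(release_branch: str, feature_branches: list[str]) -> None:
    git("checkout", "master")
    git("pull", "--ff-only")
    git("checkout", "-b", release_branch)
    for branch in feature_branches:
        # --no-ff keeps each feature visible as its own merge commit;
        # any conflict surfaces here, before QA starts, not in production.
        git("merge", "--no-ff", branch)
    git("push", "origin", release_branch)


if __name__ == "__main__":
    assemble_release("release/2018-05-25", ["feature/A", "feature/C"])
```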

II. Plan the builds from what's ready

Once each feature is finished, it's tested on its own feature server. A is tested and ready to go live. B is tested, needs some bugfixes, or goes to the kingdom of 'on hold'. C is tested and ready. D is in progress. The customer decides whether she's okay to push A and C live or wants to wait for D. If she's okay to deploy A and C, then you merge them and deploy to some pre-release server where QA runs smoke tests or checks the basic functionality and the customer reviews the release. And then you go live, also without surprises created by merges.
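The smoke check on that pre-release server can itself be automated. A minimal sketch, where the staging URL and the list of key endpoints are hypothetical placeholders for your own setup:

```python
# smoke_test.py -- a sketch of the pre-release smoke check. The URL and
# endpoints below are hypothetical; point them at whatever your
# pre-release server actually exposes.

import sys
import urllib.request

BASE_URL = "https://staging.example.com"          # hypothetical pre-release server
ENDPOINTS = ["/health", "/login", "/api/orders"]  # hypothetical key pages


def endpoint_is_up(path: str) -> bool:
    """Return True if the endpoint answers with HTTP 200 within 10 seconds."""
    try:
        with urllib.request.urlopen(BASE_URL + path, timeout=10) as response:
            return response.status == 200
    except OSError:
        return False


if __name__ == "__main__":
    failures = [path for path in ENDPOINTS if not endpoint_is_up(path)]
    if failures:
        print("Smoke test failed for:", ", ".join(failures))
        sys.exit(1)
    print("All smoke checks passed; release candidate looks deployable.")
```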

The second scheme is more flexible, as you don't block QA or the release schedule.

Hi Eva, thanks for the answer. The second solution does indeed seem more viable, but it has the downside of slowing down the release process. Imagine that the customer decides one day to deploy A and C the next week. The merge and deploy to the pre-release environment can take at most a day, but then thorough tests would take time and push the deploy date further away from the expected date (and more so if we want to stay on the safe side and deploy in the first half of the week). Well, you can't have it both ways.
– phagio Apr 17 '18 at 8:44

the biggest features can wait even years of refinement before being deployed

Months can pass until a feature is ready to be deployed.

It appears there are a lot of technical and process inefficiencies beyond deployment management. Although analyzing and improving the deployment process might provide some benefit, addressing other shortfalls may be more valuable. Look into Lean Software waste (TIM WOOD).

Continuous Deployment does not have to be to a production environment. Usually frequent merges, Continuous Integration (CI), and Continuous Deployment (CD) processes are used to promote code to a non-production environment so that automated testing can be performed. It will take a time investment to improve the technical practices and infrastructure, and in doing so there will be less waste, improved quality, and increased confidence to deploy to production for the customer.
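A minimal sketch of that kind of promotion gate, with pytest and a deploy script standing in (hypothetically) for whatever test runner and tooling you actually use: only a green build ever reaches the non-production environment.

```python
# ci_gate.py -- a sketch of promoting a build to a *non-production*
# environment only when the automated tests pass. The deploy script
# (./deploy.sh) is a hypothetical placeholder for your own tooling.

import subprocess
import sys


def run(cmd: list[str]) -> int:
    """Run a command, echo it, and return its exit code."""
    print("running:", " ".join(cmd))
    return subprocess.run(cmd).returncode


def main() -> int:
    # 1. Run the automated test suite on every merge.
    if run(["pytest", "--maxfail=1"]) != 0:
        print("Tests failed; nothing is promoted.")
        return 1
    # 2. Only a green build is deployed to the staging environment.
    return run(["./deploy.sh", "staging"])


if __name__ == "__main__":
    sys.exit(main())
```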

How can we reduce the risk of interferences and side-effects from multiple merges?

Merges do not create conflicts, they only reveal them. Merging more frequently reduces the complication of those conflicts, thus they are more easily addressed.

I hate to be pedantic, but Continuous Deployment is to a production environment. Continuous Delivery need not be to prod. There is a difference, and it's a shame "CD" is the abbreviation for both of them.
– RubberDuck Apr 17 '18 at 0:47

The definitions seem to be flexible, which is problematic. devops.com: Delivery is the enabling pipeline; Deployment is product to any endpoint. atlassian.com and amazon.com: Delivery has a manual production gatekeeper; Deployment does not.
– Alan Larimer Apr 17 '18 at 11:59

Based on the definitions, deliver would apply to code being put in any location and deploy would apply if code were dispersed across multiple locations.
– Alan Larimer Apr 17 '18 at 12:12