Continuous Delivery - It’s Not All about Tech!

Key Takeaways

People and communication issues can add hours, or even days, to your release cycles

Visualise your system to see where the problems and bottlenecks are

Learn to observe objectively - be aware of your biases and opinions

Use data on how your system is working to focus improvement efforts

The Toyota Kata does help focus your process improvement efforts

Like many organisations, we were on a journey to continuous delivery. There was the obvious work needed to build the automated testing and deployment pipelines; however, we felt that there were other factors contributing to why our releases were somewhat stressful and bumpy. We just weren’t exactly sure what the problems were.

I started observing our release process and mapping what I saw. Objectively observing all stages of our releases in action allowed me to measure our release process. For example, I recorded the time it took to complete each step of the process, whether manual or automated, the number of people involved from different roles, and the point at which we would discover an issue that blocked the release. Once we had enough data, we started analysing it and looked for the bottlenecks - these became our most important problems to solve.
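As an illustration only, observations like these could be recorded and summarised with a few lines of Python. This is a minimal sketch: the `StepObservation` type, the `slowest_steps` helper, and all step names and figures are hypothetical, invented for the example rather than taken from the releases described here.

```python
from dataclasses import dataclass


@dataclass
class StepObservation:
    """One observed step of a single release (all values hypothetical)."""
    step: str
    hours: float      # elapsed time observed for the step
    people: int       # number of people involved
    automated: bool   # whether the step was automated or manual


def slowest_steps(observations, top=2):
    """Return the `top` steps with the longest observed elapsed time."""
    return sorted(observations, key=lambda o: o.hours, reverse=True)[:top]


# Illustrative data, not real measurements.
release = [
    StepObservation("select release candidate", 3.0, 4, False),
    StepObservation("deploy to staging", 1.5, 2, False),
    StepObservation("regression testing", 16.0, 7, False),
    StepObservation("deploy to production", 0.5, 1, True),
]

for obs in slowest_steps(release):
    print(f"{obs.step}: {obs.hours}h, {obs.people} people")
```

Sorting by elapsed time is only one possible view; the same records could just as easily be ranked by the number of people involved or filtered to the manual steps.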

It’s important to be objective when observing, and not to let your own opinions and biases cloud the facts of what’s actually happening, or not happening! Objective observation is a skill well worth practising - it’s the basis of Gemba and experimentation, so it is very valuable when trying to improve your processes and systems.

Why focus on improving our release process?

It was simple, really. Our releases weren’t as smooth, frictionless or frequent as we wanted them to be, and that impacted our ability to deliver. Although my role at the time was Exploratory Tester in a cross-functional team, I have always been interested in improving the systems we work in. It was a natural step to get involved in improving our release process.

The good thing about working with technologists is that they respect data and facts, and are very keen to make things better! As I had collected a lot of data that showed where our problems actually were, it meant we focused on fixing the real problems rather than working on assumptions. For example, the data showed that we were deploying far fewer kits to our test staging environment than we thought we were, so we dealt with the problems that were stopping us from deploying them easily.

Stabilize the release - not everything has to be released now

Having discussions about whether something was really important enough to release was hard. The problem was that often, when we wanted to pick the release candidate for the week, there would be last-minute “urgent” changes that teams felt needed to be included for a variety of reasons. We had around eight teams at the time, so this was a real problem for us. It would mean either delaying the selection of the release candidate, with knock-on effects on the release cycle, or rushing the work, which can sometimes impact quality. Basically, the stress was being caused by only being able to deploy to production once a week, and until we could deploy more frequently, we would continue to have that weekly stress unless something changed.

I started to ask questions like, “Why does it have to go out in this release?”, “What happens if we don’t release the change this week?”, or “Can it wait until the next release?”. It was quite hard having these conversations, as it was already a stressful situation for the team and I may have been viewed as preventing them from getting their changes out. It did, in fact, feel counterintuitive to be saying that a change can wait another week when we were trying to release more frequently. The key thing at the time though was to stabilize the releases.

By having the discussion and asking those questions, we realized that some of the changes weren’t urgent at all and therefore didn’t need to be rushed. The overall effect of that was our releases were smoother, with fewer patches to fix escaped bugs. Obviously, there was less stress too! We could then focus on releasing more frequently.

Using data to identify bottlenecks and change habits

Changing habits is hard. A habit is something you do automatically, so you have to work at stopping the old habit and replacing it with a new one. Publishing the data that highlighted the problems in our release cycle helped create an acceptance of the problems, plus a will to fix them.

We tried a few things to help us form good habits. For example, we had hours of delay because the people involved in the release process used email as their primary communication method. An email would get sent, a message communicated, job done. However, until the recipient has read and understood the email, you haven’t communicated anything. If they are in a meeting, or only check their emails once or twice a day (a good habit!), then it could be hours before they see it. To quote a friend, Rob Lambert (@Rob_Lambert), “communication is in the ear of the listener”.

To change this email habit, we introduced physical handovers for a while. We bought a Hollywood clacker board, wrote the release candidate version number on it, then passed it on to each other like a relay baton. If it was your turn to have the clacker board, you knew that the whole release cycle was waiting for you to do something! That’s quite motivating to actually go and do whatever it is you are supposed to do! It also helped create the sense that getting the release out was the most important thing.

There was another purpose to the clacker board though - to get people talking face-to-face, to get to know each other and start to feel like a team. It was gimmicky, fun, and made us laugh. It did the job - it helped us eliminate the queues in our process caused by poor communication, made us work as a team, and helped us form a much better communication habit.

In summary, work on replacing an ineffective habit with a more useful one - be creative and find fun ways to get people involved. Habit stacking is another good approach: you add the new habit you want to create on top of a good existing habit. I highly recommend Helen Lisowski’s “Power of Good (Agile) Habits” workshop, if you’d like to know more (@HelenLisowski).

Experimenting with the Toyota Improvement Kata

When I first started observing the releases with a view to improving them, I didn’t really know about the Toyota Improvement Kata! I’m lucky in that I work with very experienced Agilists, Lean and System thinkers, so I soon learnt. As it turned out, my whole approach of observing, collecting data, and analysing it in order to know where to focus our next experiment was, in effect, the Toyota Kata.

The Kata is about applying a scientific style of thinking to understanding your problem, what you think will happen next, and adjusting your next steps based on what actually did happen. It has four main steps:

Step 1:

Set the direction you are aiming for, your challenge, your goal, your true north. For us this is release on demand. That doesn’t mean that we release every minute; it means that we can release in whatever cadence we want to in a smooth and frictionless way.

Step 2:

Know your current state, your current condition. This is where that data we collected and analysed came in. We basically had a Value Stream Map in spreadsheet form.

Step 3:

Establish your next target condition, i.e. your first milestone. Often it’s too big a jump to go from your current condition to your end goal, otherwise we would have already done it! In order to make progress, it’s helpful to identify an intermediate goal that is more achievable. For us this was reducing the release cycle from 4.5 days to 2 days.

Step 4:

Decide on and conduct experiments to get to your target condition. This is where the data analysis came in again. The data showed us our biggest problem areas and where our queues were. I think of queues as dead time - nothing is happening, we are just waiting for the next bit of the process to happen. We focused our first experiments on eliminating or at least reducing those queues.
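To make the idea of queues as dead time concrete, here is a minimal sketch of how waiting time between steps could be computed from start and end timestamps. The `queue_times` helper, the step names, and the timestamps are all hypothetical and chosen purely for illustration.

```python
from datetime import datetime


def queue_times(steps):
    """Given (name, start, end) tuples in process order, return the waiting
    time in hours between the end of each step and the start of the next."""
    waits = []
    for (name, _, end), (next_name, next_start, _) in zip(steps, steps[1:]):
        hours = (next_start - end).total_seconds() / 3600
        waits.append((f"{name} -> {next_name}", hours))
    return waits


# Hypothetical timestamps for one release; not real measurements.
steps = [
    ("build kit", datetime(2024, 1, 8, 9, 0), datetime(2024, 1, 8, 10, 0)),
    ("deploy to staging", datetime(2024, 1, 8, 14, 0), datetime(2024, 1, 8, 14, 30)),
    ("regression testing", datetime(2024, 1, 9, 9, 30), datetime(2024, 1, 9, 17, 30)),
]

waits = queue_times(steps)
longest_wait = max(waits, key=lambda w: w[1])
print(f"Longest queue: {longest_wait[0]} ({longest_wait[1]:.1f}h)")
```

The sketch measures raw elapsed time, so an overnight gap counts as queue time. Whether to exclude non-working hours is a judgment call that depends on what you want the number to show.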

Using the Improvement Kata way of thinking helped us pinpoint what change we wanted to make next, resulting in us changing some habits and automating manual steps. For example, the clacker board experiment to improve communication that I mentioned earlier was one we devised via the Improvement Kata.

Another example of a change that came about via the Improvement Kata is automating the deployment of our green kits to our test staging area. It sounds like an obvious thing to automate (and it is); however, at the time we thought that it was mainly broken builds that were holding up the deploys to staging, rather than the deploy being a manual process. Collecting objective data that showed our current state (step 2) highlighted that, actually, we weren’t deploying most of our green kits at all! Being a manual process, it suffered from people not being available, or people being distracted by something else. Automating that deploy became the next action (step 4).
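A check of this kind can be as simple as counting how many green kits actually reached staging. The sketch below assumes a hypothetical build log where each entry records whether the kit was green and whether it was deployed; the field names, the `deployed_share` helper, and the data are illustrative only.

```python
def deployed_share(builds):
    """Fraction of green (passing) kits that were deployed to staging."""
    green = [b for b in builds if b["green"]]
    if not green:
        return 0.0
    return sum(1 for b in green if b["deployed"]) / len(green)


# Hypothetical build log; not real data.
builds = [
    {"kit": "1.0.1", "green": True, "deployed": False},
    {"kit": "1.0.2", "green": False, "deployed": False},
    {"kit": "1.0.3", "green": True, "deployed": True},
    {"kit": "1.0.4", "green": True, "deployed": False},
    {"kit": "1.0.5", "green": True, "deployed": False},
]

print(f"{deployed_share(builds):.0%} of green kits deployed")
```

Comparing this figure with the share of broken builds is what separates the assumed cause (red builds) from the observed one (green kits that were never deployed).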

Benefits from improving the non-tech side of releases

The obvious benefit has been the reduction in cycle time. With a 4.5 day cycle time, we just couldn’t release more than once a week. After running several experiments and, of course, working on automating the delivery pipeline, we more than halved that and could release in a day if we pushed ourselves.

Some other important benefits include:

Better communication and working relationships

Automating deploys to the test staging environment meant teams could finish their story testing more quickly which reduces story cycle times and the feedback loop

Improving our worst performing automated tests

Reducing the cost of regression from 6-8 people for up to 2 days, to 1 person for around 30 mins

Fewer patches to fix escaped defects.
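To put the regression figures above on a common scale, they can be converted to person-hours. This is a rough sketch that assumes an 8-hour working day, which is not stated anywhere in this article, and treats "up to 2 days" as an upper bound.

```python
HOURS_PER_DAY = 8  # assumption: an 8-hour working day


def person_hours(people, hours_each):
    """Total effort when `people` each spend `hours_each` hours."""
    return people * hours_each


# Before: 6-8 people for up to 2 days (upper bound on effort).
before_low = person_hours(6, 2 * HOURS_PER_DAY)
before_high = person_hours(8, 2 * HOURS_PER_DAY)

# After: 1 person for around 30 minutes.
after = person_hours(1, 0.5)

print(f"Before: up to {before_low}-{before_high} person-hours")
print(f"After: about {after} person-hours")
```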

Some lessons learned

Here are a few things I learnt:

Having data that show patterns and trends counts. It means you can really show where the problems are and also show how the changes you make improve things, or don’t as the case may be.

Objective data and analysis are very useful when trying to convince management to invest time and resources into a project! Those graphs are powerful!

Whilst CD efforts are primarily focused on technology and automating pipelines etc, the people side of things can also have quite a dramatic effect on your cycle times. Find out where your bottlenecks are.

Open plan offices and easy access to colleagues involved in the release process do not mean they communicate well or work as a team

I don’t have all the good ideas! Ask for help, get your colleagues involved (thanks Gemma Lewington for the Hollywood clacker idea).

It’s easy to get caught up in the technical side of continuous delivery and only focus on that. After all, that’s where most of the advice and skills development resources are focused. It’s obvious, and of course it needs to be done. However, do take a look at your overall release cycle. Depending on where you are in your pipeline automation journey and how long that journey is going to take you, there may be other non-tech factors hindering your releases in the meantime. Understand all the steps in your release cycle processes, find the bottlenecks and queues, make sure your communication methods are effective, and that all the people involved are genuinely working together well.

About the Author

Sylvia MacDonald started her career as a developer, became a software tester and is now an engineering manager at NewVoiceMedia. She has delved into other industries such as retail customer services and Montessori education. She is passionate about building in quality, helping teams understand business agility, and improving work flows by spotting problems and helping to remove them. MacDonald hosts and organises the Reading Tester Gathering Meetup group. You can find her on LinkedIn, @Sylvia_MacD, and sometimes presenting at conferences.