Monday, December 3, 2018

Bugs happen. You could simply fix them, or you could also take an
extra step to prevent similar mistakes from occurring again. This
25-minute process will do that.

We will go into the philosophy and reasons behind this in other
articles.

1. When to do it

Do safeguarding right after you’ve fixed a bug, ideally the same day or
the next. This is when it’s fresh in your mind and when improving the
system still feels relevant.

The key people who need to be in the room are those who:

understand what happened and why

wrote the bug

detected the bug

fixed the bug

can approve the time required to work on the fixes (typically the project manager)

might resist the proposed remediation

2. Root Cause Analysis (RCA)

We are going to gather impartial observations about what happened.

Create a Google Doc that everyone can access, with the
following tree:

What caused us to write the bug?

Why didn’t it get caught sooner?

What made it hard to fix?

Everybody is going to start adding nodes under these three
headings. They will also add questions in response to the nodes. All of this is
done at the same time by all attendees without any talking.

This section is timeboxed to 10 minutes.

3. Vote

We will do a version of dot voting. Everyone will vote on as many
items as they want but no more than once per item. Voting is done by putting
your initials at the front of any item.

Example:

What caused us to write the bug?

[JW, LF] Requirements were unclear

[LF] Name of a function lies about what it does

Why didn’t it get caught sooner?

No automated tests

[JW] Hard to write automated tests for this section of the code

What made it hard to fix?

Logs were too verbose, so we didn’t see what was going wrong

[JHB, JW] Hard to redeploy site

After voting, copy the 3-4 items with the most votes into a new section labeled: Remediations

This section is timeboxed to 3 minutes.
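
If the doc gets long, the tally can be scripted. The following is a minimal Python sketch, not part of the process itself. It assumes the items have been pasted in as plain strings in the `[initials] item` format from the example above; `count_votes` and `top_items` are hypothetical helper names invented for this illustration.

```python
import re

# Hypothetical input: the example items above, one string per item.
doc_lines = [
    "[JW, LF] Requirements were unclear",
    "[LF] Name of a function lies about what it does",
    "No automated tests",
    "[JW] Hard to write automated tests for this section of the code",
    "Logs were too verbose, so we didn't see what was going wrong",
    "[JHB, JW] Hard to redeploy site",
]

# Matches an optional "[AB, CD]" prefix followed by the item text.
VOTE_PREFIX = re.compile(r"^\[([^\]]*)\]\s*(.*)$")


def count_votes(line):
    """Return (item_text, vote_count) for one item line."""
    match = VOTE_PREFIX.match(line.strip())
    if not match:
        return line.strip(), 0
    # A set enforces the "no more than once per item" rule.
    initials = {name.strip() for name in match.group(1).split(",") if name.strip()}
    return match.group(2), len(initials)


def top_items(lines, limit=3):
    """Return up to `limit` voted items, most votes first."""
    tallied = [count_votes(line) for line in lines]
    voted = [(item, votes) for item, votes in tallied if votes > 0]
    # sorted() is stable, so tied items keep their order in the doc.
    return sorted(voted, key=lambda pair: pair[1], reverse=True)[:limit]
```

Calling `top_items(doc_lines, limit=3)` on the example returns "Requirements were unclear" and "Hard to redeploy site" with two votes each, followed by the first of the one-vote items. How to break ties is left to the group; this sketch simply keeps document order.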

4. Budget

Before we come up with solutions, start by asking: was the total impact of this bug small, medium, or large? Have everyone hold up 1, 2, or 3 fingers. Pick the most common answer. Then propose an initial time box:

Small = 1/2 person-day

Medium = 2 person-days

Large = 1 person-sprint

Since we execute solutions immediately, this is the point where we need the
approval of the project manager.

Because we intend to implement 3 solutions, each solution gets only ¼ of the
total budget, which leaves the remaining ¼ as slack. This means that each item
will be budgeted to:

Small = 1 hour

Medium = 1/2 day

Large = 2.5 days

This section is timeboxed to 2 minutes.
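
The arithmetic behind these numbers can be sketched in a few lines of Python. The per-item figures above only work out under two assumptions that this sketch makes explicit: an 8-hour working day and a 10-working-day sprint. The names `pick_size` and `per_item_hours` are hypothetical, chosen for this illustration.

```python
from collections import Counter

HOURS_PER_DAY = 8     # assumption: 8-hour working day
DAYS_PER_SPRINT = 10  # assumption: two-week sprint of 10 working days

# Total time box per impact size, in hours.
TOTAL_BUDGET_HOURS = {
    "small": 0.5 * HOURS_PER_DAY,                # 1/2 person-day
    "medium": 2 * HOURS_PER_DAY,                 # 2 person-days
    "large": DAYS_PER_SPRINT * HOURS_PER_DAY,    # 1 person-sprint
}

SIZE_BY_FINGERS = {1: "small", 2: "medium", 3: "large"}


def pick_size(finger_votes):
    """Return the size matching the most common number of fingers."""
    fingers, _count = Counter(finger_votes).most_common(1)[0]
    return SIZE_BY_FINGERS[fingers]


def per_item_hours(size):
    """Each of the 3 solutions gets 1/4 of the total; 1/4 stays as slack."""
    return TOTAL_BUDGET_HOURS[size] / 4
```

For example, `pick_size([2, 2, 3, 1, 2])` gives `"medium"`, and `per_item_hours("medium")` gives 4 hours, which is the half day listed above. With a different sprint length the large budget changes accordingly.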

5. Identify Remediations

Next we are going to brainstorm improvements to our system to reduce the
chances of this issue happening again. Going back to the Google Doc, everyone
will silently add ideas to the Remediations section, under the items that we
copied from the top brainstorming section.

Solutions that require extra discipline are bad solutions. We are looking for
ways to make success easier.

Keep in mind that these are timeboxed solutions meant to improve our system and
environment, not to solve everything. Small improvements often yield big
returns, and if the problem persists, we will get another chance to do a
safeguarding in the future.

Example:

Remediations

Requirements were unclear

Bring users into our grooming sessions

Earlier usability tests

Do earlier demos

Hard to redeploy site

Write down checklist of deployment steps

Automate build/test/upload sequence

Get a second server to deploy blue/green

After 7 minutes, everyone votes again, just as we did in step 3.

This section is timeboxed to 10 minutes [7 brainstorming, 3 voting].

6. Add items to task board and do them!

Because safeguarding already has time approved, work on the items immediately.

Nothing we’ve done so far matters if we don’t implement any of the solutions.
To make sure that useful action comes from this exercise, add the items to the
task board right away (the budget was already approved in step 4) and start
working on them. Remember that these items are timeboxed and are not meant to
completely prevent the problems in the future, but rather to lessen the chances.

Final Notes

It is important to remember that safeguarding is a skill. The
first time you do it, be patient and give yourself extra time, maybe an hour.
Also, remember to practice it regularly, usually once per week, as you will get
better at doing the process and finding good remediations.