Lean, Scrum and Flow Product Development

Tag Archives: #Lean

This blog reflects my learning process, experiments, and personal experience helping software teams. I have worked with many exceptional professionals who have suffered from my mistakes and have responded by delivering working software, putting more effort into software quality and even more energy into trying new things. Several years ago I committed myself to understanding what software development actually is and how to help those professionals do their best. I think this blog is a step in the right direction: tirelessly, step by step, I pay back what I owe to them and to the agile community.

If you are wondering how my personal purpose and my unpaid debt are related to productivity, keep reading. I am going to start by describing a team as a

“network of interconnected work”

Team members, who are the nodes of the network, transform, exchange and convert raw information into value for customers. An important characteristic of these networks is that work has dependencies between nodes: one event or sequence of events must take place before another, yet the sequence is not predictable. This means that my productivity depends on many other nodes of the network.

In my opinion, many organisations ignore this remarkable characteristic and assess each employee’s productivity individually, without considering the environment. These organisations tend to avoid measuring the productivity of the network as a whole.

“You cannot improve what you cannot see.”

The situation becomes more and more unfair when employees have neither the control nor the authority to change how they interact with the network. W. Edwards Deming set out 14 points for management in his brilliant book Out of the Crisis. One of them reads:

Remove barriers that rob people in management and in engineering of their right to pride of workmanship. This means, inter alia, abolishment of the annual or merit rating and of management by objective (see Ch. 3).

Another side effect of productivity is busyness. When all nodes of the network are busy, the whole system loses the responsiveness and effectiveness it needs to react to the continuous changes happening around it. Busy organisations want to reach a high level of capacity utilisation and avoid idle nodes. An extreme version would be to create a buffer of work just before every node in order to avoid starvation. My untested hypothesis is that the result would be a network with poorer global performance and a longer time to market. Work has to wait in a queue until the node has free capacity to process it and dispatch it to the next node of the network, which is also terribly busy. Hence, the work must wait in a queue again.
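That hypothesis can be illustrated with the simplest textbook queueing model. The sketch below rests on strong assumptions that are not part of this post: a single node modelled as an M/M/1 queue (random arrivals, random service times) and a made-up service rate of 10 items per week. The name `mean_time_in_system` is hypothetical. In that model the average time an item spends waiting plus being worked on is 1 / (service rate - arrival rate), so it grows sharply as utilisation approaches 100%.

```python
def mean_time_in_system(arrival_rate: float, service_rate: float) -> float:
    """Average time (queueing + processing) per item at one node, M/M/1 model."""
    if arrival_rate >= service_rate:
        # At or above 100% utilisation the queue grows without bound.
        raise ValueError("arrival rate must be lower than service rate")
    return 1.0 / (service_rate - arrival_rate)


SERVICE_RATE = 10.0  # hypothetical: items one node can finish per week

for utilisation in (0.5, 0.8, 0.9, 0.95, 0.99):
    arrival_rate = utilisation * SERVICE_RATE
    weeks = mean_time_in_system(arrival_rate, SERVICE_RATE)
    print(f"utilisation {utilisation:.0%}: {weeks:.2f} weeks per item")
```

Under these assumptions, moving a node from 50% to 90% utilisation multiplies the time per item by five, and that is for one node only; in a network the delays of successive busy nodes add up.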

Not long ago a friend of mine told me that his manager wanted their teams to achieve maximum capacity utilisation and velocity. The POs then began to take features from here and there to prepare an iteration backlog based on the number of people, their skills, expertise and calendar days… It was an exaggeration, wasn’t it?

“Watch the baton, not the runner”

Productivity and Variability

Software development systems are high variability systems affected by external and internal sources of variability. External sources of variability are mostly rules, policies and events at the organisation level:

Technology: Using immature technology exposes us to bugs and to changes in that technology. This is why lean companies try to use only reliable, proven technology.

Team organisation: Continuously changing team members forces teams to reorganise and hurts their performance. People are neither replaceable nor interchangeable. Furthermore, space configuration and distance between nodes are barriers to communication. The likelihood of communicating falls dramatically when people sit more than 30 meters apart.

Knowledge or business complexity: Lack of domain knowledge to solve the customer’s problems or constant changes in their preferences are also common sources of variability.

Customer: Lack of involvement or weak support from the customer. When feedback takes too long, or comes from a proxy persona instead of the real customer and is therefore useless, we can build the wrong system.

Competitors: Competitors’ decisions affect our plans when they bring new products into the market. We have to react, which injects variability into our project plans.

Waiting for availability: The time that work sits idle waiting for other parts or nodes.

Dependencies or specialisation: A significant self-inflicted wound in organisations that encourage high levels of specialisation. This “culture” lengthens our development time and our time to market, leaving us more exposed to changes in market preferences or competitors.

On the contrary, internal sources of variability are mostly focused on individuals. Intrinsic factors such as motivation, health or safety depend heavily on how we see the world and affect our individual performance. Variability has an important effect on productivity, so we should put strong and direct effort into reducing the bad economic consequences of those variability factors in order to increase the productivity of the whole system.

Now, I am going to take a different approach and see how to assess productivity through the eyes of the Theory of Constraints (TOC). The only goal of an organisation is to make money. Eli Goldratt, who designed TOC, considered throughput a powerful metric for measuring an organisation’s performance. Throughput is the rate at which the organisation converts its inventory of products into sales.

From the TOC perspective, the performance of the development process is limited by bottlenecks, which prevent the organisation from achieving its goal. The performance of the whole system is determined by the capacity of those bottlenecks: the system performs at the speed of the slowest link in the chain. If you increase capacity utilisation on non-bottleneck nodes of your network, you are not improving the system at all.

“For any resource that is not a bottleneck, the level of activity from which the system is able to profit is not determined by its individual performance but by some other constraint within the system.” – Eli Goldratt
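The “slowest link” idea can be sketched in a few lines. This is only an illustration under assumptions of my own: the process is a strictly serial chain, each stage has a fixed weekly capacity, and the stage names and numbers are invented. The helper names `system_throughput` and `find_bottleneck` are hypothetical.

```python
def system_throughput(capacities: dict[str, float]) -> float:
    """Throughput of a serial chain is capped by its slowest stage."""
    return min(capacities.values())


def find_bottleneck(capacities: dict[str, float]) -> str:
    """Name of the stage with the lowest capacity."""
    return min(capacities, key=capacities.get)


# Hypothetical capacities in user stories per week.
pipeline = {"analysis": 8.0, "development": 5.0, "testing": 3.0, "release": 6.0}

print(find_bottleneck(pipeline), system_throughput(pipeline))

# Doubling a non-bottleneck stage leaves the system unchanged.
faster_dev = {**pipeline, "development": 10.0}
print(system_throughput(faster_dev))

# Improving the bottleneck itself is what moves the whole system.
faster_test = {**pipeline, "testing": 4.0}
print(system_throughput(faster_test))
```

In this toy chain, extra development capacity only builds inventory in front of testing, while a small improvement in testing raises the throughput of everything.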

TOC encourages you to increase the bottleneck’s capacity by improving its process, removing unnecessary work or diverting work to other areas of the system. Only if those actions do not achieve the expected results might we add more people or other resources. In any case, we aim at improving the whole system in order to increase throughput.

This is the sequence of steps required to apply TOC:

Identify the constraint

Exploit the bottleneck

Subordinate everything

Elevate the constraint

Avoid Inertia

In theory, TOC is a tool to strengthen the view of the system as a whole by implementing a global metric (throughput) and improving flow while avoiding local optimisations on resources that are not bottlenecks.

Conclusions

I hope I have given you some arguments to discuss productivity in your organisation.

Productivity has very harmful side effects for the organisation: longer time to market and waste created by busyness.

A culture of busyness reduces the responsiveness and effectiveness required to adapt continuously to change. The network and the world around us are not static but dynamic.

Identifying bottlenecks is the first step of TOC to improve throughput.

Many opportunities to improve the whole system are management’s responsibility.

Removing unnecessary work or diverting it to other parts of the system is a good way to try to improve throughput.

Adding more capacity should be our last choice.

When you prioritise an iteration backlog based on ROI and some team members are idle, take advantage of these visible signals to discuss specialisation, T-shaped skills, resource availability, team organisation and organisational culture.

This is what I have learnt so far, and I hope to write something in the future that contradicts some of the arguments made here. That would be a sign that I have learnt something new.

“Cost of delay is the language to translate value and impact to our customers into money.”

Cost of delay is the cornerstone of the economic decision making framework, which helps businesses to assess the impact of time on their products and to prioritise their scarce resources. Cost of delay puts a price tag on our features and assesses how their value decays over time.

With cost of delay, our discussions shift from the typical labor-cost mindset, in which the important question is what a feature costs, to a radically different approach in which we assess the value of a piece of work in terms of its impact on the business and customers. We model an economic scenario and treat it as real when prioritising features or products in our portfolio. Notice that we are replacing gut feeling with a more scientific model. This model is better suited to the complex adaptive system we have to deal with. We set up experiments and hypotheses using “probe > sense > respond” to learn how the system responds to the stimulus. Cost of delay is a powerful vehicle for building a single vision of the future and aligning on a common business strategy.

As we have just mentioned, cost of delay depends strongly on time, so we should describe how time affects product development. @JoshuaJames describes 3 different life cycle profiles, or urgency patterns, found in product development markets.

Short life cycle and sales peak is affected by delay.

This urgency pattern has a very short life cycle and sales are profoundly affected by delay. Consider, for example, the challenge of releasing a mobile game. As soon as the product is released, sales ramp up very fast until they reach a peak. Then sales progressively begin to decay. The life cycle is very short and the peak is affected by delay: if we release our product too late, our peak is lower because the market is already largely covered by other titles. At a certain point, when sales begin to decay, we must invest in discovering which features can help to stabilise or increase revenue. An important characteristic of this profile is that exciting features (Kano model) are quickly copied by competitors and become basic needs in future products.

Long life cycle and sales peak is affected by delay.

This life cycle profile also shows quick growth, but sales are then sustained over time. In this case, the first company to introduce the product into the market gains a competitive advantage over latecomers. The car market and the competition between airplane manufacturers are good examples of this kind of profile.

Long life cycle and sales are unaffected by delay.

This profile is the easiest one to compute because profits are sustained over a long period of time. The number of sales is not affected by when the product is released.

Once we have identified the urgency pattern, it is time to break down value and duration, the two parameters required to compute cost of delay.

The value of the product features was previously introduced here and has to be estimated considering 4 different perspectives:

Increase revenue: revenue from new delighter features (Kano model), which attract new users or appeal to current users.

Protect revenue: small improvements for which current users will not pay any extra money.

Reduce cost: improvements in our process to deliver value faster.

Avoid cost: costs that we are not incurring right now but that will occur in the future unless some action is taken.

Notice that these perspectives can be complementary; the total value is obtained by summing these 4 areas.

Let’s take a hypothetical example. A small company that released a successful instant communication tool is researching the profitability of adding new features.

Feature: As a User, I want to use voice commands to request the application to dictate messages to the receiver.

Our network has 5 million daily active users. The current license price is $10. The marketing strategy is to offer an upgrade worth $5 to current users, and we expect 10% of daily users to purchase it. We expect our immediate competitors to release their new service in 3 months, so for every month of delay we expect to lose 8% of the revenue from current active users, who would no longer pay for the upgrade, plus a 5% depreciation in the value of the network of users.

Increased Revenue:

We expect a 2% rise in new revenue from users who will pay $5 for the new service.

= 2% × 5M daily active users × $5 = $500K

Avoid Cost:

Releasing this feature late would cut revenue from current users by 8% and would devalue our network by 5% every month. This network is worth $50M today.

= f(g) current users + f(i) network of users

= 8% × 5M daily active users × $5 = $2M

= 5% × $50M = $2.5M

COST OF DELAY = $500K + $2M + $2.5M = $5M
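The arithmetic of this example can be checked with a short script. It is only a sketch: the function name `cost_of_delay` and its parameter names are invented here, and the rates are the ones used in the calculation above (2% new revenue, 8% lost revenue, 5% network depreciation).

```python
def cost_of_delay(
    active_users: int,
    upgrade_price: float,
    network_value: float,
    new_revenue_rate: float,
    lost_revenue_rate: float,
    depreciation_rate: float,
) -> dict[str, float]:
    """Break the example's cost of delay into its three components (in dollars)."""
    increased_revenue = new_revenue_rate * active_users * upgrade_price
    lost_revenue = lost_revenue_rate * active_users * upgrade_price
    network_depreciation = depreciation_rate * network_value
    return {
        "increased_revenue": increased_revenue,
        "lost_revenue": lost_revenue,
        "network_depreciation": network_depreciation,
        "total": increased_revenue + lost_revenue + network_depreciation,
    }


result = cost_of_delay(
    active_users=5_000_000,
    upgrade_price=5.0,
    network_value=50_000_000.0,
    new_revenue_rate=0.02,
    lost_revenue_rate=0.08,
    depreciation_rate=0.05,
)

for name, dollars in result.items():
    print(f"{name}: ${dollars / 1e6:.1f}M")
```

Keeping the components separate makes it easy to revisit a single assumption, for example the share of users expected to buy the upgrade, and see how the total moves.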

So, cost of delay is the amount of money we will not make if that feature is not released on time.

Duration

The amount of time required to release the feature or product to the customer is the second factor required to compute cost of delay. Notice that I prefer statistical analysis of the system’s past performance (historical data) to estimating duration.

CD3

So far, we have assessed the list of features in terms of value and duration; however, that is not enough to prioritise and maximise the economics. Product development involves features that usually differ in value, urgency and duration, so standard approaches like FIFO or LIFO are far from optimising economics. Instead, we use Cost of Delay Divided by Duration (CD3).

As you can see, the cost of developing a product or a feature is not considered when prioritising. Why? First of all, time is the most critical factor because it is irreplaceable: it cannot be replaced or reversed. Funds, on the contrary, can be obtained from external sources such as financial capitalisation. Also, cost is not a good variable to consider when making decisions because of the asymmetric payoff function of product development: cost is not proportional to the value obtained. Some research points out that only 30% or 40% of our features can provide up to 90% of the value, yet we usually consider only cost when making economic decisions. We don’t deal properly with variability, which pushes us to try to maximise economics by eliminating all choices with uncertain outcomes.

Finally, I have deliberately left out the option of adding more capacity because it is difficult to scale in certain situations, especially in later stages of development. Most of the time, adding more capacity leads to communication overload and more delays.

“If you bring new people to a product that is late, it’s likely to delay the project even more because of the increased complexity and the need for the team to adapt to its new composition”

Inspect and adapt

As our customers’ preferences change and competitors adapt their strategies, cost of delay is constantly affected. Our value model needs to be revisited and refined often. Hence, cost of delay is not a static figure, and the urgency pattern is a way to create awareness and shared understanding of the economic impact of delays.

How to prioritize

To answer this question, suppose we have a list of features with different value, duration and CD3.

| Feature | Value | Duration | Cost of Delay |
|---|---|---|---|
| Feature A | $10K | 6m | $1.6K |
| Feature B | $8K | 4m | $2K |
| Feature C | $27K | 14m | $1.92K |

The optimal scheduling decision TODAY is to deliver the feature with the highest CD3 first. So, the first feature to release would be B, then C and then A.
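The ranking above can be reproduced with a small script. It is a sketch only: the `Feature` type and the `cd3` and `prioritise` names are invented for illustration, and, following the figures in this example, the score is taken as value divided by duration (in $K per month).

```python
from typing import NamedTuple


class Feature(NamedTuple):
    name: str
    value: float     # in $K
    duration: float  # in months


def cd3(feature: Feature) -> float:
    """Score used for ranking: value divided by duration ($K per month)."""
    return feature.value / feature.duration


def prioritise(features: list[Feature]) -> list[Feature]:
    """Highest score first."""
    return sorted(features, key=cd3, reverse=True)


backlog = [Feature("A", 10, 6), Feature("B", 8, 4), Feature("C", 27, 14)]
order = [feature.name for feature in prioritise(backlog)]

for feature in prioritise(backlog):
    print(f"Feature {feature.name}: {cd3(feature):.2f}")
```

The printed scores are rounded to two decimals, so they may differ in the last digit from the truncated figures quoted in the example; the resulting order (B, C, A) is the same.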

The next 2 examples are very atypical in software development, but they are worth mentioning.

When all features have the same value but different durations, we might use shortest job first (SJF).

| Feature | Value | Duration | Cost of Delay |
|---|---|---|---|
| Feature A | $10K | 6m | $1.6K |
| Feature B | $10K | 4m | $2.5K |
| Feature C | $10K | 14m | $0.71K |

So, optimal selection would be: B, A, C.

When all features have the same duration but different value, we might sequence the work with high delay cost first (HDCF).

| Feature | Value | Duration | Cost of Delay |
|---|---|---|---|
| Feature A | $30K | 5m | $6K |
| Feature B | $20K | 5m | $4K |
| Feature C | $10K | 5m | $2K |

Optimal scheduling would be: A, B, C.
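Both special cases are simply what the value-divided-by-duration ranking collapses to when one of the two inputs is constant. The sketch below checks this with the figures from the last two examples; the function names are invented for illustration and features are plain (name, value in $K, duration in months) tuples.

```python
Feature = tuple[str, float, float]  # (name, value in $K, duration in months)


def by_value_over_duration(features: list[Feature]) -> list[str]:
    """General rule: highest value / duration first."""
    ranked = sorted(features, key=lambda f: f[1] / f[2], reverse=True)
    return [name for name, _, _ in ranked]


def shortest_job_first(features: list[Feature]) -> list[str]:
    """SJF: shortest duration first."""
    return [name for name, _, _ in sorted(features, key=lambda f: f[2])]


def high_delay_cost_first(features: list[Feature]) -> list[str]:
    """HDCF: highest value first."""
    ranked = sorted(features, key=lambda f: f[1], reverse=True)
    return [name for name, _, _ in ranked]


equal_value = [("A", 10, 6), ("B", 10, 4), ("C", 10, 14)]
equal_duration = [("A", 30, 5), ("B", 20, 5), ("C", 10, 5)]

print(shortest_job_first(equal_value), by_value_over_duration(equal_value))
print(high_delay_cost_first(equal_duration), by_value_over_duration(equal_duration))
```

With equal values the general rule and SJF agree, and with equal durations the general rule and HDCF agree, so a team only needs the general rule in practice.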

Finally, in Flow Product Development we measure throughput as the rate at which we convert inventory into sales and value delivered to the customer. Thus cost of delay can be considered a health signal of the system. All partially completed features (inventory) keep us from reaching the goal of making money.

“How much money and time do we spend on features that have not been converted into throughput?”

CONCLUSIONS

Cost of delay puts a price tag on our features in order to help you maximise economics and prioritise.

Cost of delay shifts our mindset from cost and efficiency to speed and value.

Not only cost but also probability is required to make optimal economic decisions. Cost is not always proportional to the value obtained. The asymmetric payoff function of product development reminds us that we need variability to create value and short feedback loops to cut off wrong paths as soon as possible.

We consider 4 different perspectives to assess value:

Increase Revenue

Protect Revenue

Reduce Cost

Avoid Cost

Cost of Delay is an alternative way to assess the economic impact of the inventory of design in progress.

CD3 is a prioritisation algorithm for work to do with different urgency, value and duration.

Cost of delay can be obtained by dividing value by duration.

This blog post is in some way an extract of the ideas developed by @JoshuaJames and Donald Reinertsen.

An interesting opportunity showed up a few weeks ago, just after I returned from my paternity leave. My brain was still stuck and asleep when my co-workers presented me with the latest “Death or Death” project. They told me we had only 4 months to deliver new software before the Nothing*.

*Thanks to Michael Ende for describing the life of most software developers in this world.

After some workshops that we held to create a common understanding of the problem, we discussed our critical situation and decided to promote our agile principles and values even more. We reinforced the following ideas:

Delivering the highest quality software iteratively

Building and designing it incrementally

Communicating constantly with customer

Focusing on customer needs

Thus, we planned to use the following techniques and tools in our latest *cough* project:

The intent of this blog isn’t to create a prescriptive recipe for your software development process but to provide insights into the reasons for using any specific technique.

#impact mapping
Although this tool is designed for strategic planning and is very useful for managing projects in the long run, we decided to use it to help the team visualise the product backlog as a report (information radiator) in the short term. This simplified version of the diagram seeks to answer the following questions at first sight:

What the customer wants in plain text

How to provide value, in User Story format: As a <role>, I want <goal> so that <benefit>

Sometimes the “how” item needs to be broken down into smaller pieces, also in User Story format, to give even more detail about the content of the How node or Epic.

#MobProgramming
This technique, which I discovered thanks to Woody Zuill’s promotion of it, consists of one team, one active keyboard and one projector.
As they put it: it’s just like doing full-team pair programming.
There are two roles involved: navigators, who discuss, think, design and guide, and the driver, who is in charge of writing the code that the navigators dictate. The driver role rotates every 15 minutes.
The team’s feedback after two weeks is terrific. They highlighted the following emergent behaviours:

Alignment: The whole team took part in the architecture. In fact, all team members coded and designed the emergent architecture.

The whole team defined a solid foundation for coding standards and code quality rules.

A more meaningful definition of DONE.

They learnt a lot from each other, especially the programmers from the senior developers.

#BDD
Some time ago I told someone that the book Specification by Example by Gojko Adzic had radically changed the way I understood software development. Although I don’t have much experience with it (only 2 projects), it has become an irreplaceable tool for any project that I work on.

As an Agile coach I try to encourage the team to practise BDD and follow these rules:

Technology changes but the domain remains. We avoid testing the presentation layer and make an effort to test only our business services. The presentation layer is either left to exploratory testing or covered with an automation tool if it is worth investing in it.

Use a ubiquitous domain language. This formal language must be shared by all members of the software development team, both software developers and non-technical team members.

Test the real system. The Continuous Integration machine provides daily feedback about the health of the system. We are especially interested in performance issues or integration problems with external components. We encourage integrating external components as soon as possible, or mocking them until a stable version is ready to integrate.

Embrace BDD refactoring. Re-read and rewrite your tests many times in order to minimise misunderstanding or ambiguity and to spot inadequate feature or scenario definitions. The purpose of the scenario is to describe what the system has to do. Likewise, the feature describes the acceptance test. We use the standard agile User Story format to define it.

As I write this post, I wonder about the possibility of creating a specification quality control policy: a checklist for the early identification of common issues in specifications that would help reduce rework.

#New Product Development
Although I have got excellent results following the Scrum framework over the years, I have become progressively more interested in Lean Software Development, Kanban and New Product Development, and less interested in estimations, the way middle and upper management use them, the conflicts they provoke and their relationship with team commitment. Thus, our current software development process is a mix of ideas from different sources and frameworks.

Once every two weeks we release a new version and facilitate a review for our stakeholders. This cadence reduces the team’s coordination cost and creates a sense of urgency to deliver value as soon as possible and receive feedback from stakeholders. Besides, we have also limited work in progress (WIP) in order to provide enough flexibility and adaptability to variability. I took the chance to facilitate a Systems Thinking analysis meeting to create awareness of the potential effects of modifying the WIP limit and of how to increase and decrease it. So far the team has managed this WIP limit internally and responsibly, adapting it to the continuously changing context.

Although our definition of Done includes a statement that says “0 bugs”, some bugs have shown up and we decided to create a queue for them. The queue is very small (only 6 bugs) and the team usually reacts very quickly to keep it under control. One of the interesting effects of limiting WIP is that developers are able to deal with variability (bugs) faster than other teams that I have led in the past.

#Visual Management
The first blog entry described how we are going to use our cycle time and lead time metrics, but it only mentioned cumulative flow diagrams. Now it’s time to explain the usage of this metric in more detail.
This metric helps track and monitor how user stories move through the various stages of the process until they are “done”.

From a cumulative flow diagram we can see:

Where the bottlenecks are in our flow. Based on the Theory of Constraints introduced by Eli Goldratt, we must exploit the bottleneck, optimizing the throughput of the system by adding more capacity or changing how the system performs. Thus, if we apply continuous improvement frameworks like PDCA or Build -> Measure -> Learn, we can easily evaluate whether our policies are improving the flow.

Whether demand is seasonal, so that we can take steps to adjust capacity.

Whether we are delivering value at the end of the process and how to improve it (global view of the process).

Finally, we use a release burndown chart to show the team’s progress against the product backlog. The diagram is updated every week.

Jidoka is the Japanese term used by Lean practitioners for stopping the production line when workers on the factory floor discover a defect. A retrospective is a delayed form of Jidoka which aims at evaluating the situation and bringing up continuous improvement actions. Such actions must focus on the components that interact to deliver a software product that is valuable for your customers: product (software quality), people and processes.

Sometimes, poor variability reduction policies led by the team and management layers (including Scrum Masters and Agile Coaches) cause problems such as poor technical skills, ambiguous requirements, poor requirement specifications or scope. Such problems put teams under stress and excess pressure, which must be addressed even before the retrospective takes place.

Explicit rules

One of the Kanban foundations for leading the continuous improvement movement is to design explicit process rules which aim to improve flow and reduce risks. As an Agile Coach I have worked together with the team to design a process for the early detection of bottlenecks and to “STOP THE LINE” when required.

User Stories

Although it is hard to split epics into small user stories of the same size, we try to break user stories into vertical slices of the product. In order to provide value as quickly as possible, we ask ourselves the following question:

“What can the team undertake in only one day?”

Bottlenecks, Cycle Time and Traffic Lights

We currently have a visual management metric to measure cycle time and to help us detect bottlenecks.

*Notice that the team manually updates the diagram, recording the amount of time it took them to complete each user story. Meanwhile, the agile coach also updates the digital version, which automatically collects valuable metrics like the average cycle time, its trend or the standard deviation.

**The horizontal axis depicts the timeline and the vertical axis shows the time consumed by each user story.

At the bottom of the diagram, there’s a green area (up to 2 days of cycle time) that indicates that our process is healthy and it’s flowing. No additional action is required.

The yellow area (3rd day of cycle time) triggers an optional step: we open a new topic just after the stand-up meeting, and the team works together to identify early actions to remove the bottleneck.

At the top of the diagram (more than 4 days of cycle time) we are forced to evaluate the current situation of the bottleneck on a daily basis after the stand-up meeting. We must focus our attention on the whole process, resource availability and people capacity, work in progress and pending actions. This topic usually takes 10 to 15 minutes.
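The traffic-light rule and the metrics collected by the digital version can be sketched as follows. Everything here is illustrative: the function name `traffic_light` and the sample cycle times are invented, and the description above does not say which zone the 4th day belongs to, so this sketch assumes it is still yellow.

```python
from statistics import mean, stdev


def traffic_light(cycle_time_days: float) -> str:
    """Map a user story's cycle time to a zone of the diagram."""
    if cycle_time_days <= 2:
        return "green"   # healthy flow, no action required
    if cycle_time_days <= 4:
        return "yellow"  # assumption: day 4 is treated like day 3
    return "red"         # more than 4 days: daily review of the bottleneck


# Hypothetical cycle times, in days, of recently finished user stories.
cycle_times = [1, 2, 2, 3, 5, 1, 4]
lights = [traffic_light(days) for days in cycle_times]

print(lights)
print(f"average cycle time: {mean(cycle_times):.2f} days")
print(f"standard deviation: {stdev(cycle_times):.2f} days")
```

`statistics.stdev` computes the sample standard deviation; `statistics.pstdev` would be the choice if the list is considered the whole population of stories.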

*Because cycle time is a lagging indicator, we are looking into replacing the cycle time report with the cumulative flow diagram. It’s a tool used in queuing theory that depicts the quantity of work in a given state, showing work in progress, time in queue and departures.
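The data behind a cumulative flow diagram is just a count of work items per state per day. The sketch below shows one possible way to derive it from daily board snapshots; the state names, the story identifiers and the `cumulative_flow` function are all hypothetical and do not describe the tool we actually use.

```python
from collections import Counter

STATES = ["to do", "in progress", "done"]  # hypothetical board columns


def cumulative_flow(snapshots: dict[str, dict[str, str]]) -> dict[str, dict[str, int]]:
    """For each day, count how many user stories sit in each state."""
    flow = {}
    for day, board in snapshots.items():
        counts = Counter(board.values())
        flow[day] = {state: counts.get(state, 0) for state in STATES}
    return flow


# Hypothetical daily snapshots: story id -> state on that day.
snapshots = {
    "day 1": {"US-1": "in progress", "US-2": "to do", "US-3": "to do"},
    "day 2": {"US-1": "done", "US-2": "in progress", "US-3": "to do"},
    "day 3": {"US-1": "done", "US-2": "in progress", "US-3": "in progress"},
}

flow = cumulative_flow(snapshots)
for day, counts in flow.items():
    print(day, counts)
```

Plotting these counts as stacked areas over time gives the diagram: a widening “in progress” band is an early sign of a bottleneck, well before the cycle time of the affected stories is known.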