Follow Streamline Health's Innovations ALM Practices

Streamline Health develops software solutions that support revenue cycle optimization for healthcare enterprises. Streamline’s Innovations department is responsible for creating and maintaining our software products, from requirements gathering through to delivery of releasable code to our client services team. This blog is presented by members of the Innovations team to share our application lifecycle best practices, challenges, and the insights we gain as we grow.

In January 2015, we implemented the Scaled Agile Framework (SAFe) to manage our five separate software products and development teams. This included formal scrum training for all of our development team members, whether or not they were already “doing” scrum. SAFe helped us normalize our release planning and code delivery schedule by synchronizing our scrum teams’ iteration cycles. At the enterprise level, SAFe helped us bring diverse development efforts into focus so that resources could be better aligned with corporate objectives. Since that launch we have continued to refine our practices and process.

In 2017, we’re kicking off ongoing ALM training. Our goals are to refresh people across our organization about our best practices, to bring new team members up to speed, and to engage with everyone in the continuous improvement process. Follow along with us as we teach, learn, and improve.

Software capitalization allows companies to defer recognition of the costs of development by recording them as a long-term asset. Those costs are then recognized as the company actually generates revenue from the software.

Think of it like constructing a building. The builder incurs the cost of design and construction up front – these are called “capital costs.” The capital costs are amortized across the useful life of the building, starting from when it becomes available for use. In this example, the start of availability is most likely represented by receipt of a “Certificate of Occupancy” from local regulators.

In software, the “occupancy” begins with the issuance of a notice of “General Availability” by the developing organization. This means that the company is done testing the software and has made it available for clients to use.

Capitalizing software is an old accounting concept, predating iterative agile processes. Back in the days of “big design up front,” the start of development costs was clear: When the requirements and design were “baselined,” the “real” work began. A notice of “Technical Feasibility” would be published that included the approved requirements and design. Salary and other costs incurred from that point up to the notice of General Availability could be capitalized.

How We Do It

Here at Streamline Health, we’ve adapted our processes to support this “Technical Feasibility” to “General Availability” cycle, all without falling back on big requirements and design.

We currently have five software products under development at Streamline. Some are mature products that aren’t being actively enhanced but are being maintained; others are being actively enhanced with new features. Each product is an “approved” development project, and a version of each product had previously been made generally available to clients before we formalized our agile processes and our software capitalization. We have assigned a useful lifespan to each product, ranging from three to five years depending on our market analysis and plans for the specific product. The amortization will occur over this planned lifespan.

We use four-sprint planning increments to manage high level roadmaps of features. Before each planning increment (during the fourth sprint of the previous planning increment) our product managers present the roadmap for the next planning increment to both the scrum teams and our executive team. The teams provide technical and content feedback on the managers’ plans and also come up with very high level size estimates so that we can be certain we have enough capacity to do the work. The executives review the plan in terms of sales and marketing needs and costs. The executives might re-prioritize the work to deliver one feature sooner than another, or they might bring something in from a parking lot of other ideas to replace a feature that’s gone out of favor. The result of this session with the executives is a formal technical feasibility statement for each product. The statement describes each of the features to be built at a relatively high level, and specifies that these features are to be incorporated into the previously approved product.

For the next four sprints, our scrum teams work on user stories derived from these features. They report their time as “enhancement” on the product. We do not ask them to report time for specific stories or even parent features. This would be onerous for both the team members and for managers who would have to create time entry “buckets” for every story or feature in our time tracking tool. They report time spent fixing bugs and on other maintenance work as “maintenance.” This time is not capitalized.

Feature scope is negotiable, as long as the completed work meets the high level description and acceptance criteria of the feature. Teams decompose the feature into stories, and can negotiate with the product manager on the details in order to complete the high level goal within the four-sprint planning increment.

Every user story that is a child of an approved feature has business value assigned. When a feature is deemed to be done, the business values of all child stories are totaled and recorded for the feature. At Streamline we’ve gone to some effort to calibrate our business value across all of our products and product managers so that it’s a meaningful measure of how much our teams have delivered.

At the end of the planning increment we release a new version of each product, and the product managers distribute notices of general availability that list each of the completed approved features. In fact, for some of our products we release more frequently, and for others we do not release even this often, depending on what’s been done and client demand. Salary and other qualified costs will continue to be collected on a product until a general availability notice is issued.

The Capitalization Event

Next, finance matches up the features in each general availability notice with the features in previous technical feasibility statements to ensure that the teams have been working on approved features. They use the business value of each completed feature to allocate the enhancement costs incurred during the planning increment. So a feature that delivered 900 points of business value will be allocated double the cost of one that delivered 450 points.
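The business-value allocation described above is simple proportional arithmetic. Here is a minimal sketch of that calculation; the feature names and the cost figure are hypothetical examples, not actual Streamline data:

```python
# Allocate capitalizable enhancement costs across completed features
# in proportion to delivered business value (hypothetical figures).
def allocate_costs(total_cost, features):
    """features: dict of feature name -> delivered business value points."""
    total_value = sum(features.values())
    return {name: total_cost * value / total_value
            for name, value in features.items()}

increment_cost = 120_000  # capitalizable cost for the planning increment
completed = {"Feature A": 900, "Feature B": 450}  # business value points

allocation = allocate_costs(increment_cost, completed)
# Feature A (900 points) is allocated double the cost of Feature B (450):
# {"Feature A": 80000.0, "Feature B": 40000.0}
```

With this approach, finance never needs per-story time entries; the total enhancement cost for the increment is the only time-derived input.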

This use of business value as a “proxy” allows us to reduce time-reporting angst in our development organization, and it has survived independent audit.

Exceptions

The cost of maintaining our software can’t be capitalized, which is why we ask our teams to report this time separately. Maintenance includes fixing defects found in production versions and also keeping our products up to date with external products and integrated tools. Making changes for compatibility with an operating system upgrade, for example, is not considered “enhancing” the product, but is work that must be done.

Are You Capitalizing Development Costs?

Have you found a way to capitalize your software development costs without compromising the guiding principles of an agile framework? Feel free to share your experiences in the comments.

A long-term approach to work that systematically seeks to achieve small,
incremental changes in processes in order to improve efficiency and quality.
–Margaret Rouse, WhatIs.com

At regular intervals the team reflects on how to become more effective, then tunes and adjusts its behavior accordingly.
— Agile Manifesto

The Scaled Agile Framework (SAFe) and other agile strategies all emphasize continuous improvement, and at Streamline Health we take scrum team retrospectives very seriously. But it’s useful to take a look at the root guiding principle behind these agile strategies: Kaizen.

Western manufacturers had been adopting the principles of Kaizen since the 1980s, and interest surged in the early 2000s when Jeffrey K. Liker published The Toyota Way. To software developers who had signed on to the principles of the agile manifesto, Kaizen’s principles of lean continuous improvement were a perfect fit.

Kaizen culture can be summarized in a few bullets:

- All levels of the organization are involved in continuous improvement.
- Participation by all levels is essential.
- Draws upon varied talents and skills – the more diverse the better.
- Peer-to-peer accountability.
- Self-evaluation at the team level.
- Self-evaluation at the event level.

It sounds simple, but for many organizations it represents a difficult release of managerial control. As recently as last month, a director in our company, observing a team planning their sprint work, asked if I would step in to resolve conflicts about how they will do the work. The observer was a little surprised when I said absolutely not, they have to work it out. Our teams are generally quite successful, so the observer couldn’t argue that this approach doesn’t work.

Software development sprints are a series of “Kaizen events.” Improvement goals are set at the beginning. At the end, the team reviews its success at improving from the previous sprint and develops plans for improvement in the next sprint. During the next sprint the team implements the improvements they agreed upon.

Iterative Development

In addition to being the root of agile’s process improvement principle, the Kaizen approach of continually making small improvements applies to agile iterative development. Each sprint contributes a few small, valuable, demonstrable changes to the application. If they’re good, they’re kept; if they aren’t, they’re modified in a future sprint.

Continuous Integration

Kaizen’s principle of peer-to-peer management governs our continuous integration strategy. Everyone is responsible for a clean build, for storing source code correctly, and for organizing test plans and automating testing. Everyone is responsible for maintaining automated build and deploy processes. Code reviews, design reviews, and test plan reviews all draw upon the diverse skills and experience of all team members to improve the quality of their code.

Nobody’s Perfect

The notion that your team members, teams, processes, and products are unlikely to ever be perfect might be difficult for some people to accept. Traditional managers like to expect perfection, even though they rarely see it. But accepting imperfection and making it an opportunity for collaboration and growth is the very essence of Kaizen. Organizations that focus on collaborative improvement tend to see that same collaboration across other work. Teams that accept responsibility for maintaining and improving their process tend to be more successful at regularly delivering working code.

How Are You Doing?

How are you doing at continuous improvement? Do your teams identify and work on process problems as part of their regular work cycle?

Quality assurance does not start with testing working software. Taking steps to build quality into the entire software development process predates iterative and agile frameworks; however, these frameworks support it more effectively than older phased approaches.

Requirements

Building in quality begins with the requirements. There are countless techniques for requirements elicitation and analysis, all of which can be executed well and also executed badly. Knowing what real world problem your software is going to solve is, surprisingly, often the most difficult step. The temptation to create clever software because you can is high, especially for talented engineers. But from a buyer’s perspective, the cleverest app in the world is useless if it doesn’t solve her problem.

Once the problem is understood and decomposed into user stories, keep focus on making every story meet INVEST (Independent, Negotiable, Valuable, Estimable, Small, Testable). During refinement, the team should consider anti-cases and boundary values and look for missing requirements.

Decomposition

When the sprint begins, task decomposition must include all testing activities. Typically a team specifies separate tasks for defining the test plan and executing the tests. Creation of unit tests should be required by the team’s definition of done, so tasks on the board aren’t necessary. Creating automated tests might also be a separate task if the team first executes the plan manually to validate the tests.

Teamwork

Testing is not just the job of “QA.” In fact, a team that thinks of itself as separate teams of “engineers” and “testers” is not a healthy agile team. Certainly every team has coding experts and testing experts, as well as UI experts and database experts and other types of specialists. However, everyone and anyone can execute tests, and even devise test cases. The team’s definition of done should include testing best practices that everyone follows when they pick up test tasks. One old rule that should still obtain is “if you wrote it, someone else tests it.” Teams that practice test-driven development exemplify this, but any team of more than one person can uphold this collaborative rule.

Reviews

Design meetings and code reviews should include testing specialists who look for testability and help verify that the plan meets the requirements. The more testers understand about the underlying design of the code, the better their test scenarios will be, particularly around regression testing. Another huge advantage to these meetings is the conversation itself. The more the team members talk about the requirements, the better each individual will understand them. Engineers do not tend to think like end users no matter how hard they try, so their impulses and priorities while coding can run counter to what will resonate with the users.

Test Early and Often

The earlier you catch a defect, the less expensive it is to correct. That’s why unit tests are critical. They are relatively inexpensive to create and execute since they are automated to execute every time the code is compiled. They assume the burden of making sure that each method is working as desired.
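As an illustration of the idea, here is a minimal unit test sketch; the `apply_discount` function and its rules are hypothetical examples, not from any Streamline product. Each test pins down one behavior, including boundary values, so a regression is caught the moment the suite runs:

```python
import unittest

def apply_discount(price, percent):
    """Return price reduced by percent; negative inputs are invalid."""
    if price < 0 or not (0 <= percent <= 100):
        raise ValueError("invalid price or discount")
    return round(price * (100 - percent) / 100, 2)

class ApplyDiscountTest(unittest.TestCase):
    def test_typical_value(self):
        self.assertEqual(apply_discount(200.0, 25), 150.0)

    def test_boundary_values(self):
        # 0% and 100% are the edges of the valid range.
        self.assertEqual(apply_discount(100.0, 0), 100.0)
        self.assertEqual(apply_discount(100.0, 100), 0.0)

    def test_rejects_invalid_input(self):
        with self.assertRaises(ValueError):
            apply_discount(-1.0, 10)
```

Run with `python -m unittest` – or, better, wire the suite into the build so it executes on every compile.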

System or integration testing can also be automated and triggered to run after each code compile. These tests make sure the methods work correctly together and with other components like databases and APIs. System tests are more expensive to create and maintain than unit tests, but like them, they find low level defects that can be corrected before the really expensive functional testing begins.

Let the Machine Rule

Most modern version control and build systems offer the option of “gated check-in.” Code that fails the rules of the gate is rejected. The gate can be as challenging as the team wants. For example, it could reject code that does not have 100% unit test code coverage. It could roll back the build when any system test fails. It could even execute automated functional tests and fail the build for test failure. The rules that a team adopts to ensure quality code should be documented in its definition of done.
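The gate logic itself is just a set of pass/fail rules over build results. A minimal sketch, assuming hypothetical thresholds and a simplified build-report structure (a real gate would pull these numbers from the build server’s reports):

```python
# Minimal sketch of a check-in gate: reject the change unless every
# rule the team has adopted passes. Thresholds are hypothetical and
# should mirror the team's definition of done.
def evaluate_gate(build):
    failures = []
    if not build["compiled"]:
        failures.append("compilation failed")
    if build["unit_test_failures"] > 0:
        failures.append("unit tests failed")
    if build["coverage_percent"] < 100:          # team's coverage rule
        failures.append("unit test coverage below 100%")
    if build["system_test_failures"] > 0:
        failures.append("system tests failed")
    return (len(failures) == 0, failures)

ok, reasons = evaluate_gate({
    "compiled": True,
    "unit_test_failures": 0,
    "coverage_percent": 97,
    "system_test_failures": 0,
})
# ok is False; reasons == ["unit test coverage below 100%"]
```

The point is that every rule is explicit and mechanical – nobody has to argue about whether a check-in was “good enough.”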

Validate and Verify

Even if the code passes unit and system testing, and possibly automated regression testing, it still might not be right. Functional testing verifies that the software complies with the requirements. Even after all those refinement sessions and discussion during the sprint, sometimes the coders misunderstand the intent of the requirements. If they believe they understand, then their unit tests will test for what they believe and pass.

“Oh, you meant negatives should be red and have a negative sign? But that’s not logical, they cancel each other out.”

Test Design

Coders build. Testers break. That’s why every team needs both. During the same design meeting while the coders are brainstorming how to build the software, the testers are thinking about the vulnerabilities of the design that they’re going to test for. The more they understand about the system design, the better they’ll be able to push its delicate buttons. They’ll test for boundary values and edge cases, and, in thinking like the end users, will often throw scenarios at the code that the coders did not consider. This “white box” or “clear box” testing is critical to building in quality. Tests of the user interface and surface functionality – “black box” testing — while valuable, are less effective at truly hardening the code and can be left to the final stage before release.

Hardening

When the code has passed through unit, system, and functional testing it’s deemed to be “working” code. Technically, it’s releasable – at least to a demo environment. But to be production ready it needs to be “hardened.” In some cases the scrum team can accomplish this during each sprint, but in others, especially in scaled agile settings where the work of several teams must be integrated, final hardening has to be handled in a special hardening sprint, or after sprints by an integration team.

Hardening includes formal regression testing and performance testing. Regression testing should be as automated as possible, which means that someone has to continually update the automated regression tests to account for new and changed functionality. There is a real tradeoff between having people execute tests manually and having them spend the same time creating, revising, executing, and analyzing the results of automated tests. In other words, automated testing is not faster. It is, however, more predictable and reliable. It also requires a more technical quality assurance team – one that can write test code.

Whether automated or manual, regression testing ensures that the new parts of the software work with what was already there before the last sprint.

Load and performance testing also require specific expertise as well as tooling. Performance requirements like response times should be part of the team’s definition of done to ensure that no new code reduces performance. Some performance testing should be part of every sprint. But as with regression, testing the entire system once it’s all integrated in order to create new benchmarks is critical. What benchmarks are measured depends on the product, but they may include response times for certain transactions as well as concurrent multi-user activity. Simulating hundreds of users performing similar actions at the same time requires specialized testing tools.

Metrics

What metrics you collect depends on your organization’s reporting requirements. A well-integrated ALM tool suite should provide you with most of the data you need to measure your teams’ performance and identify areas for improvement.

A high sprint bug volume – that is, bugs created during the sprint and, hopefully, fixed before it ends – suggests that the coders don’t understand the requirements. Sprint bugs that don’t get fixed during the sprint represent accumulating technical debt. If you see this happening, it’s time to stop and focus on fixing the accumulated bugs, but more importantly to focus on finding and fixing the root cause. Knowing not only who created the bugs but also who wrote the defective code is critical metadata that your process must collect.

Build failure rate is useful to evaluate the team’s engineering quality. While pinpointing individuals is not desirable in a team-centric framework, the team should always be able to know who broke the build so that it can take corrective action. A mature team will automatically assign a co-programmer to help the offender improve her coding practices. A team with a high build failure rate clearly has some internal problems to address.

Depending on the maturity of your code, metrics like unit test code coverage and automated functional test coverage might be of value, primarily if you are focused on increasing these. Even if your teams run at near 100% coverage, checking the metrics on this is a good way to make sure you maintain that level of quality.

Your sales and marketing teams, and your clients, will be most interested in performance metrics. Remember to adjust what you measure as your product evolves, and focus on the scenarios that are of the most interest to your users. Also remember to include all of the steps that your users experience. Even if the UI is the slowest performing part of the system, if you omit it from your metrics you’re not reproducing real user experience.

Tie it Together

Building in quality isn’t automatic. It’s the confluence of deliberate best practices, a quality-driven culture, balanced team composition, and continuous improvement based on fair metrics and self-evaluation. You need quality evangelists and open-minded engineers. And you need many hands to help accomplish both building the code and trying to break it.

Some organizations practicing agile simply maintain a prioritized backlog and have their teams pick up each item as they have bandwidth. The product owner and the team know the team’s velocity and therefore have an idea of how many items they’ll complete. At Streamline Health we take a more structured approach for two reasons:

When teams are just forming and adopting scrum, clear goals and commitments make it easier to form disciplined habits. Once this practice is in place, there’s little need to change it.

Sometimes we must make commitments to clients, so we have to know for sure whether an item will be done in the next sprint.

Therefore, at Streamline, our product owners package each sprint and our teams commit to delivering those packages. Let’s dive into what that means.

Planning Increment Planning

We first contemplate sprint packages during our quarterly planning increment (PI) planning session. The primary purpose of these sessions is to revisit and adjust the overall direction of each of our products (what the Scaled Agile Framework calls “value streams”). We market four distinct products that are at different points in their lifecycles. At PI Planning, our executive stakeholders have the opportunity to direct more or fewer teams toward one or another product – say, to accelerate feature development for the newest product that sales is pushing hard, while reducing enhancements on one or more older ones.

We move work, not people, so when one product needs more attention we package that work in the sprints of more than one scrum team. The more intense the push on that product, the more team capacity we allocate, even to the point of putting other products on the shelf.

Even a shelved product has users, though, so as we package new functionality into sprints during PI planning we still reserve some team bandwidth to address emergent client issues on them.

By the end of our two half-days of PI planning, we should have the stories ready for the first sprint in the planning increment. They should be decomposed, refined, conditions of acceptance agreed upon, and estimates made by the team that will do the work. When we have multiple teams working on the same product we often do joint estimation sessions, getting an estimate that all of the teams accept.

We should also have a good idea of what stories will be in the second sprint, some with estimates. And we have an idea of the remaining work in the overall feature-level goals for the planning increment. The teams are asked for a vote of confidence on completing the features in the planning increment. More on that later.

What Could Possibly Go Wrong?

With the roadmap laid out during PI planning, why don’t things go like clockwork for at least the first two sprints? Because priorities change. And agile practices were specifically devised to accommodate these changes.

Sales might announce with great pleasure that a new client is about to sign a lucrative contract as long as a specific feature is developed within the next four months. Assuming the feature they’re after is in keeping with the overall direction of the product, no product manager could deny this opportunity to boost sales. There goes the plan!

Client Services or Support reports that a couple of clients have started escalating their frustration with a bug that you thought wasn’t that critical. You’re at risk of losing client satisfaction if you don’t fit that bug fix into the next sprint.

Midway through the first sprint of the planning increment, the architects realize they missed something. In order for the new functionality to be scalable and meet performance requirements you’ve got to package some architecture work to harden the backbone. The sweetest features are useless if the system doesn’t perform as expected for the end users.

Marketing comes back from a conference with amazing insights. If we add just enough of these new technologies to the product so that we can say we’re doing it, we’ll get ahead of the competition. This kind of insight might not disrupt the sprint packages in the current planning increment, but you might need to do research spikes to prepare stories for the next PI planning session, and that will steal bandwidth from your current plan.

Factors in Creating a Sprint Package

The product owner uses several data points as well as input from stakeholders to pick which stories to package. The first concern is team velocity: how much work can this team typically complete? Velocity is the average number of story points that they’ve finished over the previous few sprints, omitting outliers.

No single item in a sprint should be more than 50 percent of team velocity. So if a team’s velocity is 27, the largest story they should undertake is one with an estimate of 13 points. They should never be given two 13-pointers. That’s the equivalent of putting all their eggs in one basket. Their package should be one 13-pointer and two or three smaller items that give them room to negotiate if they discover a bad estimate. The smaller the user story, the more accurate the team’s estimate. It would be better to decompose stories so that you never even have any that are half the team’s velocity, but sometimes that’s very difficult while retaining adequate business value to justify the effort.
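These packaging rules are mechanical enough to check with a few lines of code. A sketch, assuming hypothetical story sizes and treating “near the 50 percent cap” as within 80 percent of it:

```python
# Validate a candidate sprint package against the packaging rules:
# total within velocity, no item over half of velocity, and at most
# one item near that upper limit (no "two 13-pointers").
def validate_package(velocity, story_points):
    problems = []
    cap = velocity * 0.5
    oversized = [p for p in story_points if p > cap]
    if oversized:
        problems.append(f"items over 50% of velocity: {oversized}")
    near_cap = [p for p in story_points if cap * 0.8 <= p <= cap]
    if len(near_cap) > 1:
        problems.append("more than one large item (eggs in one basket)")
    if sum(story_points) > velocity:
        problems.append("package exceeds team velocity")
    return problems  # empty list means the package passes

# A velocity-27 team: one 13-pointer plus smaller items is fine...
assert validate_package(27, [13, 5, 5, 3]) == []
# ...but two 13-pointers violate the rules.
assert validate_package(27, [13, 13]) != []
```

The product owner still exercises judgment on which stories go in; the rules just flag packages that put the sprint at risk.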

Items must be independent so that some team members can work on one while others work on another and they won’t step on each other’s code. When multiple teams work on the same product this is even more critical. This can be especially tricky if you address the 50 percent of velocity problem by decomposing a larger story into two or three smaller ones, then package them in the same sprint for the same team. Did you really make them independent? Or did you just divide up the list of acceptance criteria into three and call them different stories? Lazy requirements management has been the downfall of many a sprint.

Team and individual skills and knowledge are the third packaging factor. If one team has a database guru and the other does not, push the data-heavy story to that team. If you’ve only got one team and nobody is great with user experience, you may have to bring in some expertise to work with the team, which will reduce their velocity as they learn from the expert.

Hold about ten percent of bandwidth in reserve. Note that your velocity already accounts for this: if velocity is 30, you don’t reserve ten percent of that and package only 27 points – the 30-point average was accomplished with ten percent of bandwidth already in reserve.

But why not consume every possible hour the team has available? Isn’t that wasteful? The effectiveness of Lean practices is widely recognized, and reserving some bandwidth for the unexpected is a Lean principle. In order for a Lean scrum implementation to work, you must trust your team. They aren’t a bunch of lay-abouts trying to avoid work. They’re talented engineers who want to produce a quality product. If they really aren’t, then you’ve got the wrong people, but that’s a different blog post.

During each sprint the team may discover the ability to fix legacy bugs that weren’t packaged. They’re working on a certain part of the code and they can improve it, but it will take more time than they originally planned. That’s an extremely good use of that ten percent you reserved.

Emergencies happen: team members are pulled onto emergent client issues; an additional item “must get done” in this sprint (a bad practice, but sometimes unavoidable); the team mis-estimated, so something is much larger than planned; multiple team members get the flu.

“Bench” is not a dirty word in Lean practices, and the time will not go to waste. Even if none of the things listed above happen, team members will use the time productively – on additional testing, on research for upcoming work, on more general learning, on cross-training in areas of the code they don’t yet know. And if they find time for a couple of rounds of ping pong, that has value too.

Priority Within the Sprint

In that simple model I mentioned at the top, the stack-ranked prioritization of the backlog is key, but when you work with sprint packages, the product owner is less focused on the individual priority of each item in the package. They want it all done. However, the team can use individual item priority to help them plan their sprint.

Since the goal of an agile process is to deliver business value, the priority of the items in a sprint should be based on their business value. However, bugs don’t have business value (you already got credit for the value of the feature even though it had the bug, so you don’t get more credit for fixing it). And sometimes fixing a bug is the most important thing to do in a sprint.

Streamline uses Microsoft Team Foundation Server to manage all scrum work, and the form for bugs (rightly) does not include a business value field. So we use the Backlog Priority field, available on both bugs and stories, to prioritize the items in each sprint.

Sprint Goals

The product owner specifies two or three key deliverables for each sprint that represent its “essence.” These goals can relate to just some, or to all of the items in the sprint package. If there are items in the package that don’t fulfill the goals, they are, by definition, lower priority and the product owner’s stack ranking of the items should reflect that. Successfully meeting the sprint goals, as judged by the product owner, means the sprint is a success, even if the team doesn’t complete all of the packaged stories and bugs. This flies in the face of our metrics, though, which are based on pure data as I’ll discuss in a moment.

Commitment

At the end of PI planning the teams are asked for a “Fist to Five” confidence vote, fist being “no confidence” and five fingers being “confident.” If the teams vote lower than fours and fives, the product owner leads a discussion of the concerns, and it might lead to some further decomposition of the planned features or a reduction in planned scope.

Managing the Package During the Sprint

After a confident kickoff the team plans the sprint by breaking down the stories and bugs into one- and two-day tasks. When this is done, if the total estimate, in hours, for the tasks is less than the team’s available time, they’re golden. But what if their task estimates add up to more time than they expect to have?

First they review the items in the package. Have they misunderstood anything? Probably not. Most likely as they planned they identified more complexity than they saw during backlog grooming and estimation. It happens, and that’s why there are so many checkpoints in the process.

Next, the team should re-estimate the problem story – maybe it was estimated at 8 points, but now they think it’s a 13. The team goes back to the product owner to negotiate. They can reduce the scope of one or more items by removing some conditions of acceptance, then re-estimate them to see if they’re back within velocity. Or they can remove one or more items from the sprint. What gets removed is the product owner’s decision. Whether it’s enough is the team’s.
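The capacity check that triggers this negotiation is a straightforward comparison of task hours against available hours. A sketch with hypothetical numbers (six team members, a two-week sprint, roughly six focused hours per day):

```python
# Compare task estimates against the team's available hours for the
# sprint; a shortfall triggers negotiation with the product owner.
def capacity_check(available_hours, task_hours):
    total = sum(task_hours)
    surplus = available_hours - total
    return {"total_estimated": total,
            "fits": surplus >= 0,
            "hours_over": max(0, -surplus)}

# 6 team members x 10 working days x ~6 focused hours/day = 360 hours.
result = capacity_check(360, [16, 12, 8, 40, 24, 32, 80, 64, 48, 56])
# total_estimated == 380, fits is False, hours_over == 20 ->
# time to re-estimate or negotiate scope with the product owner.
```

The arithmetic is trivial; what matters is that the team does it before committing, not halfway through the sprint.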

After negotiating, the team does another commitment vote and, hopefully, digs in on the work.

Success (Failure)

What happens if a team doesn’t finish everything?

The incomplete items are returned to the backlog. They may or may not go into the next sprint – that’s up to the product owner. If something is partially completed, the team is responsible for making the code whole – they might have to shelve changes that are incomplete or untested so that the completed code can be delivered. The team must re-estimate any items that they’ve done some work on, so that their story point estimate reflects the remaining work. Sometimes the estimate won’t change, meaning that the item was underestimated at the start and the team didn’t catch it early on. That’s a sign of an immature team, and something to be addressed in the retrospective.

Metrics

Traditional project management methodologies run on lots of metrics by collecting lots of data – timesheets, bug tracking, test execution, and so on. At Streamline, the first principle of any metrics around scrum is that we don’t go deeper than the team. We do not look at metrics on individual team members. Obviously we collect data – timesheets, metadata in Team Foundation Server, and so forth. So we are capable of analyzing an individual’s contribution should the need arise, but we do not do that as a matter of course.

Team and product metrics are what matter. Three useful metrics that we look at are:

Sprint Success – did each team finish what they committed to? This is a data-based metric, not a sprint-goals assessment, so it’s a little harsher than the more popular goals-based success. We track whether each team delivered to their commitment, over-delivered (by taking on additional items after kickoff), or under-delivered. While an under-delivery is a failure, we do show how much they did complete rather than simply calling it a zero. Sprint success is reported to the organization in a monthly round-up of metrics from all areas.

Velocity – how much did they get done? As discussed earlier, this is used by the product owner and team to determine how much to package next time. It’s data based, but adjusted to omit outliers (like that sprint when three team members got the flu and they didn’t deliver two stories). Velocity is not formally reported to the organization, but rather used by the product owner in packaging.

Business Value — Delivering value to our clients is the primary goal of agile. Business value added by product is a useful trailing indicator that you are indeed enhancing the highest-priority product. For that to work, you must have parity across all products and a baseline for each one (“product X was worth 10,000 points of business value and we added another 300 this sprint”). We’re not quite there yet with this data, but it’s in our plan.
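The data-based sprint success metric described above can be illustrated with a small sketch. The function name and the committed/delivered figures are assumptions for illustration, not our actual TFS schema:

```python
# Illustrative sketch of the data-based "sprint success" metric: compare
# what a team committed to at kickoff with what they delivered, and give
# credit for partial completion rather than calling a miss a zero.

def sprint_success(committed_points, delivered_points):
    """Classify a sprint and report percent of commitment delivered."""
    pct = round(100 * delivered_points / committed_points) if committed_points else 0
    if delivered_points > committed_points:
        label = "over-delivered"
    elif delivered_points == committed_points:
        label = "delivered"
    else:
        label = "under-delivered"
    return label, pct

print(sprint_success(30, 30))  # ('delivered', 100)
print(sprint_success(30, 24))  # ('under-delivered', 80) -- credit for partial work
```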

How Are You Doing?

How are you doing at managing sprint packages through their lifecycle? Do you track any other metrics that you find valuable? Let us know in the comments.

Team norms, practiced consistently over time, are what drive organizational culture toward better-performing teams.

Team norms are important for establishing a shared belief that the team is safe from interpersonal risk taking. Team norms express the intentions of individual team members as a collective whole. They represent an opportunity for individuals to express what is important to them and to learn what’s important to their teammates. Team norms pre-empt unforeseen situations by providing a context to proactively discuss grievances about team behavior and prevent frustrations from festering. They ultimately help establish trust among team members and optimize team performance.

At Streamline Health, when we form a scrum team, their first sprint is a “Sprint 0” during which they define their team norms. The practice may be applied more broadly to other teams to discuss and agree to norms at the beginning of an engagement. Team norms may be used as a tool to reset anti-patterns in a non-confrontational way. If people sometimes forget the norms or unintentionally violate them, having agreed to be governed by them enables team members to remind each other of the ideal behavior.

Using an analogy found in nature, team norms are the visible surface of an iceberg. They should be documented and posted somewhere for all to see and refer to in times of conflict. But as with an iceberg, there is also a large portion that lurks beneath the surface, and the surface does not move without the hidden mass below.

At a macro level, according to various sources on Scrum Alliance, good team norms empower the team to deliver value at the end of the iteration. They do not define a role for individuals, as the team is more important than the individual. They empower the team to take ownership of the work and of each other. Good team norms also save the team valuable time, forgoing discussion of things the norms already regulate and focusing on the work instead.

Norms can be at any scale, ranging from agreement about how decisions get made to basic principles about how the team communicates on a daily basis. Perhaps the earliest recognized manifestation of team norms for software development, the Agile Manifesto (c. 2001), emphasizes that while there is still value in the items on the right, the items on the left are valued more. At Streamline Health the Agile Manifesto is the genesis of many team norms.

At a recent lunch and learn associates across the organization came together to discuss team norms and how they contribute to a high performance commitment culture. We discussed specific examples of team norms that can address any aspect of the team’s functioning, such as safety, expected work hours, communication response times, or meeting attendance. Norms that address a team’s operating rhythm, communication, decision-making informed by a definition of done, and accountability can have a big impact on team cohesiveness and performance. We also discussed examples of good team norms that address individual team member’s egos, such as avoiding hidden agendas, being open minded, and admitting it when you don’t know the right answer. These are some of my own personal favorites. Team norms should not be set in stone and may evolve over time as team culture changes. Streamline teams are encouraged to review their norms early and often and to use the sprint retrospective as an organic cadence to review and refine them.

Back to our nature analogy. If team norms are the visible surface of the iceberg, what is the portion that lurks below the surface? Culture. Culture is the critical mass that is strong enough to puncture holes in titanic ships. It often lies beneath the surface. Culture is what sustains team norms over time.

In 2013, Google made this discovery: Who is on a team matters less than how the team members interact, structure their work, and view their contributions. Instead of stocking the team with individual stars, it’s more important to have empathetic team members who listen to others and can make the team greater than the sum of its parts. High-performing teams, they found, displayed five characteristics.

Psychological safety

Members feel they can be vulnerable. They know their ideas and opinions will be respected and considered, even when they conflict with those of the rest of the team.

Dependability

Members are confident their coworkers will deliver what they are supposed to when they are supposed to.

Structure and clarity

Members understand their roles and the roles of others, and the goals of the team overall.

Meaning

Members feel that what they are working on is important to them personally.

Impact

Members believe what they are doing will have a positive effect on the organization and the world.

Google sales teams with the highest level of psychological safety outperformed their revenue targets, on average, by 17%. Those with the lowest psychological safety underperformed, on average, by 19%.

Another takeaway: the effectiveness of teams that were very high in dependability was actually impeded by a lot of structure in terms of role definitions and goals. By contrast, the teams with low dependability benefitted greatly from structure and clarity. That’s a useful insight for new teams, where members don’t yet know whom they can depend on.

According to their research, by far the most important team dynamic is psychological safety — the ability to be bold and take risks without worrying that your team members will judge you.

The High Performance Commitment Model posits that if we apply the elements of cross-functionality, focus, purpose, psychological safety, trust, and transparency over time, we may achieve a high performance commitment culture. Therefore, for a high performance commitment culture, start with team norms that enable these elements.

How much wood would a woodchuck chuck if a woodchuck could chuck wood?

When planning work for an upcoming planning increment or sprint, capacity is your available bandwidth for the iteration measured in a well-understood standard unit. At Streamline we use three different units depending on the level of granularity of the requirement objects we’re planning. We touched on these units in the entry on estimation.

Practitioners of traditional project management sometimes think that agile processes are informal because they can’t get a formal estimate in hours for a months-long work plan. The PMP and the Agile Project Manager go around like this:

PMP: “How can this development team possibly know whether they can meet the client’s deadline if they don’t dig in and come up with a detailed estimate before starting?”

APM: “Have you ever seen an estimate stand unchanged throughout the duration of a project?”

PMP: “No, of course not. That’s what change management is for.”

APM: “So even if I were to commit my people to producing a detailed estimate now, you agree that it would be wrong?”

PMP: “Certainly not. I expect your people to provide an honest, accurate estimate!”

APM: “They always do, based on what they know today. Tomorrow they’ll realize they misunderstood the impact of some of the requirements. Or the requirements will change. If their initial estimate was based on high-level deliverables, we’ll still meet the deliverables while managing the details underneath.”

PMP: “I still don’t understand why you can’t just give me an estimate in hours…”

APM: “Fine. It’s eight thousand hours.”

PMP: “Thank you.”

The Agile PM knows that eight thousand hours is the typical capacity of her four scrum teams across four sprints, and the project in question has a deadline that’s five sprints out. She could have said “four sprints,” or “five hundred story points,” but the PMP only works in hours. Sometimes you’ve just got to speak your audience’s language.

Planning Increment Capacity

At Planning Increment (PI) Planning, we’re planning features, so we think of capacity in terms of sprints. A planning increment is four sprints long, and we have four scrum teams. So our PI capacity is sixteen sprints. We’re not yet ready to estimate the child stories of the features, and we haven’t even discovered all of them. At this level we estimate features in “ideal sprints” – how many sprints would this work take if it were the only thing that one team worked on? A feature that’s estimated at four sprints will absorb all of the capacity of one team for the entire planning increment.

Such single-threading is not likely to be acceptable – we always need to get some other items done too. We decompose features that are estimated at more than three sprints and push the lower priority functionality to the next planning increment. As often as not, those lower priority parts of the feature will end up being changed or dropped anyway.
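The arithmetic behind these units, including the “eight thousand hours” from the dialogue, is simple. The team size and sprint length below are illustrative assumptions, not our actual staffing:

```python
# Back-of-the-envelope sketch of PI capacity in "ideal sprints" and its
# translation to the PMP's hours. Team size (8 people) and sprint length
# (10 working days) are illustrative assumptions.

teams = 4
sprints_per_pi = 4
pi_capacity_sprints = teams * sprints_per_pi
print(pi_capacity_sprints)          # 16 ideal sprints of PI capacity

# Assume ~8 people per team, 10 working days per sprint, ~6 focused
# hours per person per day (our standard, explained below).
hours_per_team_sprint = 8 * 10 * 6  # 480 hours
print(pi_capacity_sprints * hours_per_team_sprint)  # 7680 -- in the
# neighborhood of the "eight thousand hours" the Agile PM quoted
```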

Sprint Packaging Capacity

When planning sprint packages, the capacity currency is story points, and we base our assumption about each team’s capacity on their past performance as measured by their velocity. Velocity is the team’s average story point delivery over multiple sprints. When using it to plan capacity, it is imperative to consider any special circumstances for the coming sprint. Is anyone taking an extended vacation? Do we expect any contention for team members’ time – like a new client going live? (In a perfect world, client go-live would never impact the development organization, but our world is not perfect.)

While at PI Planning we’re thinking in terms of the average team, at sprint packaging we know which team we’re working with and we know their actual velocity, which might be below or above average. We’re also using the estimates on the user stories that they provided. Across our organization, sprint packages can vary by ten to fifteen points due mainly to the teams’ variance in their estimation scales. They aren’t supposed to vary; they’re all supposed to be using “Ideal Developer Day” story points. But the reality is that they do vary, and we very consciously never compare velocities across teams.
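Velocity as used in packaging can be sketched as an average over recent sprints with obvious outliers dropped, like the flu sprint mentioned earlier. The specific outlier rule below is an assumption for illustration, not our formal policy:

```python
# Minimal sketch of velocity for packaging: mean story points over recent
# sprints, ignoring sprints far below the raw mean (e.g. half the team
# was out sick). The 50% threshold is an illustrative assumption.

def velocity(recent_points, outlier_fraction=0.5):
    """Mean of recent sprint totals, dropping sprints below a fraction
    of the raw mean."""
    raw_mean = sum(recent_points) / len(recent_points)
    kept = [p for p in recent_points if p >= raw_mean * outlier_fraction]
    return sum(kept) / len(kept)

print(velocity([32, 30, 12, 34]))  # the 12-point flu sprint is dropped -> 32.0
```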

Sprint Planning Capacity

After sprint kickoff, each team digs in and plans their tasks for the sprint. Before they start decomposing and estimating, they spend a few minutes estimating their capacity. Microsoft Team Foundation Server (TFS) allows each team member to say how many days they’ll be unavailable (if any), and also to specify how many hours a day, on average, they’ll be devoting to the team. We assume six hours out of our eight hour day, allowing the rest of the time for non-team meetings, training, and admin tasks like timesheets and, you know, getting up and walking around. (Before you ask, lunch is not part of the eight hours.) If someone is a manager, or they have other responsibilities (for example, our architects are on both the architecture team and a scrum team, so their scrum team capacity is reduced), their hours per day will be less than six. TFS also knows how long the sprint is, so it calculates the team’s total capacity in hours for the sprint. Then the team estimates their planned tasks. If the planned tasks exceed the available hours – the sprint capacity – they go back to the product owner to renegotiate the package.
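The capacity calculation TFS performs can be sketched as follows. The member profiles and the two-week sprint length are illustrative assumptions; six focused hours per day is the default described above:

```python
# Sketch of the sprint-capacity arithmetic TFS does for us: per-member
# hours per day times available days, summed across the team. Member
# profiles below are illustrative.

SPRINT_WORKING_DAYS = 10  # assumed two-week sprint

def sprint_capacity(members):
    """members: list of (hours_per_day, days_off) tuples for the sprint."""
    return sum(h * (SPRINT_WORKING_DAYS - off) for h, off in members)

team = [
    (6, 0),  # full-time developer, our standard 6 focused hours/day
    (6, 2),  # developer taking two vacation days
    (3, 0),  # architect splitting time with the architecture team
]
print(sprint_capacity(team))  # 60 + 48 + 30 = 138 hours for the sprint
```

If the team’s task estimates exceed this total, that’s the trigger to go back to the product owner and renegotiate the package.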

Epilogue

PMP: “Thanks for the demonstration. The client is thrilled. I’m amazed that you guys pulled it off without really knowing how much work it would take. I know you made up that estimate of eight thousand hours.”

At last we come to what some experts say is the most important scrum ceremony of all: the team retrospective. This is the team’s moment for self-reflection. You might also hold retrospectives that include stakeholders and others. Those have a different purpose from the team’s retrospective on their internal workings.

The Team Retrospective

One of the hallmarks of a successful scrum implementation is continuous improvement. Teams that do not conduct regular self-inspection, and that do not find ways to do what they do better, are destined to go the other direction.

After every sprint, it is critical that each scrum team take a few minutes to review. They should use this time to recognize success as well as identify weaknesses. While the sprint reviews and the demo offer the opportunity to recognize their work product, the retrospective is when they recognize how they work.

The retrospective must be a peer review – rank or management relationships are irrelevant. Every team member must be empowered to speak their mind. If the team’s culture doesn’t allow frank discussion, then it’s not a viable scrum team. You need to fix that before the team can achieve truly great agility.

The retrospective is the litmus test for a team’s maturity.

Agenda

There are many ways of running a peer review or post implementation review. In all of them, the goal is to gather input from everyone without judgment, document it, and identify the most critical concerns – the areas that it’s most important to improve on.

The agenda we use at Streamline Health is the “round the room” technique:

Go around the room, asking each person to offer one thing that went well during the sprint. Write each contribution down as stated. Keep going around the room until nobody has anything else to add. But enforce the “one item at a time” rule. This prevents any one person from dominating.

Then go around the room asking each person for one thing that needs to be done better. Language here matters to people – don’t ask for “what went wrong” or “what failed.” This allows team members to suggest improvement to their peers without actually saying “you failed at …” As with the positives, keep going around the room until nobody has anything else to add.

Typically, the “things to improve” list is longer than the “we did great” list. That’s good. It’s a sign of a team that recognizes its weaknesses and wants to improve.

The final step is voting. Each team member gets to vote on her top three (or four, or two — whatever seems doable) items that need improvement. The three that get the most votes are the ones that the team will work on during the next sprint. Sometimes these require external help, and when that’s the case the team needs to assign a member to seek it out.
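The tallying in that final step is simple enough to sketch. The improvement items and ballots below are hypothetical examples, not from any real retrospective:

```python
# Hypothetical sketch of the final voting step: each member picks their
# top three improvement items; the three most-voted become the focus for
# the next sprint. Ballots below are invented examples.

from collections import Counter

ballots = [
    ["flaky builds", "late grooming", "DoD misses"],
    ["flaky builds", "DoD misses", "SME access"],
    ["late grooming", "flaky builds", "DoD misses"],
    ["SME access", "flaky builds", "late grooming"],
]

votes = Counter(item for ballot in ballots for item in ballot)
top_three = [item for item, _ in votes.most_common(3)]
print(top_three)  # 'flaky builds' leads with 4 votes
```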

Bad Retrospective Smells

A subset of “bad scrum smells” is that whiff of an ineffective ceremony at the end of each sprint. Here’s what we mean:

The team raises the same “what went wrong” items sprint after sprint. This team is going through the motions of the retrospective, but not actually focused on continuous improvement. They’re not trying to fix the top problems. They should include tasks in their planning that touch on the items to improve. For example, if they’re failing to comply with some part of their definition of done, put a “confirm DoD” task on every item in every sprint until it becomes habit.

Many of the “needs improvement” items are external. A team that blames others for their problems is not taking ownership of their work and their process. Items like “The requirements were bad,” “system engineering didn’t keep our build machines up,” “the SME wasn’t available enough,” are all legitimate issues that a team might face, but in their retrospective they should be stating them as challenges that they need to solve. They could be stated as “We need to focus more on pre-sprint backlog refinement,” “we need to check the build machines every evening and get a ticket in to system engineering the moment we see a problem,” “we need to make some standing appointments with the SME rather than calling her ad hoc and not reaching her.”

Some team members “pass” on every round. If one or more members aren’t offering observations about problems, something – or someone — is suppressing them. It can be difficult to identify the root cause of non-participation in the retrospective. Whatever it is, this team is destined to never achieve high performance, so intervention is critical.

Is the moderator neutral? Does the moderator (or scribe if it’s a separate role) edit the team members’ contributions, or write them down as stated? It’s okay for the moderator to ask “do you mean the same as what Joe said a minute ago?” and if the answer is “oh yeah, he did say it,” then no need to duplicate. But if the answer is “no, I meant…” and the team member provides differentiation, write it down. It is never okay for the moderator or scribe to say “I don’t agree with you” and not write down the team member’s contribution.

Is a manager in the room? This is a peer review. Reduce attendance to just team members. Some agilists consider the product owner part of the team, but the product owner passes judgment on the team’s work, so he’s not a member of the team. If the team has concerns about their relationship with the product owner, they need space in the retrospective to expose and discuss them. Leave it to them to invite the PO in, or not.

Is someone a bully? Is there a team member with strong opinions who manages to dominate even though the moderator tries to run a balanced meeting? If this happens repeatedly, other team members will give up and stop participating. If you see this, it may be necessary to address the behavioral problem through the person’s manager.

Results

The results of the retrospective, including all items raised (good and bad) and the top three items to improve, need to be documented where the team can see them. At Streamline our developers live in TFS, so that’s where we capture them. Each retrospective is documented in a work item assigned to the sprint. All team retrospectives are visible to everyone, and they’re immediately accessible without having to leave the development tool.

We don’t currently package the improvement items into the next sprint, which would let the team plan tasks against them. That might be an effective way of helping teams remember to act on them. We’ll report back here if we take that step.