Swarms: Adapting Guilds To Scale Agility

In my first article in this series, “Using Guilds to Combat Information Silos,” I reviewed the adoption of guilds at xMatters. As our company has grown, we’ve used guilds to keep critical information from languishing in silos—as often happens when an organization scales. Guilds have helped our engineers apply their creativity and critical problem-solving skills to other channels and contribute to areas outside their normal scope. It’s a win-win situation for everyone who participates—and especially for the company.

Implementing guilds began as a grand experiment. We quickly learned a lot about what enhances or hinders a guild’s success, making adjustments and adding a necessary dash of protocol along the way. But, as is the case with any growing organization, new needs and pain points challenged us to adapt and tackle them. One attempt to solve a critical problem led to an entirely new team model: a body we call the Swarm.

The Swarm arose as a result of key changes in our release protocol. We used to be very focused on getting new features out the door as fast as we could. As it turned out, many of our customers preferred a more predictable schedule. So we gave them flexible options: we kept delivering code continuously to production, using flags to stage larger features or more visible changes. Customers who wanted more predictable updates could opt for a quarterly release model, while others could continue to access features as they became available and were being iterated on.
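
A minimal sketch, in Python, of how a flag check along these lines could look. Every name here (`ReleaseModel`, `Feature`, `is_visible`, `in_quarterly_release`) is hypothetical and chosen for illustration; none of it is taken from xMatters' code. It assumes the code is already deployed to production and that the flag only decides who sees it.

```python
from dataclasses import dataclass
from enum import Enum


class ReleaseModel(Enum):
    CONTINUOUS = "continuous"  # customer sees features as soon as they ship
    QUARTERLY = "quarterly"    # customer sees features only at a quarterly release


@dataclass(frozen=True)
class Feature:
    name: str
    in_quarterly_release: bool  # True once the feature is part of a quarterly release


def is_visible(feature: Feature, model: ReleaseModel) -> bool:
    """Decide whether deployed code behind a flag is shown to a customer."""
    if model is ReleaseModel.CONTINUOUS:
        return True
    # Quarterly customers wait until the feature is bundled into a release.
    return feature.in_quarterly_release


example = Feature("example-feature", in_quarterly_release=False)
print(is_visible(example, ReleaseModel.CONTINUOUS))  # True
print(is_visible(example, ReleaseModel.QUARTERLY))   # False
```

The point of the sketch is that a feature finished just after a quarterly cutoff stays hidden from quarterly customers until the next release, which is where the deadline pressure described next comes from.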

While the quarterly release model improved our rhythm for customers, it also created stricter deadlines. A project that just missed a quarterly release could get delayed by almost another full quarter before our customers could start using it. Throwing more engineers at a problem doesn’t necessarily solve it faster—but, being the engineers we are, we thought about ways to parallelize the problem.

We knew that guilds were very good at collaborating around a shared problem, so we adopted temporary guilds that “swarmed” around a shared problem, completed work at an accelerated pace and then disbanded. Each swarm centered on a specific initiative with a definite start and end, and brought together multiple teams, from Dev to Ops, to complete a defined project. For instance, we might form a swarm to create a new product feature, and its lifetime would span a finite number of sprints across multiple teams until completion. Because they focus on specific deliverables such as new product features, swarms differ fundamentally from guilds in that they are typically not self-forming; each one is intentionally designed for a specific purpose.

We began our swarm experiment with a large architectural change: rolling out a new format for our customer hostnames, which involved six Engineering and Ops teams. We realized immediately that this would be an incredibly challenging undertaking on our tight deadline. A slew of teams was working on different stacks in different programming languages, so there were many competing visions and different contextual understandings of the end goal. It became very apparent that the swarm could not be left to operate in a completely flat hierarchy, as a guild usually does. We needed a leader to help ensure that work wasn’t duplicated, that nothing fell through the cracks and that dependencies between teams were accounted for when scrum stories were written.

Having someone take care of the logistical problems was one thing, but we also realized that in-person communication among all the teams in the swarm was essential. We tried to stay in the loop by having each team post updates in our swarm’s Slack channel, but this was not enough to keep the project on track. Mattius, one of our very talented operations leads, initiated regular in-person meetings across all the teams, which led to far greater clarity. Having all of the teams together, face to face, on a regular basis proved crucial.

In the end, swarming on this massive endeavor allowed us to deliver the project a week before the deadline. It was a challenging and imperfect first run, but it paid off.

I leave you with these three recommendations:

Create each swarm with a clear purpose and deadline;

Have a leader to enforce consistent communication; and

Have regular face-to-face interactions.

The lessons we learned were invaluable for refining our approach to tackling projects for faster release, and we’re continuing to improve this model. Good luck with all your future swarms.

About the Author / Nick Fletcher

Nick Fletcher, Engineering Manager at xMatters, started designing and developing websites back in 1998, when the World Wide Web was the Next Big Thing. In the time since that first website, he’s mastered all the aspects involved in creating great products: design, development and strong leadership. Connect with him on LinkedIn.
