How we solved GitLab's CHANGELOG conflict crisis

Since its very first commit more than six years ago, GitLab has had a changelog detailing the noteworthy changes in each release. Shortly after Enterprise Edition (EE) was introduced, it got a changelog of its own. Whenever anyone – whether it was a community contributor or a GitLab employee – contributed a new feature or fix to the project, a changelog entry would be added to let users know what improved.

As GitLab gained in popularity and started receiving more contributions, we'd constantly see merge conflicts in the changelog when multiple merge requests attempted to add an entry to the list. This quickly became a major source of delays in development, as contributors would have to rebase their branch in order to resolve the conflicts.

This post outlines how we completely eliminated changelog-related merge conflicts, removed bottlenecks for contributions, and automated a crucial part of our release process.

At the beginning, GitLab's CHANGELOG file would look something like this:

Because every merge request added its entry to the same list, this resulted in a ton of wasted time: something would get merged, and then every other open branch adding a changelog entry would need to be rebased. The situation only got worse as the number of contributors to GitLab grew over time.

Our initial, boring solution to the problem was to begin adding empty placeholder entries at the beginning of each monthly release cycle. The changelog for the upcoming unreleased version might look like this:

v8.1.0 (unreleased)
-
-
-
-
-
-
-
- (and so on)

A developer would make their change and then choose a random spot in the list to add a changelog entry. This worked for a while, until the placeholders began to be filled out as we got closer to the release date. Eventually two (or more) merge requests would attempt to add different entries at the same placeholder, and one being merged created a conflict in the others.

The problem was lessened, but not solved.

Not only was this a huge waste of time for developers, it created an additional headache for release managers when they cherry-picked a commit into a stable branch for a patch release. If the commit included a changelog entry, which any change intended for a patch release should have, cherry-picking that commit would bring in the contents of the changelog at the point of that commit, often including dozens of unrelated changes. The release manager would have to manually remove the unrelated entries, often doing this multiple times per release. This was compounded when we had to release multiple patch versions at once due to a security issue.

Each changelog entry would be its own YAML file in a CHANGELOG/unreleased folder. When a release manager went to cherry-pick a merge into a stable branch in preparation for a release, they'd use a custom script that would perform the cherry-pick and then move any changelog entry added by that action to a version-specific subfolder, such as CHANGELOG/8.9.4. At the time of release, any entries in the version's subfolder would be compiled into a single Markdown changelog file, and then deleted.

With an idea of where we wanted to end up but no idea how to get there, I started with a spike.

A turning point

After a few days of working on the spike, I had a realization that we didn't need the cherry-picking concept at all:

Cherry-picking a merge commit into a stable branch will add that merge's CHANGELOG/unreleased/whatever-its-called.yml file to the stable branch. Upon tagging a release with release-tools, we can consider everything in that stable branch's "unreleased" folder as part of the tagged release. We collect those files, compile them to Markdown, remove them from the stable branch and master, and that's our changelog for the release.

This was a major "aha" moment, as it greatly simplified the workflow for release managers. They could continue their existing workflow, and the release flow would transparently handle the rest. It also meant we could handle everything in our release-tools project, which is responsible for tagging a release and kicking off our packaging.
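To make the collect, compile, and remove flow concrete, here is a minimal sketch in plain Ruby. It is not the release-tools implementation: the helper name compile_unreleased, the entry fields (title, merge_request, author), and the "## version (date)" header format are all assumptions made for illustration, and it works on a local directory rather than on Git branches.

```ruby
require "yaml"
require "date"

# Hypothetical helper: compile every entry file in `unreleased_dir` into a
# Markdown block for `version`, then delete the files that were compiled.
# The field names and the header format are assumptions, not the real schema.
def compile_unreleased(unreleased_dir, version, date: Date.today)
  paths = Dir.glob(File.join(unreleased_dir, "*.yml")).sort
  return nil if paths.empty? # nothing to release, so touch nothing

  lines = paths.map do |path|
    entry = YAML.safe_load(File.read(path)) || {}
    line = "- #{entry.fetch('title')}" # a title is required
    line += " !#{entry['merge_request']}" if entry["merge_request"]
    line += " (#{entry['author']})" if entry["author"]
    line
  end

  File.delete(*paths)
  "## #{version} (#{date.strftime('%Y-%m-%d')})\n\n#{lines.join("\n")}\n"
end
```

Because each entry lives in its own file, two merge requests never touch the same lines, which is what removes the conflicts in the first place.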

Even though we ended up not using a lot of the work that went into it, my original spike was still valuable. It allowed us to see pain points early on, refine the process, and find a better solution. It also gave me additional experience interacting with Git repositories programmatically via Rugged, and that would go on to be especially useful as we implemented the final tooling.

Building the building blocks

We knew there were several components that we'd need to build:

- Something to read and represent the individual YAML data files
- Something to compile individual entries into a Markdown list
- Something to insert the compiled Markdown into the correct spot in an existing list of releases
- Something to remove the files that had been compiled, and then commit the updated CHANGELOG.md file to the repository

All of these components were created in a single merge request and refined through several code review cycles. The commits listed there are all fairly atomic and may be interesting to read through on their own. The code review that happened in the merge request was incredibly valuable, and allowed us to really simplify some code that was hard to wrap one's head around, even for me as the original author!
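As an illustration of the third component in the list above, inserting a compiled block into the correct spot, the sketch below keeps release headers in descending version order. The helper name insert_release and the "## 8.9.4 (date)" header format are assumptions for this example; the code in the merge request may well differ.

```ruby
require "rubygems" # provides Gem::Version; loaded by default in most setups

# Assumed header format: a line starting with "## " followed by the version.
HEADER = /^## (\d+(?:\.\d+)*)/

# Hypothetical helper: insert `block` (a compiled Markdown section for
# `version`) into `changelog` so that versions stay in descending order.
def insert_release(changelog, version, block)
  new_version = Gem::Version.new(version)
  block = "#{block}\n" unless block.end_with?("\n")
  lines = changelog.lines

  # Find the first existing header whose version is older than the new one.
  index = lines.index do |line|
    match = HEADER.match(line)
    match && Gem::Version.new(match[1]) < new_version
  end

  if index
    lines.insert(index, "#{block}\n") # blank line before the older release
  else
    lines.push("\n") unless lines.empty?
    lines.push(block) # older than every listed release: append at the end
  end
  lines.join
end
```

Comparing with Gem::Version rather than plain strings matters here: as strings, "8.10.0" sorts before "8.9.4", which would put a patch release in the wrong place.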

Automated testing

Of course, we wouldn't consider this solution complete until we had automated tests guaranteeing the behavior and consistency of the automated compilation, including reading from and writing to multiple branches across multiple repositories.

On a stable branch with no changelog entry files, the list of files to remove was an empty array. That array was passed to Rugged::Index#remove_all which, when given an empty array, removes everything. This was not ideal.
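One way to guard against that failure mode is to skip the call entirely when there is nothing to remove. The sketch below is an assumption about how such a guard could look, not the fix that was actually shipped. FakeIndex is a stand-in that mimics only the behavior described above, so the example runs without Rugged, and it matches exact paths where a real index takes a pathspec.

```ruby
# Stand-in for an index object such as Rugged::Index (illustration only).
class FakeIndex
  attr_reader :files

  def initialize(files)
    @files = files.dup
  end

  # Mimics the behavior described above: an empty list matches everything.
  def remove_all(paths = [])
    if paths.empty?
      @files.clear
    else
      @files -= paths
    end
  end
end

# Hypothetical guard: never hand an empty list to remove_all.
# Returns true when removal was requested, false when it was skipped.
def remove_compiled_entries(index, paths)
  return false if paths.nil? || paths.empty?

  index.remove_all(paths)
  true
end
```

A test that runs the compilation against a branch with no entry files, and then asserts that the rest of the tree is still there, is the kind of check that catches this.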

Developer tooling

The final pieces of the puzzle were creating a tool to help developers create valid changelog entries easily, and adding documentation. Both were handled in this merge request.

This tool allows developers to run bin/changelog, passing it the title of their change, to generate a valid changelog entry file. Additional options are in the documentation.
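The core of such a generator is small. The following is a rough sketch and not the real bin/changelog: the helper name write_changelog_entry, the title-based file name, and the single title field are assumptions, and the real tool's options are not modeled here.

```ruby
require "yaml"
require "fileutils"

# Hypothetical helper, not the real bin/changelog: writes an entry file for
# `title` into `dir` and returns the path of the new file.
def write_changelog_entry(title, dir: "CHANGELOG/unreleased")
  title = title.to_s.strip
  raise ArgumentError, "title must not be empty" if title.empty?

  # Assumed naming scheme: lowercase title, other characters become hyphens.
  slug = title.downcase.gsub(/[^a-z0-9]+/, "-").gsub(/\A-+|-+\z/, "")
  raise ArgumentError, "title needs at least one letter or digit" if slug.empty?

  path = File.join(dir, "#{slug}.yml")
  raise ArgumentError, "#{path} already exists" if File.exist?(path)

  FileUtils.mkdir_p(dir)
  File.write(path, YAML.dump({ "title" => title }))
  path
end
```

Generating the file with a YAML library, rather than asking contributors to write it by hand, keeps every entry valid for the compilation step at release time.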

Future plans

This changelog process has worked beautifully for us since it was introduced, and we know it might be just as useful to other projects. We're investigating a way to make it more generic so that it can remove a tedious chore for more developers.

I worked on this project as part of our Edge team, now known as the Quality team. If you're interested in this kind of internal tooling or other automation, we're hiring! Check out our open positions.