GitLab might move to a single Rails codebase

A single repository with no license changes

Before we go into the details of the proposed changes, we want to stress that:

GitLab Community Edition code would remain open source and MIT licensed.

GitLab Enterprise Edition code would remain source available and proprietary.

What are the challenges with having two repositories?

Currently the Ruby on Rails code of GitLab (the majority of the codebase) is maintained in two repositories: the gitlab-ce repository, for the code with an open source license, and the gitlab-ee repository, which contains code with a proprietary license that is source available.

Feature development is difficult and error prone when every change has to be made in two similar yet separate repositories that depend on one another.

A simple change can break master

Conflicts during preparation for regular releases

This concerns preparation for a regular release, e.g. the 11.7.5 release. Merge requests preparing the release need to be created for both the CE repository and the EE repository, and once the pipelines pass, the EE repository requires a merge from the CE repository. This causes additional conflicts, pipeline failures, and similar delays that require more manual intervention, during which the CE distribution release is also delayed.

Between these three examples, days of engineering time have been spent on busy work, delaying the delivery of work that brings actual value. Only three examples are highlighted, but this type of work occurs daily. Whether we are writing a new feature available in Core or in any of the enterprise plans, all work is equally affected.

What have we done to improve the situation?

We've invested significant development time to try and keep the two repositories separate:

Pre-2016: Manual merges for each release

Prior to 2016, merging the CE repository into the EE repository was done when we were ready to cut a release; the number of commits was small so this could be done by one person.

2016-2017: Daily merges by a team of developers

In 2016, the number of commits between the two repositories grew, so the task was divided between seven (brave) developers responsible for merging the code once a day. This worked for a while, until delays started happening due to failed specs or difficult merge conflicts.

Present: Further automation with Merge Train

By the end of 2018, the number of changes going into both the CE and EE repositories grew to thousands of commits in some cases, which made the automated MR insufficient. The Merge Train tool was created to automate these workflows further, by automatically rejecting merge conflicts and preferring changes from one repository over the other. The edge cases we've encountered require us to invest additional time in improving the custom tool.

This last attempt turned out to be a bit of a crossroads. Do we invest more development time in improving the custom tooling, knowing that we will never get it 100 percent right, or do we need to take some more drastic measures that are going to save countless hours of development time?

What are we proposing?

One of GitLab's core values is efficiency. As previously mentioned, merging the gitlab-ce Rails repository into the gitlab-ee Rails repository is proving to be inefficient.

What are the possible downsides?

We want to be clear about the possible downsides of this approach:

Users with installations from source who currently clone the gitlab-ce repository would download from a new repository named gitlab. The clone would also fetch the proprietary code in the /ee directory, but removing this directory has no effect on the running application.

➡️ This is resolved by removing the /ee directory after cloning.
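As an illustration of that cleanup step, here is a minimal sketch in Python. It assumes the clone already exists locally and that the proprietary code lives in a top-level ee directory, as described above; the function name strip_ee_directory and the example path are hypothetical and not part of any GitLab tooling.

```python
import shutil
from pathlib import Path


def strip_ee_directory(clone_path: str) -> bool:
    """Remove the top-level ee/ directory from a local clone, if present.

    Hypothetical helper for illustration only. Returns True if a directory
    was removed, and False if there was nothing to remove.
    """
    ee_dir = Path(clone_path) / "ee"
    if not ee_dir.is_dir():
        return False
    # Delete the directory and everything under it.
    shutil.rmtree(ee_dir)
    return True
```

Calling strip_ee_directory("/path/to/gitlab") right after cloning would leave the rest of the working tree untouched. The step would need to be repeated after each update that brings the directory back.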

gitlab-ce distribution users would get more database tables because of the new tables in db/schema.rb. The database schema is open source, and in the gitlab-ce distribution these new tables would not be populated, would not affect performance, and would not take significant space.

➡️ All database migration code is open source and does not add additional maintenance burden, so no additional work is required.

What's next?

We currently think that the efficiency gains and clearer naming outweigh these disadvantages. Our stewardship of GitLab is an important aspect of GitLab's success as a whole, so we would love to know:

Is there a better way to solve the problem of the busy work?

What improvements can we make to our proposal?

Are there any additional considerations that we should take into account?

We invite you to share your suggestions in issue 2952, which was an inspiration for the proposal as it currently stands. We look forward to hearing your thoughts!