Dependencies on darcs

The following is intended to be a complete list of the things that would need to change if we were to switch away from darcs, in addition to converting the repository itself, which I am assuming can be done automatically using available tools.

The following code/scripts would need to be adapted or replaced:

The darcs-all script

The push-all script

The aclocal.m4 code that attempts to determine the source tree date

.darcs-boring

The buildbot scripts

checkin email script: /home/darcs/bin/commit-messages-split.sh

Trac integration (the GHC Trac does not currently integrate with darcs, however)

Plan for libraries

The remaining question is what to do about the library repositories. It is possible to work with the GHC repository in git and all the other repositories in darcs, but this can't be a long-term strategy: our motivation for moving away from darcs is invalid if parts of the repository still require darcs. We need a strategy for a single-VCS solution.

Here's a tentative plan:

Some libraries belong to GHC (template-haskell, ghc-prim, integer-gmp, hpc), and for these we can convert the repos to git and keep them as subrepos. (Alternatively, we could just import them into the main git repository for convenience.)

Of the rest, base is somewhat special, because this alone often needs to be modified at the same time as GHC. We propose migrating base to a git repository.

For the rest of the libraries (e.g. filepath, containers, bytestring, editline), GHC is just a client, and we don't expect to be modifying these libraries often. Hence we can just copy the libraries wholesale into the GHC git repository, and update the copies occasionally when a new version of the library is released. We can provide a way to update the GHC copy from the official darcs repository easily. The local copy would be read-only, except when updating from the master copy.

The perspective on submodules

Submodules

Things that work:

when cloning a new repo, the submodules do point to the right place (the submodules of the parent)

git status shows when a submodule is "dirty" (has local changes or new commits)

git diff shows diffs in submodules too

git submodule status tells you which local submodules have changes (+ at the beginning of the line)
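As an illustration of the last point, here is a minimal sketch that picks the changed submodules out of the text printed by git submodule status. It assumes the usual line shape (one flag character, then the commit hash, then the path) and paths without spaces. The function names, the libraries/ paths and the hashes in SAMPLE are hypothetical.

```python
from typing import List, NamedTuple


class SubmoduleState(NamedTuple):
    flag: str    # ' ' unchanged, '+' differs from the parent's record, '-' not initialised
    commit: str  # commit currently checked out in the submodule
    path: str    # path of the submodule inside the parent repository


# Hypothetical output of `git submodule status` (made-up hashes and paths).
SAMPLE = (
    " 1111111111111111111111111111111111111111 libraries/base (heads/master)\n"
    "+2222222222222222222222222222222222222222 libraries/ghc-prim (heads/master)\n"
    "-3333333333333333333333333333333333333333 libraries/hpc\n"
)


def parse_submodule_status(output: str) -> List[SubmoduleState]:
    """Split each status line into its flag, commit hash and path."""
    states = []
    for line in output.splitlines():
        if not line.strip():
            continue
        # The flag is the first character; it is a space for a clean submodule.
        flag, rest = line[0], line[1:]
        fields = rest.split()
        states.append(SubmoduleState(flag, fields[0], fields[1]))
    return states


def changed_submodules(output: str) -> List[str]:
    """Return the paths of submodules marked with '+'."""
    return [s.path for s in parse_submodule_status(output) if s.flag == "+"]
```

With the sample text above, changed_submodules(SAMPLE) reports only libraries/ghc-prim, the one line that starts with +.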

Gotchas:

after git pull, you need to do git submodule update

submodules are checked out with a detached HEAD by default, so you must git checkout master before you commit (otherwise you don't find out until you push)

git submodule update detaches the local submodules from whatever branch they were on. So if you had done git checkout master and committed local changes, those changes are now invisible (but still stored in the repo). Alternatively, you can use git submodule update --merge or git submodule update --rebase. Neither seems like a good default.

if you had local uncommitted changes in a submodule, then git submodule update refuses to update the submodule. Your repo is then in a state where it appears you have a local change to the submodule, which could be confusing.

need to git submodule init before you can git submodule update in a new tree (or use git submodule update --init)

have to push to submodules before pushing GHC, otherwise other users will not be able to do git submodule update.

every submodule commit needs to be accompanied by a GHC commit (not clear if this is really a disadvantage, but it's more work and there will be many more commits).
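The ordering constraints in the list above can be written down as a small sketch. It only builds the command lines and does not run them. The helper names, the remote name origin, the branch master and the example path are assumptions, and git -C needs a reasonably recent git.

```python
from typing import List


def update_plan() -> List[List[str]]:
    """Commands to run after fetching new changes in the parent repository."""
    return [
        ["git", "pull"],
        # --init also covers a fresh tree where `git submodule init` was never run.
        ["git", "submodule", "update", "--init"],
    ]


def push_plan(submodule_paths: List[str], branch: str = "master") -> List[List[str]]:
    """Commands that push every submodule first and the parent repository last.

    Pushing the parent last means other users never see a parent commit that
    refers to a submodule commit they cannot fetch yet.
    """
    plan = [["git", "-C", path, "push", "origin", branch] for path in submodule_paths]
    plan.append(["git", "push", "origin", branch])
    return plan
```

For example, push_plan(["libraries/base"]) yields the push for libraries/base followed by the push for the parent repository. Each command list could be handed to subprocess.run once the plan looks right.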

Google repo

Google has a tool called repo that they use for managing the Android repositories. It is basically the same as our darcs-all script but much larger (it probably does a bit more, to be fair). It is written in Python, and the list of git repositories is kept in an XML file.
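To make the comparison with darcs-all concrete, here is a minimal sketch of the same idea: read a list of repositories from an XML file and build one git command per repository. The project element and its name and path attributes are only loosely modelled on repo's manifest and should be treated as assumptions. The repository names in MANIFEST and both function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List, Sequence, Tuple

# Hypothetical manifest: one <project> per repository.
MANIFEST = """\
<manifest>
  <project name="ghc" path="." />
  <project name="packages/base" path="libraries/base" />
  <project name="packages/filepath" path="libraries/filepath" />
</manifest>
"""


def read_projects(xml_text: str) -> List[Tuple[str, str]]:
    """Return (name, path) pairs; path falls back to name when it is omitted."""
    root = ET.fromstring(xml_text)
    return [(p.get("name"), p.get("path", p.get("name"))) for p in root.iter("project")]


def command_for_each(xml_text: str, args: Sequence[str]) -> List[List[str]]:
    """Build one `git -C <path> <args...>` command line per listed repository."""
    return [["git", "-C", path] + list(args) for _, path in read_projects(xml_text)]
```

Calling command_for_each(MANIFEST, ["pull"]) gives one pull command per repository, which is roughly what darcs-all does for darcs commands.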

Older comments

Submodules do not really seem to be designed for what we want to do (work on a cohesive set of components that are developed together): they seem more suited to tracking upstream branches that you do not modify locally.