[[PageOutline]]
= Gitification of Indymedia linksunten =
== Introduction ==
Until May 2012, [https://linksunten.indymedia.org/ Indymedia linksunten] used the Good Old Fashioned Way™ to keep track of upstream changes to its [http://drupal.org/ Drupal] (in fact: [http://pressflow.org/ Pressflow]) core and modules: [http://drupal.org/project/drush drush] dl mymodule. At least in theory. In reality, the core has been patched twice, many modules even more often, and some self-written modules do not even exist in a public Drupal repository. linksunten has some 80 modules installed; keeping track of updates is wearisome for the unpatched modules and troublesome for the patched ones. We learned that a version control system could ease the error-prone update procedure, and as drupal.org has [http://drupal.org/community-initiatives/git migrated] to [wiki:git git] we decided to do the same. Beforehand, we evaluated [http://mercurial.selenic.com/ mercurial] and [http://bazaar.canonical.com/ bazaar]; from a technical point of view we could have chosen any of the three.
Since a Drupal website is not a monolithic bloc and nearly every module is maintained by different developers, we needed to find a way to update the core and each module separately from one another. The traditional way git offers for this is a concept called [http://git-scm.com/book/en/Git-Tools-Submodules git-submodule]. It is complicated, unintuitive and detested by many for good reasons. But as git follows the [http://docstore.mik.ua/orelly/weblinux2/modperl/ch13_01.htm TMTOWTDI] paradigm, we could avoid git-submodule and settled for [https://github.com/apenwarr/git-subtree git-subtree] instead, which has recently been [https://github.com/apenwarr/git-subtree/commit/8cd698957f57f62ec3d313403bebced2fdf751e5 merged] into git core. Besides being able to update the core and each module separately and to replay our patches onto the updated version automatically, we want the linksunten code to be one git repository which simply "works" after cloning it. After our move to git we will use the [http://drupal.org/project/features features] and [http://drupal.org/project/ctools ctools] modules to version control as much of our configuration data as possible.
== Drupal installation ==
=== Drupal core ===
We create a new directory, run [http://git-scm.com/docs/git-init git init], use [http://git-scm.com/docs/git-config git-config] to specify which name and email to use, create a temporary file which we commit, delete and commit again to create a master branch:
{{{
mkdir liu_d6
cd liu_d6
git init
git config user.name "Indymedia linksunten"
git config user.email "línksunten@índymedía.org"
touch liu_d6
git add liu_d6
git commit -m "Initial commit Indymedia linksunten Drupal 6."
git rm liu_d6
git commit -m "Created master branch."
}}}
We don't want the settings.php file in our repository as it contains database login credentials so we tell git to exclude it:
{{{
echo settings.php >> .git/info/exclude
}}}
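To check that the exclude rule behaves as intended, one could try it in a throwaway repository first. The following is only a sketch: the scratch directory and the variable names are assumptions made for illustration and are not part of the linksunten setup.

```shell
# Sketch: verify in a throwaway repository that an entry in
# .git/info/exclude hides settings.php from git.
demo=$(mktemp -d)                     # hypothetical scratch directory
git init -q "$demo"
mkdir -p "$demo/.git/info"            # normally created by git init already
echo settings.php >> "$demo/.git/info/exclude"
touch "$demo/settings.php"

# check-ignore exits with status 0 if the path matches an ignore rule.
if git -C "$demo" check-ignore -q settings.php; then
  ignored=yes
else
  ignored=no
fi

# An excluded, untracked file does not appear in the status output.
status=$(git -C "$demo" status --porcelain)
echo "ignored=$ignored status='$status'"
```

Unlike a ''.gitignore'' file, ''.git/info/exclude'' is not versioned, so the entry has to be repeated in every clone.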
We add Pressflow 6 as a new remote and fetch it:
{{{
git remote add pressflow-6.x git://github.com/pressflow/6.git
git fetch pressflow-6.x
}}}
Now we can add Drupal core from our pressflow-6.x remote via git-subtree in a subdirectory core. We use the --squash parameter as we do not need the whole commit history of Pressflow in our master branch:
{{{
git subtree add --squash --prefix=core pressflow-6.x/drupal-6.26
}}}
As we are creating a production environment, we'll delete some of the unnecessary files:
{{{
git rm core/install.php core/*.txt
git commit -m "Delete install.php and text files from core directory."
}}}
We copy our settings.php to core/sites/default and delete the default.settings.php:
{{{
cp ~/settings.php core/sites/default
git rm core/sites/default/default.settings.php
git commit -m "Deleted default.settings.php."
}}}
There are some more modifications to do (like [http://git-scm.com/book/en/Distributed-Git-Maintaining-a-Project applying] two core patches using [http://git-scm.com/docs/git-am git-am] which have been [http://git-scm.com/book/en/Distributed-Git-Contributing-to-a-Project created] with [http://git-scm.com/docs/git-format-patch git-format-patch] before) but they are really specific to Indymedia linksunten so we leave them out.
=== Drupal modules ===
Normally, Drupal modules are installed under sites/default/modules. This would work with our approach, but it would create unnecessarily large merges when updating the core, and keeping all parts separately accessible from the root directory of our installation is much clearer. So we create a modules directory (and perhaps also files, libraries and themes directories), add a symbolic link to it and commit the link:
{{{
mkdir modules
cd core/sites/default
ln -s ../../../modules
cd ../../../
git add core/sites/default/modules
git commit -m "Add symbolic link to modules directory."
}}}
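If the other directories mentioned above are wanted as well, the same steps can be repeated in a loop. This is a minimal sketch run against a scratch directory; the variable {{{root}}} is a hypothetical stand-in for the liu_d6 checkout, and the git add and commit steps are left out.

```shell
# Sketch: create top-level directories and link each one into
# core/sites/default with a relative symbolic link.
root=$(mktemp -d)                     # hypothetical stand-in for liu_d6
mkdir -p "$root/core/sites/default"

for dir in modules themes libraries files; do
  mkdir -p "$root/$dir"
  # Run ln inside a subshell so the working directory is left unchanged.
  ( cd "$root/core/sites/default" && ln -s "../../../$dir" )
done

# Each link should resolve to the matching top-level directory.
for dir in modules themes libraries files; do
  if [ -L "$root/core/sites/default/$dir" ] && [ -d "$root/core/sites/default/$dir" ]; then
    echo "$dir: ok"
  else
    echo "$dir: broken"
  fi
done
```

git records the symbolic link itself, not its target. As git does not track empty directories, the top-level directory only shows up in the repository once it contains a committed file.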
Now we install a module in it. As an example we choose the [http://drupal.org/project/i18n i18n] module. On the Drupal project page we click on the green "Version Control" tab and choose "Version to work from: 6.x-1.x". There we get the URL which we need to add the repository as a remote repository. The parameter -f triggers an instant fetch:
{{{
git remote add -f i18n-6.x-1.x git://git.drupal.org/project/i18n.git
}}}
Now we do not install the latest version 6.x-1.10 but version 6.x-1.9. The reason is that we have patched that version and we want to use git-subtree and [http://git-scm.com/book/en/Git-Branching-Rebasing git-rebase] to reapply our patches to the newest version. First, we install 6.x-1.9:
{{{
git subtree add --squash --prefix="modules/i18n" 6.x-1.9
}}}
Then we overwrite the newly imported files with our patched version and commit the patches. At this point, [http://git-scm.com/book/ch6-2.html interactive staging] might be a good idea.
{{{
cp ~/i18n.pages.inc modules/i18n
cp ~/i18nsync.module modules/i18n/i18nsync
git add modules/i18n/i18n.pages.inc
git commit -m "i18n: Exchange title with nid in translation box."
git add modules/i18n/i18nsync/i18nsync.module
git commit -m "i18n: Inherit path when syncing."
}}}
== Drupal update ==
We are going to update a patched Drupal module using [https://github.com/apenwarr/git-subtree/blob/master/git-subtree.txt git-subtree] and [http://git-scm.com/docs/git-rebase git-rebase]. First we create a new [http://git-scm.com/book/en/Git-Branching-Basic-Branching-and-Merging branch] containing the version of the module we want to update to. As we have already added i18n as a [http://git-scm.com/docs/git-remote git-remote], we can use [http://git-scm.com/docs/git-branch git-branch] to create that branch from the corresponding [http://git-scm.com/book/en/Git-Basics-Tagging tag] ([http://git-scm.com/docs/git-checkout git-checkout] -b would additionally switch to it). As long as there are no [http://git.661346.n2.nabble.com/1-8-0-Remote-tag-namespace-td5980437.html separate namespaces for remote tags] in git, the tags of all remotes end up in one shared namespace, so we start with [http://git-scm.com/docs/git-fetch git-fetch] on the module's remote to be sure we refer to the tags of the right module:
{{{
git fetch i18n-6.x-1.x
git branch i18n-6.x-1.10 6.x-1.10
}}}
Then we extract the patched version of our module into a branch along with its history:
{{{
git subtree split --rejoin --prefix=modules/i18n --branch=i18n-linksunten
}}}
Now we can [http://git-scm.com/docs/git-rebase git-rebase] our branch on top of the new version of the module:
{{{
git rebase i18n-6.x-1.10 i18n-linksunten
}}}
Finally, we have to subtree merge the patched new version into our master branch:
{{{
git checkout master
git subtree merge --squash --prefix=modules/i18n i18n-linksunten
}}}
After that, you can delete the two branches:
{{{
git branch -D i18n-6.x-1.10 i18n-linksunten
}}}
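The update steps above can be collected in a small shell function. This is a sketch only: the names {{{run}}}, {{{update_module}}} and {{{DRY_RUN}}} are hypothetical helpers introduced here, and the branch names follow the i18n example. With {{{DRY_RUN}}} set, the commands are printed instead of executed, which makes it easy to review them before touching the repository.

```shell
# Sketch: the update steps wrapped in a function.
# Set DRY_RUN=1 to print the git commands instead of running them.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# update_module MODULE REMOTE TAG
#   MODULE  directory name below modules/, e.g. i18n
#   REMOTE  name of the remote added earlier, e.g. i18n-6.x-1.x
#   TAG     release tag to update to, e.g. 6.x-1.10
update_module() {
  module=$1 remote=$2 tag=$3
  upstream="$module-$tag"          # temporary branch for the new release
  patched="$module-linksunten"     # temporary branch with our patches

  run git fetch "$remote" &&
  run git branch "$upstream" "$tag" &&
  run git subtree split --rejoin --prefix="modules/$module" --branch="$patched" &&
  run git rebase "$upstream" "$patched" &&
  run git checkout master &&
  run git subtree merge --squash --prefix="modules/$module" "$patched" &&
  run git branch -D "$upstream" "$patched"
}

# Print the commands for the i18n example without executing them.
DRY_RUN=1
update_module i18n i18n-6.x-1.x 6.x-1.10
```

Because the commands are chained with {{{&&}}}, the function stops at the first failing step, for example when the rebase hits a conflict that has to be resolved by hand.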
The same process can be applied to update the core. Hopefully, you did not need to patch the core (as we did) so [http://git-scm.com/docs/git-fetch git-fetch] will result in [http://git-scm.com/book/en/Git-Branching-Basic-Branching-and-Merging fast-forward merges].
== hook_system_info_alter ==
In the past, drupal.org used [http://www.nongnu.org/cvs/ CVS] as its version control system and switched to git only recently. Unfortunately, not all module maintainers have adapted their code base to the new revision control system manually. Instead, the Drupal team migrated lots of projects automatically. So at least at the moment, you'll discover that the release versions of many modules are not tagged at all.
Another problem is the available updates page at ''/admin/reports/updates'' and the version information obtained by [http://api.drush.org/api/drush/commands!pm!pm.drush.inc/function/drush_pm_list/5.x drush pm-list]. When checking out a module via git no version information is added by the drupal.org package manager. Enter [http://drupal.org/project/git_deploy git_deploy]. More precisely ''commit 68bd1a8219cbe59e7fbe56b317a600321116ddfa'' from ''Thu Apr 26 10:15:16 2012 -0700''.
=== git_deploy 6.x-2.x ===
{{{
    filename);
    // Check whether this belongs to core. Speed optimization.
    if (substr($directory, 0, strlen($type)) != $type) {
      while ($directory && !is_dir("$directory/.git")) {
        $directory = substr($directory, 0, strrpos($directory, '/'));
      }
      $git_dir = "$directory/.git";
      // Theoretically /.git could exist.
      if ($directory && is_dir($git_dir)) {
        $git = "git --git-dir $git_dir";
        // Find first the project name based on fetch URL.
        // Eat error messages. >& is valid on Windows, too. Also, $output does
        // not need initialization because it's taken by reference.
        exec("$git remote show -n origin 2>&1", $output);
        if ($fetch_url = preg_grep('/^\s*Fetch URL:/', $output)) {
          $fetch_url = current($fetch_url);
          $project_name = substr($fetch_url, strrpos($fetch_url, '/') + 1);
          if (substr($project_name, -4) == '.git') {
            $project_name = substr($project_name, 0, -4);
          }
          $info['project'] = $project_name;
        }
        // Try to fill in branch and tag.
        exec("$git rev-parse --abbrev-ref HEAD 2>&1", $branch);
        $tag_found = FALSE;
        if ($branch) {
          $branch = $branch[0];
          // Any Drupal-formatted branch.
          $branch_preg = '\d+\.x-\d+\.';
          if (preg_match('/^' . $branch_preg . 'x$/', $branch)) {
            $info['version'] = $branch . '-dev';
            // Nail down the core and the major version now that we know
            // what they are.
            $branch_preg = preg_quote(substr($branch, 0, -1));
          }
          // Now try to find a tag.
          exec("$git rev-list --topo-order --max-count=1 HEAD 2>&1", $last_tag_hash);
          if ($last_tag_hash) {
            exec("$git describe --tags $last_tag_hash[0] 2>&1", $last_tag);
            if ($last_tag) {
              $last_tag = $last_tag[0];
              // Make sure the tag starts as Drupal formatted (for eg.
              // 7.x-1.0-alpha1) and if we are on a proper branch (ie. not
              // master) then it's on that branch.
              if (preg_match('/^(' . $branch_preg . '\d+(?:-[^-]+)?)(-(\d+-)g[0-9a-f]{7})?$/', $last_tag, $matches)) {
                $tag_found = TRUE;
                $info['version'] = isset($matches[2]) ? $matches[1] . '.' . $matches[3] . 'dev' : $last_tag;
              }
            }
          }
        }
        if (!$tag_found) {
          $last_tag = '';
        }
        // The git log -1 command always succeeds and if we are not on a
        // tag this will happen to return the time of the last commit which
        // is exactly what we wanted.
        exec("$git log -1 --pretty=format:%at $last_tag 2>&1", $datestamp);
        if ($datestamp && is_numeric($datestamp[0])) {
          $info['datestamp'] = $datestamp[0];
        }
        // However, the '_info_file_ctime' should always get the latest value.
        if (empty($info['_info_file_ctime'])) {
          $info['_info_file_ctime'] = $datestamp[0];
        }
        else {
          $info['_info_file_ctime'] = max($info['_info_file_ctime'], $datestamp[0]);
        }
      }
    }
  }
}
}}}
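The way the listing turns the output of {{{git describe --tags}}} into a Drupal version string can be illustrated outside of PHP. The following shell function is a simplified sketch, not the module's code: the name {{{version_from_describe}}} is hypothetical, it accepts abbreviated hashes of seven or more characters (the listing expects exactly seven), and it does not narrow the pattern to the current branch.

```shell
# version_from_describe DESCRIBE_OUTPUT
# Map the output of "git describe --tags" to a Drupal-style version:
#   exact tag             6.x-1.10            -> 6.x-1.10
#   commits after a tag   6.x-1.9-3-gabcdef0  -> 6.x-1.9.3-dev
# Returns 1 if the input does not start with a Drupal-formatted tag.
version_from_describe() {
  tag_re='[0-9]+\.x-[0-9]+\.[0-9]+(-[^-]+)?'
  if printf '%s\n' "$1" | grep -Eq "^${tag_re}(-[0-9]+-g[0-9a-f]{7,})?\$"; then
    # \1 is the tag, \3 the number of commits since that tag.
    printf '%s\n' "$1" |
      sed -E "s/^(${tag_re})-([0-9]+)-g[0-9a-f]{7,}\$/\1.\3-dev/"
  else
    return 1
  fi
}

version_from_describe 6.x-1.9-3-gabcdef0   # prints "6.x-1.9.3-dev"
version_from_describe 7.x-1.0-alpha1       # prints "7.x-1.0-alpha1"
```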
=== Analysis of git_deploy ===
Version 1.x of git_deploy was based on [https://github.com/patrikf/glip glip], a Git Library In PHP. Version 2.x of git_deploy calls the git executable directly and parses the output instead. There might be issues in a shared hosting environment but many people report that the 2.x version works far better than the 1.x version, so we'll adapt the idea of git_deploy 2.x to git-subtree.
Let's analyse what git_deploy does. The module implements only one hook: [http://api.drupal.org/api/drupal/developer!hooks!core.php/function/hook_system_info_alter/6 hook_system_info_alter]. With this hook, the module info obtained through git can be injected.
git_deploy searches for the ''.git'' directory of the module, so it will only work for git-submodules:
{{{
while ($directory && !is_dir("$directory/.git")) {
$directory = substr($directory, 0, strrpos($directory, '/'));
}
}}}
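The effect of this upward search on a git-subtree layout can be reproduced with a few lines of shell. The function name {{{find_git_dir}}} is hypothetical and the sketch only mirrors the loop above: like the PHP code, it never inspects the file system root or the current directory itself.

```shell
# find_git_dir DIR
# Walk upwards from DIR and print the first directory that contains a
# .git directory. Returns 1 if none is found.
find_git_dir() {
  dir=$1
  while [ -n "$dir" ] && [ "$dir" != "/" ] && [ "$dir" != "." ]; do
    if [ -d "$dir/.git" ]; then
      printf '%s\n' "$dir"
      return 0
    fi
    dir=$(dirname "$dir")
  done
  return 1
}
```

In a repository built with git-subtree there is only one ''.git'' directory, so a call such as {{{find_git_dir liu_d6/modules/i18n/i18nsync}}} ends at the top-level directory for every module, and all modules would appear to belong to the same repository.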
git_deploy then uses the fetch url obtained by {{{git remote show -n origin}}} to determine the project name:
{{{
exec("$git remote show -n origin 2>&1", $output);
if ($fetch_url = preg_grep('/^\s*Fetch URL:/', $output)) {
$fetch_url = current($fetch_url);
$project_name = substr($fetch_url, strrpos($fetch_url, '/') + 1);
if (substr($project_name, -4) == '.git') {
$project_name = substr($project_name, 0, -4);
}
$info['project'] = $project_name;
}
}}}
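The string handling in this snippet boils down to taking the last path component of the URL and dropping a trailing ''.git''. As a sketch, with the hypothetical function name {{{project_from_url}}}, the same can be written with shell parameter expansion:

```shell
# project_from_url URL
# Print the last path component of a fetch URL without a trailing ".git".
project_from_url() {
  name=${1##*/}        # drop everything up to the last slash
  name=${name%.git}    # drop the ".git" suffix if present
  printf '%s\n' "$name"
}

project_from_url git://git.drupal.org/project/i18n.git   # prints "i18n"
```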
This approach won't work for git-subtree as there is no mapping of modules to remotes and remotes are not exported. So a [http://git-scm.com/docs/git-clone git-clone] won't know the original fetch URLs used for [https://github.com/apenwarr/git-subtree/blob/master/git-subtree.txt git-subtree-add]. Fortunately, drupal.org uses well-defined fetch URLs, so we can reconstruct the information by scanning for .module files. But the process will be much more complicated and time-consuming with git-subtree than it is with git-submodule, as we would either have to keep the module's history in a separate branch, incorporate the whole history by not using the --squash parameter, use [http://git-scm.com/docs/git-ls-remote git-ls-remote] to search git.drupal.org or (temporarily) [http://git-scm.com/docs/git-clone git-clone] the module.
Keeping the whole history in a separate branch would work but this is a fragile approach because we'd lose the liberty to mess around with our repository which is one of the key advantages of the git-subtree approach compared to the git-submodule one. We do want to use the --squash parameter to keep the overall size of the git repo small and the git history uncluttered. git-ls-remote is too slow when having lots of modules installed and searching through remote git repositories does not feel right at all. So the only sensible way seems to be to clone the repository and determine the necessary info locally.
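Reconstructing the fetch URLs by scanning for .module files could look roughly like the sketch below. It rests on two assumptions that would need checking for each module: that the directory name below ''modules'' equals the drupal.org project name, and that the URL follows the pattern of the i18n remote used earlier. The function name {{{candidate_fetch_urls}}} is hypothetical.

```shell
# candidate_fetch_urls MODULES_DIR
# For every directory directly below MODULES_DIR that contains at least one
# *.module file (at any depth), print "NAME URL".
candidate_fetch_urls() {
  for dir in "$1"/*/; do
    [ -d "$dir" ] || continue
    if [ -n "$(find "$dir" -name '*.module' -print | head -n 1)" ]; then
      name=$(basename "$dir")
      # Assumption: project name == directory name.
      printf '%s git://git.drupal.org/project/%s.git\n' "$name" "$name"
    fi
  done
}
```

Called as {{{candidate_fetch_urls modules}}}, this prints one line per project, which can then be used to clone each repository temporarily.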
=== Outline of git_subtree ===
One way to solve the problem would be a Drupal module which uses the [http://api.drush.ws/api/drush/docs!drush.api.php/5.x Drush API] to clone the git repository of each Drupal module locally and then determines the version information by using the ''git-subtree-split'' line returned by [http://git-scm.com/docs/git-log git-log]. It could write this information to a ''mymodule.gitinfo'' file. A module could then retrieve this information from the ''mymodule.gitinfo'' file and a [http://api.drupal.org/api/drupal/developer!hooks!core.php/function/hook_system_info_alter/6 hook_system_info_alter] could inject it into the local Drupal ecosystem. By adding {{{*.gitinfo}}} to [http://git-scm.com/book/en/Git-Basics-Recording-Changes-to-the-Repository ''.gitignore''] we could be sure that these files do not cause any problems when syncing with upstream repositories.
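The first half of this outline, reading the ''git-subtree-split'' line and writing it to a ''.gitinfo'' file, can be sketched in shell. Nothing here is an existing tool: the function names and the key names in the file are assumptions, and the commit message in the example only imitates the shape of a squashed git-subtree commit with a made-up hash.

```shell
# subtree_split_hash
# Read a commit message on stdin and print the value of its last
# "git-subtree-split:" line.
subtree_split_hash() {
  sed -n 's/^git-subtree-split: *//p' | tail -n 1
}

# write_gitinfo MODULE_DIR PROJECT HASH
# Write a hypothetical PROJECT.gitinfo file; the key names are assumptions.
write_gitinfo() {
  printf 'project = %s\nsubtree_split = %s\n' "$2" "$3" > "$1/$2.gitinfo"
}

# Example message shaped like a squashed subtree commit (hash is made up).
message="Squashed 'modules/i18n/' content from commit 0123abc

git-subtree-dir: modules/i18n
git-subtree-split: 0123abc0123abc0123abc0123abc0123abc0123a"

hash=$(printf '%s\n' "$message" | subtree_split_hash)
echo "$hash"
```

In a real repository the message would come from git-log, for instance by searching for the commit whose message contains the matching ''git-subtree-dir'' line. The hash could then be resolved to a release tag inside the temporary clone of the module.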