
Planet Drupal

Creating topic branches (also called "feature branches") is easy in Git and
is one of the best things about Git. Sometimes we also want to get those pushed
from our local repo for various reasons:

1. To make sure it's safe on another server (for backup).

2. To let others review it.

3. To let others build upon it.

Here we'll just be dealing with #1 and #2, and not talking about how to collaborate
on a shared branch. We'll be assuming that only the original author will push to it,
and that nobody else will be doing work based on that branch.

Back in the bad old days (like 2 weeks ago) there was exactly one way to create patches and exactly one way to apply them. Now we have so many formats and so many ways to work it can be mind boggling. As a community, we're going to have to come to a consensus on which of these many options we prefer. I'm going to take a quick look at the options and how to work among them when necessary.

Ways to create patches

My preference when making changes is always to commit my changes whether or not the commits will be exposed later. That lets me keep track of what I'm doing, and make commit comments about my work. So in each of these cases we'll start with a feature branch based on origin/7.x-1.x with:

First, I may want to know what commits and changes I'm going to be including in this patch.

git log origin/7.x-1.x..HEAD shows me the commits that I've added.

git diff origin/7.x-1.x shows me the actual code changes I've done.

Now we can create a patch representing these changes in at least a couple of ways:

git diff origin/7.x-1.x >~/tmp/hypothetical.my_feature_99999_01.patch will create a patchfile which can be uploaded to Drupal.org.

git format-patch --stdout origin/7.x-1.x >~/tmp/hypothetical.my_feature_99999_01.patch will create a patchfile that includes sophisticated commit information allowing a maintainer to just apply the patch and fly - the commits you've created will automatically be applied (more later). Note that this type of patch has the email you used with git config user.email embedded in the patch, so if you post it on Drupal.org, it will be indexed by Google.
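To make the difference between the two patch flavors concrete, here is a minimal sketch you can run in a throw-away directory. The local "upstream" repository, the identity in the environment variables, and the patch file names are all assumptions made up for the demo; only the git commands themselves come from the text above.

```shell
set -e
# Hypothetical identity, used only inside this demo.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
play=$(mktemp -d)

# Local stand-in for the project repository, with a 7.x-1.x branch.
git init -q "$play/upstream"
cd "$play/upstream"
git checkout -q -b 7.x-1.x
echo "base" > README.txt
git add README.txt
git commit -q -m "Initial commit"

# Clone it so origin/7.x-1.x exists, then work on a feature branch.
git clone -q "$play/upstream" "$play/work"
cd "$play/work"
git checkout -q -b my_feature_99999 origin/7.x-1.x
echo "A" > A.txt && git add A.txt && git commit -q -m "added A"
echo "B" > B.txt && git add B.txt && git commit -q -m "Added B"

# Review what the patch will contain.
git log --oneline origin/7.x-1.x..HEAD
git diff --stat origin/7.x-1.x

# Create both kinds of patch.
git diff origin/7.x-1.x > "$play/plain.patch"
git format-patch --stdout origin/7.x-1.x > "$play/with_commits.patch"

# Count author headers: only the format-patch file has them.
grep -c '^From: ' "$play/with_commits.patch"
grep -c '^From: ' "$play/plain.patch" || true
```

The "From:" lines are where the email address mentioned above ends up, so this is also a quick way to check what you are about to publish.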

What do we do with the feature branch? Whatever. It's sitting there and can be used any number of ways in the future, or you can delete it. I tend to clean these up periodically, but not right away. git branch -D my_feature_99999 would delete this branch.

Ways to apply patches

I usually create a feature branch to apply and work with patches. This lets me make edits after the fact and have complete freedom. I don't have to keep track of what I've committed until I'm done. So in this case let's assume that I'm the maintainer and have received the patch created above.

Before we start, please run

git config --global push.default tracking

which will allow a tracking branch to automatically push to its remote. (Newer versions of Git call this value upstream; tracking is kept as a deprecated synonym.)

git checkout -b received_patch_99999/03 origin/7.x-1.x

This creates a named local branch which can push to the 7.x-1.x branch on the server.

I can then continue to work on this, but now I can differentiate my own edits from the original patch that was provided. I might make additional commits. As a maintainer, I can then rebase/squash and commit a nicely-formatted single commit at the end. (See below.)

git apply /path/to/patchfile.patch does exactly the same thing as patch -p1 so you can use the exact same workflow.

git am /path/to/patchfile.patch only works with patches created using git format-patch that contain commit information. But when you use it, it actually makes the exact commits (with the associated commit messages) that the original patcher made. You can then continue to make changes yourself, make additional commits, and then rebase/squash and commit a nicely formatted version of the patch (see below).
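Here is a rough sketch of the git am round trip, run entirely in a temp directory. The two identities ("Patch Author" and "Maintainer"), the local stand-in repositories, and the patch file name are assumptions for illustration. It shows that the original commits and their author arrive intact on the maintainer's branch.

```shell
set -e
play=$(mktemp -d)

# Contributor identity (hypothetical).
export GIT_AUTHOR_NAME="Patch Author" GIT_AUTHOR_EMAIL="author@example.com"
export GIT_COMMITTER_NAME="Patch Author" GIT_COMMITTER_EMAIL="author@example.com"

# Local stand-in for the project repository.
git init -q "$play/upstream"
cd "$play/upstream"
git checkout -q -b 7.x-1.x
echo "base" > README.txt
git add README.txt
git commit -q -m "Initial commit"

# Contributor: clone, make two commits, export them with format-patch.
git clone -q "$play/upstream" "$play/contributor"
cd "$play/contributor"
echo "A" > A.txt && git add A.txt && git commit -q -m "added A"
echo "B" > B.txt && git add B.txt && git commit -q -m "Added B"
git format-patch --stdout origin/7.x-1.x > "$play/feature.patch"

# Maintainer identity (hypothetical).
export GIT_AUTHOR_NAME="Maintainer" GIT_AUTHOR_EMAIL="maintainer@example.com"
export GIT_COMMITTER_NAME="Maintainer" GIT_COMMITTER_EMAIL="maintainer@example.com"

# Maintainer: fresh clone, branch for the received patch, then apply it.
git clone -q "$play/upstream" "$play/maintainer"
cd "$play/maintainer"
git checkout -q -b received_patch_99999/03 origin/7.x-1.x
git am -q "$play/feature.patch"

# Author comes from the patch; committer is whoever ran git am.
git log --format='%an / %cn: %s' origin/7.x-1.x..HEAD
```

With git apply instead of git am, the same file changes would land in the working tree uncommitted, and the maintainer would write the commit themselves.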

I could now push the commits with

git push # If you have push.default set
OR
git push origin HEAD:7.x-1.x

Squashing, rebasing, and editing the commit message

Let's say that this patch works, we've committed whatever edits we want, and it's time to go ahead and fix it up and push it. Now we can rebase/squash it, fix up the commit message, and push the result.

Some things to know here: Although you may have heard that "rebasing is bad" it's not true. Rebasing commits that have already been made public and that might affect someone else is in fact bad. But here we have not done that. We are just using Git's greatest features to prepare a clean, single commit. It will not break anything, it will not rewrite anybody else's history.

Let's squash all of our work into a single commit that has a good message: "Issue #99999 by user999: Added A.txt and B.txt"

# Make sure our local branch is up-to-date
git pull --rebase
# Now rebase against the branch we are working against.
git rebase -i origin/7.x-1.x

We'll have an editor session pop up with a rebase script specified, something like this:

pick a5c1399 added A
pick d3f45f7 Added B

# Rebase c98c91a..d3f45f7 onto c98c91a
#
# Commands:
#  p, pick = use commit
#  r, reword = use commit, but edit the commit message
#  e, edit = use commit, but stop for amending
#  s, squash = use commit, but meld into previous commit
#  f, fixup = like "squash", but discard this commit's log message
#
# If you remove a line here THAT COMMIT WILL BE LOST.
# However, if you remove everything, the rebase will be aborted.
#

To squash, all we have to do here is change the second and following commands of this little script from "pick" to "s" or "squash". Leave the top one as 'pick' and everything will be squashed into it. I'll change mine like this:

pick a5c1399 added A
s d3f45f7 Added B

and save the file and exit.

Then you get the chance to edit the commit message for the entire squashed commit. An editor session pops up with:

# This is a combination of 2 commits.
# The first commit's message is:

added A

# This is the 2nd commit message:

Added B

# Please enter the commit message for your changes. Lines starting
# with '#' will be ignored, and an empty message aborts the commit.
# Not currently on any branch.
# Changes to be committed:
#   (use "git reset HEAD <file>..." to unstage)
#
#	new file:   A.txt
#	new file:   B.txt

At this point, everything in the commit message that starts with '#' will be a comment. You can just edit this file to have the commit that you want. I'll delete all the lines and put

Issue #99999 by user999: Added A.txt and B.txt

Now I can just

git push # If push.default is set to tracking
OR
git push origin HEAD:7.x-1.x
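If you want to rehearse the squash without an editor popping up, the interactive steps can be scripted. This is only a sketch: it assumes GNU sed, uses a local stand-in repository and a made-up identity, and sets the final message with a follow-up git commit --amend instead of typing it into the editor. GIT_SEQUENCE_EDITOR and GIT_EDITOR are the standard environment variables Git consults for the rebase script and the commit message.

```shell
set -e
# Hypothetical identity, used only inside this demo.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
play=$(mktemp -d)

# Local stand-in for the project repository.
git init -q "$play/upstream"
cd "$play/upstream"
git checkout -q -b 7.x-1.x
echo "base" > README.txt
git add README.txt
git commit -q -m "Initial commit"

# Working clone with two work-in-progress commits.
git clone -q "$play/upstream" "$play/work"
cd "$play/work"
git checkout -q -b received_patch_99999/03 origin/7.x-1.x
echo "A" > A.txt && git add A.txt && git commit -q -m "added A"
echo "B" > B.txt && git add B.txt && git commit -q -m "Added B"

# Leave the first "pick" alone and turn every later one into "squash".
# GIT_EDITOR=true accepts the combined commit message unchanged.
GIT_SEQUENCE_EDITOR="sed -i -e '2,\$s/^pick/squash/'" GIT_EDITOR=true \
  git rebase -q -i origin/7.x-1.x

# Replace the combined message with the final one.
git commit -q --amend -m "Issue #99999 by user999: Added A.txt and B.txt"

git log --oneline origin/7.x-1.x..HEAD
```

Done by hand, you would make the same change to the rebase script in your editor and type the message when prompted; the result should be the same single commit.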

Note: A maintainer who never gets lost in what they're doing and always finishes things sequentially doesn't absolutely need to create a new branch for all this. Instead of creating a branch with

git checkout -b received_patch_99999/03 origin/7.x-1.x

we could have done all this on a 7.x-1.x tracking branch. The only complexity is that we have to figure out what commit to rebase to, which we can figure out with git log.

Rebasing to prepare a nicer single-commit patch

OK, so let's assume that the maintainers of this project ask for better-prepared patches because they don't care to rebase and never edit a commit, but rather just apply them. The patch contributor can do the exact same squashing process before creating the patch.

There were two commits on my feature branch, and they have my typical, lousy, work-in-progress commit messages that I wouldn't want any maintainer to have to deal with. So I'll rebase/squash them into a single commit before creating my patch.

git rebase -i origin/7.x-1.x

Then I get the same exact options that the maintainer had in the section above, and follow the exact same procedure to consolidate them, and give a good commit message. Then creating the patch again with git format-patch, as before, will make a very nice single-commit patch with a good message that the maintainers can use out of the box if they'd like to.

Summary

We have lots of options in creating and applying patches, but the git format-patch + git am + git rebase -i toolset is remarkably powerful, and we may be able to build a community consensus around this toolset.

Rebasing sounds hard and odd, and squashing really awful, but they're essentially the same thing as patching. In reality, they bring the patch workflow into the 21st gitury. One patch == one commit. Sure you can do lots of obscure things with them, but here we're just combining and cleaning up commits.

And yes, you've been warned not to rewrite publicly exposed history using rebase, but we're not doing that. We're preparing something to be made public so it's completely harmless.

Even if you've just arrived into the Gitworld, you've already noticed that things are really fast and flexible. Some people claim that the most important thing about the distributed nature of Git is that you can work on an airplane, but I claim that's totally bogus. The coolest thing about the distributed nature of Git is that you can fool around all you want, and try a million things, and even mess up your repo as much as you want... and still recover. Here's how, with a screencast at the bottom.

With CVS or Subversion or any number of other VCS's, when you commit or branch, you have the potential of causing yourself and others pain forever. And there's no way around it. You just have to do your best and hope it comes out OK. But with Git, you can try anything and rehearse it in advance, in a total no-consequence environment. Here are some suggestions of where to start.

Setting up a git-play area

First, let's pretend that we're a committer of Examples project. To do this, we don't have to have any privileges. Let's just make a copy of the Examples repo that we do have privileges on. We'll do this by creating a copy of the repository on Drupal.org in a throw-away directory:

Now we have a repo where we can do anything we want, experiment with anything, and it can never get back to the server, even though we have full push access. There is no way you can do anything wrong using this setup - you're able to commit, but you're committing to a local clone of Examples project.

Note that you could also just cp -r a local repo to /tmp or some similar junk play area. You just have to be careful in that case because if you had commit access in the original repo, you do also in the copy. The reason I did the mirror clone above was to give us commit access to a bogus local repository with no consequences.
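One possible shape for such a play area, sketched with a local stand-in so it runs without network access: the "project" repository and the fake_server.git and sandbox names are assumptions, and in real use the source of the mirror clone would be the project's clone URL.

```shell
set -e
# Hypothetical identity, used only inside this demo.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
play=$(mktemp -d)

# Stand-in for the real project repository.
git init -q "$play/project"
cd "$play/project"
git checkout -q -b 7.x-1.x
echo "example" > README.txt
git add README.txt
git commit -q -m "Initial commit"

# A bare mirror plays the role of "the server".
git clone -q --mirror "$play/project" "$play/fake_server.git"

# Clone the mirror; origin now points at the throw-away copy.
git clone -q "$play/fake_server.git" "$play/sandbox"
cd "$play/sandbox"
echo "experiment" >> README.txt
git commit -q -am "Experimental commit"

# We have full push access, but only to the local mirror.
git push -q origin HEAD:7.x-1.x
git remote -v
```

The push lands in fake_server.git and nowhere else, which is the whole point of the setup.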

Hard Reset

git reset --hard <commit>
Sometimes you just want to give up your work (or redo it, or just replay it from the authoritative repository). Then git reset --hard is what you want. You can throw away one or several commits, blow away changes you have staged, etc. If the commits in question are still on another branch (or on your branch in the remote repository), then you're not even doing anything destructive, but just fiddling with your local.

So first, let's experiment with what happens when we use the destructive git reset --hard, which sets the repository back to the commit we name. Let's set it back 3 commits:

git reset --hard HEAD~3
git log

We have just destroyed 3 commits! But did we do any damage? Nope.

git pull

pulls it right back into our local repository. All we actually did was to remove the memory of those 3 commits from our local branch.

Or let's say that those commits were not in the remote repo. We can still do all this with no risk:

git checkout master
git checkout -b play_branch
git reset --hard HEAD~3

# Recover the commits from our original branch
git merge master
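The same experiment as a self-contained sketch, using a scratch repository with five commits. The file names, commit messages, and identity are invented for the demo.

```shell
set -e
# Hypothetical identity, used only inside this demo.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
play=$(mktemp -d)

git init -q "$play/repo"
cd "$play/repo"
# Make sure the first branch is called master, whatever the local default is.
git symbolic-ref HEAD refs/heads/master
for i in 1 2 3 4 5; do
  echo "$i" > "file$i.txt"
  git add "file$i.txt"
  git commit -q -m "Commit $i"
done

# Play on a separate branch so master keeps all five commits.
git checkout -q -b play_branch
git reset -q --hard HEAD~3
git log --oneline          # only "Commit 1" and "Commit 2" remain here

# Recover the commits from the original branch (a fast-forward).
git merge -q master
git log --oneline          # all five are back
```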

Resetting a commit so we can fix it up a bit

Let's say that all these commits are my own and they haven't been released into the wild yet. I'm going to rework the top commit just a bit because I didn't really like it. I essentially throw away the commit, but keep the files in my work tree.

git reset HEAD^

will undo the top commit, but leave the results of it in the working tree.
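A small sketch of what that looks like in practice. The notes.txt file, its contents, and the identity are assumptions for the demo.

```shell
set -e
# Hypothetical identity, used only inside this demo.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
play=$(mktemp -d)

git init -q "$play/repo"
cd "$play/repo"
echo "first draft" > notes.txt
git add notes.txt
git commit -q -m "First draft"
echo "second draft" > notes.txt
git commit -q -am "Second draft, not quite right"

# Undo the top commit but keep its changes in the working tree.
git reset -q HEAD^

git log --oneline      # only "First draft" is left
git status --short     # notes.txt shows up as modified
cat notes.txt          # still contains the second draft
```

From here you can edit the file further and commit again when you're happy with it.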

Amending a commit

Sometimes I have either messed up the commit message, forgotten to stage a file or a file deletion, or something of the like. It's just good to have another chance at the commit.

I can stage some additional stuff I want to commit just by doing a git add and then

git commit --amend

and the newly staged stuff gets added to the top commit, and I have the chance to change the commit message as well.

Yes, this is rewriting history, and it must be done only on commits that have not already been released into the wild.
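For example, here is a sketch of fixing a commit that forgot a file. The file names, messages, and identity are made up for the demo.

```shell
set -e
# Hypothetical identity, used only inside this demo.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
play=$(mktemp -d)

git init -q "$play/repo"
cd "$play/repo"
echo "A" > A.txt
echo "B" > B.txt

# Oops: only A.txt was staged, and the message has a typo.
git add A.txt
git commit -q -m "Addd A and B"

# Stage the forgotten file and fold it into the top commit,
# fixing the message at the same time.
git add B.txt
git commit -q --amend -m "Added A.txt and B.txt"

git log --oneline                  # still a single commit
git ls-tree --name-only HEAD       # lists both files
```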

Combining (squashing) 10 commits into one

Since we're playing let's combine 10 commits into one. This uses the rather exotic "rebase" command to do the exotic "squash" operation. But even though these sound forbidding, it's just a powerful way to combine many commits into one:

git rebase -i HEAD~10

We get the chance to turn the last 10 commits into any number of commits, or squash them into 1. To turn it all into one commit, change lines 2-10 from "pick" into an "s" (for "squash").

Cherry-picking

Let's make a new branch, go back in history by 10 commits, and then cherry pick some of the original commits that were on this branch back onto it. It's easy.

git log # Take a look at the commits
git checkout -b my_fiddle_branch # Make a play branch
git reset --hard HEAD~10 # Kill off the top 10 commits
git cherry-pick master^ # Take the next-to-top commit on master and apply it on this branch
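Here is the same sequence as a runnable sketch against a scratch repository with twelve commits. Each commit adds its own file so the cherry-pick applies cleanly; the file names and identity are assumptions for the demo.

```shell
set -e
# Hypothetical identity, used only inside this demo.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
play=$(mktemp -d)

git init -q "$play/repo"
cd "$play/repo"
# Make sure the first branch is called master, whatever the local default is.
git symbolic-ref HEAD refs/heads/master
for i in 1 2 3 4 5 6 7 8 9 10 11 12; do
  echo "$i" > "file$i.txt"
  git add "file$i.txt"
  git commit -q -m "Commit $i"
done

git checkout -q -b my_fiddle_branch   # Make a play branch
git reset -q --hard HEAD~10           # Back to "Commit 2"
git cherry-pick master^               # Re-apply "Commit 11" here

git log --oneline                     # Commit 1, Commit 2, Commit 11
```

The cherry-picked commit gets a new hash on the play branch, but master is untouched and still has all twelve.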

You probably know that all Drupal core patches and commits get tested using the simpletest testing system, but you may not know how. Essentially, every time a patch is posted and placed in the "Needs Review" status on Drupal.org, and every time a commit is made, the information is sent off to qa.drupal.org, which then farms it out to testing working machines we call testbots.

Well, the testbots are quite compute-intensive, and although we tried for years we never really got enough decent, manageable machines from community contributions. Partly this is because the machines are really compute-intensive, and partly it's because they can be pretty tweaky and not everybody wants to learn how to manage them and deal with their occasional fits. Many community members had donated time on their machines, but in recent months we'd been using Amazon EC2 instances for the bulk of our testbots, and the bill to the Drupal Association was more than we'd like to pay, more than $500 many months.

Enter the OSUOSL Supercell. You may not know the OSUOSL (Oregon State University Open Source Lab) but they're the folks that handle all the hardware side of the Drupal.org infrastructure. As yet another of their great services to the open source communities they serve, the Supercell is a cluster of physical machines turned into a virtualization environment for projects like ours. So instead of either scrounging around for spare machines from the community or paying Amazon for super-high-powered EC2 instances, we now have 3 full-time testbots (plus one for testing PIFR code and deployment) running in the Supercell. (You can see their status and what they're working on any time at http://qa.drupal.org.) It's been reliable and it's so nice to have these very appropriate testing machines there.

As they roll out the Supercell, there will probably be other opportunities to use these, for both Drupal and other communities. They're behind a firewall, which requires proxying to get to them, so they're not appropriate for any service that needs to be directly internet-reachable. However, they're absolutely wonderful for services like the testbots (which call home for their tests), or for build machines, etc.

Thanks so much to Jeff Sheltren, Lance Albertson, and the rest of the crew at OSUOSL for making this possible. It's a huge win for the Drupal community.

Situation: An existing site has had its theme hacked in place, or just has a stock theme deployed and I need a subtheme. But that means that I'm going to have to change the name of the theme, which can mean having to go back and do all the theme and block settings again. I really don't like manual work, so this time I tried to write down what has to be done to create a new theme with the old theme's settings and block settings.

I did this on a Drupal 6 site, but I believe the basics are the same for any Drupal version. Edit (26 May 2011): Drupal 7 block table changed its name and one field, so a recipe for it is at the bottom of the article.

Create your new theme or subtheme. Our theme's machine name will be 'newtheme'. The theme we're replacing (which has the same regions and behavior) will be 'oldtheme'.

Drupal 7 and Drupal 8 recently added a default (and sensible) .gitignore file to the standard repository, and while this solves some problems, it has also caused some confusion. (issue)

Here's a link to the actual new .gitignore. Essentially, it excludes the sites/default/files and sites/default/settings.php files from git source control.

What problems does having a default .gitignore solve?

The biggest problem it solves is that patches submitted to drupal.org were accidentally including things that they never should have included (like people's settings.php files). We just don't need that information, thank you very much.

It also sets a "best practice" for not source-controlling your files directory. Since the files directory is website-generated or user-generated content, it doesn't make any sense to put that in your git repository; most people have long come to a consensus on this, although not all agree.

What problems does having a default .gitignore create?

Mostly the problems created have to do with developer workflow.

A dev site may contain lots of deliberately uncontrolled modules or themes or libraries.

A site may want a completely different .gitignore due to various policy differences from the default.

How do I solve these problems?

Lots of ways:

1. If you don't want sites/all to be controlled at all (you want to ignore all modules and themes and libraries), add a file at sites/all/.gitignore with the contents a single line containing nothing but *

2. Simply change the .gitignore and commit the change. You won't be pushing it up to 7.x right?

3. If you track core code using downloads (and not git) you can simply change the .gitignore and check it into your own VCS.

4. Add extra things into .git/info/exclude. This basically works like .gitignore (it has good examples in it) and is not under source control at all.

5. Add an additional master gitignore capability with git config core.excludesfile .gitignore.custom and then put additional exclusions in the .gitignore.custom file.

Note that only 1 and 2 are completely source-controlled. In other words, #3, 4, and 5 would have a slight bit of configuration on a deployment site to work correctly, but they work perfectly for a random dev site.
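A quick way to check that an exclusion does what you expect is git check-ignore. The sketch below tries the sites/all/.gitignore and .git/info/exclude approaches in a scratch repository; the module path and the notes.local.txt file are invented for the demo.

```shell
set -e
play=$(mktemp -d)

git init -q "$play/site"
cd "$play/site"
mkdir -p sites/all/modules/devel
echo "<?php" > sites/all/modules/devel/devel.module
echo "<?php" > index.php
echo "scratch" > notes.local.txt

# Ignore everything below sites/all with a one-line .gitignore.
# (A bare * also matches the .gitignore file itself, so committing
# that file would need "git add -f".)
echo '*' > sites/all/.gitignore

# Local-only exclusion that never enters source control.
mkdir -p .git/info
echo 'notes.local.txt' >> .git/info/exclude

# Show which rule ignores each path.
git check-ignore -v sites/all/modules/devel/devel.module notes.local.txt

# Only index.php should be reported as untracked.
git status --short
```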

How do I exclude things from Git that should always be excluded (Eclipse .project files, Netbeans nbproject directories, .orig files, etc.)?

You can have a global kill file. I use ~/.gitignore for things that I don't ever want to see show up as untracked files. You can activate that with git config --global core.excludesfile ~/.gitignore
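Here is a sketch of the global kill file in action. To avoid touching your real configuration, it points HOME at a temp directory first; that sandboxing, the patterns, and the file names are assumptions for the demo.

```shell
set -e
play=$(mktemp -d)

# Sandbox: use a throw-away HOME so the real ~/.gitconfig is untouched.
export HOME="$play/home"
unset XDG_CONFIG_HOME
mkdir -p "$HOME"

# Things we never want to see as untracked files.
printf '%s\n' '.project' 'nbproject/' '*.orig' > "$HOME/.gitignore"
git config --global core.excludesfile "$HOME/.gitignore"

git init -q "$play/site"
cd "$play/site"
touch .project settings.php.orig index.php

# Only index.php should be reported as untracked.
git status --short
```

Because the setting is global, it applies to every repository for that user without adding anything to the project's own .gitignore.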

Executive summary: Set up to automatically use a git pull --rebase

Please just do this if you do nothing else:

git config --global branch.autosetuprebase always

About rebasing and pulling

I've written a couple of articles on rebasing and why to do it, but it does seem that the approaches can be too complex for some workflows. So I'd like to propose a simpler rebase approach and reiterate why rebasing is important in a number of situations.

Here's a simple way to avoid evil merge commits but not do the fancier topic branch approaches:

Go ahead and work on the branch you commit on (say 7.x-1.x)

Make sure that when you pull you do it with git pull --rebase

Push when you need to.

That's it.

Reasons to rebase to keep your history clean

There are two major reasons not to go with the default "git pull" merge technique.

1. Unintentional merge commits are evil: As described in the Git disasters article, doing the default git pull with a merge can result in merge commits that hide information, present cryptic merge commits like "Merge branch '7.x' of /home/rfay/workspace/d7git into 7.x" which explain nothing at all, and may contain toxic changes that you didn't intend. There is nothing wrong with merges or merge commits if you intend to present a merge. But having garbage merges every time you do a pull is really a mess, as described in the article above.

2. Intentional merge commits are OK if you really want that branch history there: If a significant, long-term piece of work has gone on and should be shown as a branch in the future history of the project, then go ahead and merge it before you commit, using git merge --no-ff to show a specific intentional merge. However, if the work you're committing is really a single piece of work (the equivalent of a Drupal.org patch), then why should it show up in the long-term history of the project as a branch and a merge?

I'll write again about merging and rebasing workflows, but for now we're just going to deal with #1: The case where you share a branch with others and you want to pull in their work without generating unintentional merge commits.

How to use git pull --rebase

The first and most important thing if you're a committer working on a branch shared with other committers is to never use the default git pull. Use the rebase version, git pull --rebase. That takes your commits that are not on the remote version of your branch and reworks/rebases them so that they're ahead of (on top of) the new commits you pull in with your pull.

Automatically using git pull --rebase

It's very easy to set up so that you don't ever accidentally use the merge-based pull. To configure a single branch this way:

git config branch.7.x.rebase true

To set it up so every branch you ever create on any repository is set to pull with rebase:

git config --global branch.autosetuprebase always

That should do the trick. Using this technique, no matter whether you use techniques to keep your history linear or not, whether you use topic branches or not, whether you squash or not, you won't end up with git disasters.
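To see what git pull --rebase buys you, here is a sketch with two committers sharing one branch. The bare shared.git repository, the alice and bob clones, and the single identity used for both are stand-ins invented for the demo.

```shell
set -e
# Hypothetical identity, used for both clones in this demo.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
play=$(mktemp -d)

# Shared repository that both committers push to.
git init -q --bare "$play/shared.git"

# First committer seeds the 7.x-1.x branch.
git clone -q "$play/shared.git" "$play/alice" 2>/dev/null
cd "$play/alice"
git checkout -q -b 7.x-1.x
echo "base" > README.txt
git add README.txt
git commit -q -m "Initial commit"
git push -q -u origin 7.x-1.x

# Second committer pushes new work to the same branch.
git clone -q -b 7.x-1.x "$play/shared.git" "$play/bob"
cd "$play/bob"
echo "bob" > bob.txt
git add bob.txt
git commit -q -m "Bob's change"
git push -q origin 7.x-1.x

# Meanwhile the first committer has made a local commit.
cd "$play/alice"
echo "alice" > alice.txt
git add alice.txt
git commit -q -m "Alice's change"

# Pull with rebase: the local commit is replayed on top of Bob's.
git pull -q --rebase

git log --oneline --graph           # a straight line, no merge commit
git rev-list --merges --count HEAD  # number of merge commits in history
```

With git config branch.7.x-1.x.rebase true set in the first clone, a plain git pull there would behave the same way.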

Bottom line: No matter what you do, please use git pull --rebase. To do that automatically forever without thinking, run git config --global branch.autosetuprebase always

Drupal has a long tradition of insisting that everybody's contribution is equal, that every piece of content is equal. We have to stop that.

It's nearly impossible to find the critical content on d.o and has been for a very long time. How do we fix this? Enlist the community.

Here are some broad brushstrokes:

Differentiate content.

Use flag module to mark content as useful, mark it as spam, mark comments as issue summaries, mark modules and themes as "I use this" and lots of other things.

Curate more content. Can't we have articles on drupal.org promoted to the planet?

Allow comments on modules and themes.

Differentiate users. There are plenty of ways to do this without ranking users against each other or giving numbers to users. A module maintainer badge? A 5-year member badge? Infrastructure team badge?

Find a leader (or pair of leaders) whose job is to promote the quality of the content of drupal.org. Right now, we tend to make decisions on things only based on infrastructure requirements, but we need to be thinking about content and usability far more than infra and performance. (I'm not saying that we shouldn't pay attention to performance, just that having a functional, usable site is far more important than the underlying infrastructure.)

Triggers and actions are a core feature of Drupal and have been for a really long time. But lots of people don't know that they can actually be useful for some very basic and important tasks. You might say this is a beginner topic, but since I've only mastered it for my own use in the last year or so, I figured it was worth writing about :-)

I use them mostly for notifications when comments have been posted to my blog, where comments are 100% moderated. (I use the wonderful Comment Notify to notify commenters when more comments are posted. If spam got posted they'd get email spam.)

So among the many things you can do simply with triggers and actions is configuring a notification to the site owner with details about the posted comment, and that's a great example of use of this simple feature. Here's how to do it.

First, create an action that does what you want.

In Drupal 7 it's easy to use tokens to customize the details of the outgoing email. At Administration -> Configuration -> System -> Actions (admin/config/system/actions) we can select "Send e-mail..." from the dropdown.

Change the label to something like "Comment notification email"

Set the recipient and subject (I use "Comment [comment:title] by [comment:author]" for the subject)

Now assign the action to a trigger

Oddly, the triggers configuration is in a completely different place in the menu hierarchy, and always has been. You can find it at admin/structure/trigger. We'll select the "Comment" category of triggers, and under "After saving a new comment" select the "Comment notification email" action we just created and save.

That's all there is to it. This is Drupal 7; the tokens are different than in earlier versions of Drupal, but the ideas are the same. If you want more power and the ability to apply conditions to your behavior, check out the excellent Rules module.

Update: D6 Recipe

For Drupal 6 the idea is the same. You need to enable the trigger and token_actions modules (token_actions comes with the contrib Token module).

Create an advanced action that is a "tokenized e-mail". The title can be something like "Comment: [comment-node-title]" and the mail can be your email.