A Git Submodule Management Question

I have a jeremyfelt.com repository, similar to this on GitHub, that includes WordPress trunk as a submodule. I manage plugin and theme files outside of core WordPress files and from time to time do a git submodule foreach git pull, which grabs the latest changes from the WordPress repo. This all works beautifully (I think), but whenever I run git status inside the master jeremyfelt.com repository, I get this:

I’ve gone several months just ignoring this and not ever staging anything. Today I decided to figure it out. From what I can tell there are at least two possibilities.

First – I stage and commit in the parent repository as a way of keeping track of what revision the submodule is on when changes are made to the parent. See – Stack Overflow discussion.

Or – I ignore the changes in the submodule (similar to what I’ve been doing) by adding ignore=dirty to my submodule definition in .gitmodules. See – How to ignore changes in submodules.
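For reference, here is a minimal sketch of what the second option would put in .gitmodules. The URL is a placeholder and the throwaway directory is only there so the commands can be run safely. One caveat worth checking against the gitmodules documentation: as far as I understand it, ignore=dirty only hides uncommitted or untracked changes inside the submodule's working tree. A submodule whose HEAD has moved to new commits (which is what a pull produces) is still reported unless the value is all.

```shell
#!/bin/sh
# Sketch only: write a submodule entry with an ignore setting into .gitmodules.
set -eu
demo=$(mktemp -d)   # throwaway directory, so no real repository is touched
cd "$demo"

# The URL below is a placeholder, not the real remote.
git config -f .gitmodules submodule.wordpress.path wordpress
git config -f .gitmodules submodule.wordpress.url https://example.com/WordPress.git
git config -f .gitmodules submodule.wordpress.ignore dirty

# Show the resulting [submodule "wordpress"] section.
cat .gitmodules
```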

I’m leaning towards the second as my goals in the parent repository are not necessarily to track a specific version of the WordPress submodule, just to use it as a way to have trunk easily updated.

This indicates that there are two changes staged for commit. If you run “git diff --cached”, you can see those changes:

The first change is to .gitmodules. You have set up the repo path and the local path for the submodule. This information needs to be added to the repo so when others clone it, they have the submodule info too.

The second change is to a “file” called “wordpress” at the root of the directory (Note, this is not really a file. It is more of a “pointer”). This thing is crucial in tracking the hash for the submodule. In my screenshot, it shows that you are adding hash “2a4474f4de97670ef10c1b74153c37b4a0123438” to the “file”. Committing this change permanently records, in that commit of the parent repo, that the submodule sits at the hash above.

.git/config is also updated, but that change is not pushed to other remotes, so it is not important for these purposes.

If you push that change, and someone else clones the repo and runs “git submodule update --init”, it will clone the submodule at that exact commit.
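To make the “pointer” idea concrete, here is a rough sketch that builds a throwaway parent repo with a “wordpress” submodule, prints the pointer entry, and then clones the parent the way a teammate would. The “upstream” repo is a hypothetical stand-in for WordPress, and the protocol.file.allow setting is only needed because the demo clones from a local path.

```shell
#!/bin/sh
# Sketch only: a submodule is stored in the parent as a pointer to one commit.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
cd "$work"

# Hypothetical stand-in for the WordPress repository.
git init -q upstream
echo "hello" > upstream/readme.txt
git -C upstream add readme.txt
git -C upstream commit -q -m "Initial commit"

# Parent repo with the submodule at wordpress/.
git init -q site
cd site
git -c protocol.file.allow=always submodule --quiet add "$work/upstream" wordpress
git commit -q -m "Add WordPress submodule"

# The entry has mode 160000 and type "commit": a pointer, not a directory of files.
git ls-tree HEAD wordpress
pinned=$(git ls-tree HEAD wordpress | awk '{print $3}')

# A fresh clone only has an empty wordpress/ until the submodule is initialised.
cd "$work"
git clone -q site clone
git -C clone -c protocol.file.allow=always submodule --quiet update --init
cloned=$(git -C clone/wordpress rev-parse HEAD)
echo "parent pins $pinned, clone checked out $cloned"
```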

Now, imagine that you want to change the commit that you pointed to previously. For instance, in my example, I checked out “WordPress trunk”, but I would actually like to run my site on the latest stable WordPress. To do this, I would run something like this:

Now, if I run “git status” in the main repo, I will see:

What has changed? “git diff” shows us:

The hash stored in “wordpress” has changed. I would then run “git commit -am 'Use WordPress 3.5.1'” and push the code.
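Here is a sketch of that sequence end to end in a throwaway repo. I am assuming the stable release is available as a tag named 3.5.1 (taken from the commit message above), and “upstream” is again a hypothetical stand-in for the WordPress repo with one tagged release and one newer trunk commit.

```shell
#!/bin/sh
# Sketch only: re-point a submodule at a tagged release and record it in the parent.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
cd "$work"

# Stand-in upstream: a tagged release followed by newer trunk work.
git init -q upstream
echo "release" > upstream/version.txt
git -C upstream add version.txt
git -C upstream commit -q -m "Release"
git -C upstream tag 3.5.1          # assumed tag name
echo "trunk" > upstream/version.txt
git -C upstream commit -q -am "Trunk work"

# Parent repo starts out pinned to trunk.
git init -q site
cd site
git -c protocol.file.allow=always submodule --quiet add "$work/upstream" wordpress
git commit -q -m "Add WordPress submodule at trunk"

# Move the submodule to the stable tag.
cd wordpress
git checkout -q 3.5.1
cd ..

# The parent now reports wordpress as modified; the diff is a one-line hash change.
git status --short
git --no-pager diff

# Record the new pointer in the parent.
git commit -q -am "Use WordPress 3.5.1"
pinned=$(git ls-tree HEAD wordpress | awk '{print $3}')
tagged=$(git -C wordpress rev-parse '3.5.1^{commit}')
```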

If a team member ran “git pull” on the repo, one of the changes she would pull down would be the status of the submodule. The team member would run “git submodule update”, which updates her WordPress submodule to the new hash.

From your post, I think you might be confusing what “git submodule update” does. It does not perform “git pull” on the submodule; rather, it updates the submodule to the hash stored in the “wordpress” file.
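A small experiment shows the difference. In this sketch (throwaway repos, with a hypothetical “upstream” standing in for WordPress), the submodule is pulled forward without committing in the parent, and then “git submodule update” puts it back on the hash the parent has recorded.

```shell
#!/bin/sh
# Sketch only: "git submodule update" checks out the recorded hash; it does not pull.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
cd "$work"

git init -q upstream
echo "one" > upstream/readme.txt
git -C upstream add readme.txt
git -C upstream commit -q -m "one"

git init -q site
cd site
git -c protocol.file.allow=always submodule --quiet add "$work/upstream" wordpress
git commit -q -m "Add WordPress submodule"
pinned=$(git ls-tree HEAD wordpress | awk '{print $3}')

# Upstream gains a commit, and the submodule pulls it without a parent commit.
echo "two" >> "$work/upstream/readme.txt"
git -C "$work/upstream" commit -q -am "two"
git -C wordpress pull -q
pulled=$(git -C wordpress rev-parse HEAD)
git status --short                 # wordpress shows up as modified

# Update moves the submodule back to the hash stored in the parent.
git -c protocol.file.allow=always submodule --quiet update
after=$(git -C wordpress rev-parse HEAD)
git status --short                 # nothing left to report
```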

I think your workflow to update WordPress should be (you can do this with more brevity, but I’m being verbose for clarity):

cd wordpress
git status (shows the current status of the git submodule)
git pull
cd ..
git status (to see that you are updated to a new hash)
git commit -am "Upgrade WordPress"
git status (nothing dirty should show)
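For the shorter version, something like the sketch below should behave the same way: pull in every submodule, then commit the moved pointer in the parent. It assumes each submodule is sitting on a branch that tracks its remote, which is the case right after “git submodule add”. After a “git submodule update” the submodule is usually on a detached HEAD, and a plain “git pull” there would complain until you check out a branch. The repos are throwaway stand-ins.

```shell
#!/bin/sh
# Sketch only: pull all submodules, then commit the new hashes in the parent.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
cd "$work"

# Hypothetical stand-in for the WordPress repository.
git init -q upstream
echo "one" > upstream/readme.txt
git -C upstream add readme.txt
git -C upstream commit -q -m "one"

git init -q site
cd site
git -c protocol.file.allow=always submodule --quiet add "$work/upstream" wordpress
git commit -q -m "Add WordPress submodule"

# Upstream moves ahead.
echo "two" >> "$work/upstream/readme.txt"
git -C "$work/upstream" commit -q -am "two"

# The short workflow: pull in each submodule, then commit the pointer change.
# (protocol.file.allow is only here because the demo remote is a local path.)
git -c protocol.file.allow=always submodule --quiet foreach 'git pull -q'
git commit -q -am "Upgrade WordPress"
git status --short                 # nothing dirty should show

pinned=$(git ls-tree HEAD wordpress | awk '{print $3}')
latest=$(git -C "$work/upstream" rev-parse HEAD)
```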

Honestly, I’m not really sure what “git submodule foreach git update” is supposed to do. “git update” is not a legit git command. Perhaps this was a mistake? “git submodule foreach git pull” seems to update the current version of the submodule, but I do not think you are ever committing the hash change back to the parent repo, which is giving you your problem.

“git submodule foreach git pull” is the real command. Updated the post to reflect that. This has been the method that I use to pull down the latest trunk from the submodule. Is it smarter to go into the submodule directory and do a `git pull`?

I think I was definitely confused on what untracked changes meant in this scenario. I was certain that adding the directory would stage all of the files and not just that pointer. This makes a ton more sense now.

Another illustration of this idea is to create a new branch (e.g., wordpress-2.7), and run “cd wordpress && git checkout 2.7 && cd .. && git commit -am 'Going old school'” (the commit has to happen in the parent repo, not inside the submodule). When you go back to master (“git checkout master”), you’ll get that stinkin’ “dirty” status again. To fix, you just run “git submodule update”. All that is doing is updating the submodule to the hash saved in the master branch.

Fascinating! I’m going to have to dig in and mess around with this a bunch more now. I like how you identified and answered the real question I didn’t know I had versus the one I posed in the post. 🙂 Thanks!