HOW-TO: Add live websites to GIT without interruptions or security risks

Over at the DEV site we’ve been using GIT repositories hosted at BitBucket.org to push (deploy) changes to the live server instantly. It’s way slicker than uploading changed files by FTP, which offers no rollback if something goes wrong. It also solves one of our China internet problems: routing to BitBucket is much more reliable than to our servers in Germany.

It is finally time to merge the DEV site with the main Dangerous Prototypes sites – the blog got a new theme yesterday. When we started updating the rest of the site it seemed much easier and safer to put it in GIT too. There are a lot of tutorials about deploying websites with GIT, but none completely covered the process to safely put existing websites into GIT with no interruptions and maximum security.

Updating folder structure

We’re starting off with a legacy folder structure rooted in early cloud services and personal ignorance. The web root is at /var/www/. Other parts of the site (/forum, /docs) sit in subfolders of the root.

This isn’t how modern websites are structured. It’s super amateur, but it’s worked for years. We’ll change this to use Apache virtual directories when the dev site is merged with this site.

Use an external repo folder for security

Running “git init” in the web root would create a new repo around the entire website, giving instant version control. It would also put the .git folder and its config files on the live web, leaving a big security hole! And it would bind the five different web apps running on this server into a single giant mess.

After creating a free GIT repo at BitBucket.org called “wordpress” and setting up the SSH keys we’re ready to roll. We cloned the empty repo into our home folder (~) on the server, keeping all the repo files out of reach of the web.

cd repo-wordpress
GIT_WORK_TREE=/var/www git status

The GIT_WORK_TREE variable links the live website directory (/var/www) with the new repo in our home folder (~/repo-wordpress). The status command will show untracked files in the /var/www directory.
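To see the split between repo and work tree without touching a live server, here is a minimal sketch that rebuilds the layout in a scratch directory. The paths, the placeholder index.php, and the local bare repo standing in for BitBucket are all assumptions for illustration, not our real setup.

```shell
# Sketch: reproduce the external-repo layout in a scratch directory.
# Every path below is a placeholder, not a real server path.
demo=$(mktemp -d)
mkdir -p "$demo/www" "$demo/home"
echo "<?php // placeholder" > "$demo/www/index.php"

# A local bare repo stands in for the empty remote at BitBucket.
git init --quiet --bare "$demo/remote.git"
git clone --quiet "$demo/remote.git" "$demo/home/repo-wordpress" 2>/dev/null

cd "$demo/home/repo-wordpress"
# The repo metadata lives here, but the work tree is the fake web root.
status_output=$(GIT_WORK_TREE="$demo/www" git status --short)
echo "$status_output"

# No .git folder should ever appear inside the web root.
ls -a "$demo/www"
```

The status listing should report the placeholder file as untracked, while the directory listing of the fake web root shows no .git folder.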

At this point you might think we should protect the git folders and config files with .htaccess instead of including the annoying GIT_WORK_TREE variable with every GIT command. We could, but then every change would need to preserve that protection for the life of the website. Blocking access is a patch against a vulnerability; it’s better to never have the vulnerability in the first place.
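If the extra typing gets old, one option is a small shell function that fills in the locations for you. This is only a sketch: wgit, REPO_DIR and WEB_ROOT are names made up for this example, and it uses git's --git-dir and --work-tree options in place of the environment variable so it works from any directory.

```shell
# Hypothetical helper: run git against the external repo and the web root.
# REPO_DIR and WEB_ROOT default to the paths used in this post.
REPO_DIR="${REPO_DIR:-$HOME/repo-wordpress}"
WEB_ROOT="${WEB_ROOT:-/var/www}"

wgit() {
    # --git-dir points at the repo metadata, --work-tree at the live files.
    git --git-dir="$REPO_DIR/.git" --work-tree="$WEB_ROOT" "$@"
}
```

With this in a shell profile, "wgit status" does the same job as the longer command above, and the repo files still stay outside the web root.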

# Include these files in previously blocked directories
!wp-content/media/.htaccess

#other software in root
docs/*
forum/*

Not everything belongs in our repo. 8 years of blog images and cache files take up a ton of space in a repo and we don’t need them to push code updates. We do, however, want to keep the .htaccess file in the media folder that prevents users browsing the contents of that directory.

We also want to exclude the other sites mixed into the webroot (docs/, forum/). We’ll push them into their own repos later using the same process. Put these rules in the .gitignore file in the GIT_WORK_TREE folder (/var/www for us). This is a short example .gitignore based on this; see our complete ignore files for all the sites in the forum!
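Ignore rules are easy to get subtly wrong, so it can help to test them on a few paths before the first "git add". The sketch below uses "git check-ignore" in a scratch directory; the file names and the is_ignored helper are invented for the example, and only the two directory rules from the example above are included.

```shell
# Sketch: check ignore rules against sample paths in a scratch directory.
demo=$(mktemp -d)
mkdir -p "$demo/www/docs" "$demo/www/forum"
touch "$demo/www/index.php" "$demo/www/docs/index.html" "$demo/www/forum/viewtopic.php"
git init --quiet "$demo/repo"

# Same directory rules as the example .gitignore.
cat > "$demo/www/.gitignore" <<'EOF'
#other software in root
docs/*
forum/*
EOF

cd "$demo/repo"
# check-ignore exits 0 if the path matches an ignore rule, 1 if it does not.
is_ignored() {
    GIT_WORK_TREE="$demo/www" git check-ignore -q "$1"
}

for path in index.php docs/index.html forum/viewtopic.php; do
    if is_ignored "$path"; then
        echo "ignored: $path"
    else
        echo "kept:    $path"
    fi
done
```

Swapping -q for -v on a single path prints the rule that matched, which is handy when a file is ignored and it isn't obvious why.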

GIT_WORK_TREE=/var/www git status

Check file status again. Ignored directories should not be on the list.

Add, Commit, Push

GIT_WORK_TREE=/var/www git add .

The files are now staged, and GIT will track them from the first commit onward.

GIT_WORK_TREE=/var/www git status

See a list of the files staged for the commit.

GIT_WORK_TREE=/var/www git commit -m "Initial commit"

Commit the files to the local repo. Now we have a snapshot of the site code.

GIT_WORK_TREE=/var/www git push

Push from local repo to the remote repo at BitBucket. The site snapshot is now also stored at BitBucket.
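Before running this sequence on a live site, it may be worth rehearsing it somewhere harmless. Here is a rough sketch of the same add, commit, push cycle in a scratch directory. The local bare repo standing in for BitBucket, the demo identity, and the explicit "origin HEAD" on the first push are all assumptions made for the example.

```shell
# Rehearsal in a scratch directory; every path here is a placeholder.
demo=$(mktemp -d)
mkdir -p "$demo/www"
echo "placeholder page" > "$demo/www/index.html"

# A local bare repo plays the part of the remote at BitBucket.
git init --quiet --bare "$demo/remote.git"
git clone --quiet "$demo/remote.git" "$demo/repo" 2>/dev/null
cd "$demo/repo"

# Stage and commit the fake web root from the external repo.
GIT_WORK_TREE="$demo/www" git add .
# Identity is passed inline so the sketch runs without global git config.
GIT_WORK_TREE="$demo/www" git -c user.name="Demo" -c user.email="demo@example.com" \
    commit --quiet -m "Initial commit"

# Push only moves commits, so it does not need the work tree.
# The remote and branch are named explicitly for the first push.
git push --quiet origin HEAD

# Count the commits that reached the stand-in remote.
pushed=$(git --git-dir="$demo/remote.git" rev-list --count --all)
echo "commits on remote: $pushed"
```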

Pull hooks for automatic updates

BitBucket and GitHub have a ‘hook’ feature that loads a URL after every push to the repo. We set up BitBucket to load a “secret” webpage on the server that triggers a git pull command whenever we push an update to the master branch. ServerPilot has some more info and a nice script that we modified.
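The script we modified isn't reproduced here, but the core idea can be sketched as a shell function: refuse to do anything unless the caller presents the expected token, then pull. Treat this as an illustration under assumptions. deploy_on_hook, DEPLOY_SECRET, REPO_DIR and WEB_ROOT are hypothetical names, and how the hook URL ends up calling the function (a CGI wrapper, a small script in another language) is left out.

```shell
# Hypothetical deploy handler: pull only when the caller knows the secret.
# DEPLOY_SECRET, REPO_DIR and WEB_ROOT are placeholders for this sketch.
REPO_DIR="${REPO_DIR:-$HOME/repo-wordpress}"
WEB_ROOT="${WEB_ROOT:-/var/www}"

deploy_on_hook() {
    token="$1"
    # Reject when no secret is configured or when the token does not match.
    if [ -z "$DEPLOY_SECRET" ] || [ "$token" != "$DEPLOY_SECRET" ]; then
        echo "rejected: bad token" >&2
        return 1
    fi
    # Repo metadata stays outside the web root; only the work tree is live.
    git --git-dir="$REPO_DIR/.git" --work-tree="$WEB_ROOT" pull --quiet
}
```

Rejecting requests when no secret is configured at all means a missing setting fails closed instead of letting anyone trigger a pull.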

We want to end up with a structure where each area of the site (/forum,/docs,/blog) is an Apache virtual directory or symlink to a public subfolder inside a git repository. Instead of serving WordPress directly from the repo (/apps/wordpress/), we serve it from a subfolder inside the repo called “public” (/apps/wordpress/public).

First, this keeps the git folder and configuration files off the web without the annoying GIT_WORK_TREE variable in each command. Second, we can use the main folder for other stuff that might be handy to have in the repository but shouldn’t be public: notes, database updates, sample files and data, test tools, etc.
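As a rough picture of that target layout, the sketch below builds it in a scratch directory using the symlink variant. The base directory, the notes folder and the file names are placeholders; the real server setup isn't final yet.

```shell
# Sketch of the target layout; "base" stands in for the server filesystem.
base=$(mktemp -d)
mkdir -p "$base/apps/wordpress/public" "$base/apps/wordpress/notes" "$base/var/www"

# The repo root holds .git and private extras; only public/ holds site files.
git init --quiet "$base/apps/wordpress"
echo "placeholder page" > "$base/apps/wordpress/public/index.php"
echo "private notes" > "$base/apps/wordpress/notes/todo.txt"

# Expose only the public subfolder under the web root.
ln -s "$base/apps/wordpress/public" "$base/var/www/blog"

ls -a "$base/var/www/blog"
```

The listing through the symlink shows the site files only; the .git folder and the notes sit one level up, outside anything the web server is pointed at.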

We’re redeploying each site area like this as the new themes are finished. When the dust is settled we’ll document the final server setup that should last well into the future.