Git and WordPress: How to Auto-Update Posts with Pull Requests

At Bitfalls.com, we also use WordPress for now, and use the same peer review approach for content as we do at SitePoint.

We decided to build a tool which automatically pulls content from merged pull requests into articles, giving us the ability to fix typos and update posts from GitHub, and see the changes reflected on the live site. This tutorial will walk you through the creation of this tool, so you can start using it for your own WordPress site, or build your own version.

The Plan

The first part is identifying the problem and the situation surrounding it.

we use WPGlobus for multi-language support, which means content gets saved like this: {:en}English content{:}{:hr}Croatian content{:}.

authors submit PRs via GitHub, the PRs are peer reviewed and merged, and then (currently) manually imported into WP’s Posts UI through the browser.

every post has the same folder layout: author_folder/post_folder/language/final.md

the manual import is slow and error-prone, and sometimes mistakes slip by. It also makes updating posts tedious.


The solution is the following:

add a hook processor which will detect pushes to the master branch (i.e. merges from PRs)

the processor should look for a meta file in the commit which would contain information on where to save the updated content

the processor automatically converts the MD content to HTML, merges the languages in the WPGlobus format, and saves them into the database

Bootstrapping

If you’d like to follow along (highly recommended), please boot up a good virtual machine environment, install the newest version of WordPress on it, and add the WPGlobus plugin. Alternatively, you can use a prepared WordPress box like VVV. Additionally, make sure your environment has ngrok installed – we’ll use that to pipe GitHub hook triggers to our local machine, so we can test locally instead of having to deploy.

Hooks

For this experiment, let’s create a new repository. I’ll call mine autopush.

In the settings of this repository, we need to add a new hook. Since we’re talking about a temporary ngrok URL, let’s first spin that up. In my case, entering the following on the host machine does the trick:

ngrok http homestead.app:80

I was given the link http://03672a64.ngrok.io, so that’s what goes into the webhook, with an arbitrary suffix like githook. We only need push events. JSON is the cleaner content type, so that’s what we select when setting up the webhook.

Processing Webhooks

We’ll read this new data into WordPress with custom logic. Due to the spaghetti-code nature of WP itself, it’s easier to circumvent it entirely with a small custom application. First, we’ll create the githook folder in the WordPress project’s root, and an index.php file inside it. This makes the /githook/ path accessible, and the hook will no longer return 404, but 200 OK.

According to the docs, the payload will have a commits field with a modified field in each commit. Since we’re only looking to update posts, not schedule them or delete them – those steps are still manual, for safety – we’ll only be paying attention to that one. Let’s see if we can catch it on a test push.

First, we’ll save our request data to a text file, for debugging purposes. We can do this by modifying our githook/index.php file:

Sure enough, our test.json file is filled with the payload now. This is the payload I got. You can see that we have only one commit, and that commit’s modified field is empty, while the added field has testfile.md. We can also see this happened on refs/heads/test-branch, ergo, we’re not interested in it. But what happens if we make a PR out of this branch and merge it?

Our payload looks different. Most notably, we now have refs/heads/master as the ref field, meaning it happened on the master branch and we must pay attention to it. We also have 2 commits instead of just one: the first one is the same as in the original PR, the adding of the file. The second one corresponds to the change on the master branch: the merging itself. Both reference the same added file.
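
The two checks described above (which branch the push landed on, and what each commit added or modified) can be sketched in a few lines. The tool itself lives in githook/index.php; the sketch below uses Python purely to illustrate the logic, and both the summarize_push name and the trimmed sample payload are made up for this example.

```python
import json


def summarize_push(raw_body):
    """Reduce a push payload to the fields this tutorial cares about."""
    payload = json.loads(raw_body)
    return {
        # Only pushes to master (i.e. merged PRs) are interesting to us.
        "on_master": payload.get("ref") == "refs/heads/master",
        "commits": [
            {"added": c.get("added", []), "modified": c.get("modified", [])}
            for c in payload.get("commits", [])
        ],
    }


# Hypothetical, heavily trimmed payload shaped like the merge described above.
sample_body = json.dumps({
    "ref": "refs/heads/master",
    "commits": [
        {"added": ["testfile.md"], "removed": [], "modified": []},
        {"added": ["testfile.md"], "removed": [], "modified": []},
    ],
})

summary = summarize_push(sample_body)
print(summary["on_master"], len(summary["commits"]))
```

A push to refs/heads/test-branch would come back with on_master set to False, which is our cue to ignore it.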

Let’s do one final test. Let’s edit testfile.md, push that, and do a PR and merge.

We fetch the last commit in the payload, extract its modified files list, and find the parent folder of each modified file. The parent is dictated by the $lvl variable – in our case it’s 2 because the folder is 2 levels up: one extra for language (en_EN).
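
As a rough illustration of that step (in Python rather than the PHP of githook/index.php), here is one way to go from a payload to the list of post folders. The function name updated_post_folders and the sample paths are assumptions for this sketch; lvl plays the same role as the $lvl variable described above.

```python
from pathlib import PurePosixPath


def updated_post_folders(payload, lvl=2):
    """Return the post folders touched by the last commit of a push.

    lvl is how many levels above the file the post folder sits:
    author_folder/post_folder/en_EN/final.md -> author_folder/post_folder
    """
    commits = payload.get("commits") or []
    if not commits:
        return []
    folders = []
    for path in commits[-1].get("modified", []):
        parents = PurePosixPath(path).parents
        if len(parents) <= lvl:
            # File is not nested deeply enough to belong to a post.
            continue
        folder = str(parents[lvl - 1])
        if folder not in folders:
            folders.append(folder)
    return folders


# Hypothetical payload: only the last commit's "modified" list is used.
payload = {
    "ref": "refs/heads/master",
    "commits": [
        {"modified": []},
        {"modified": [
            "author_folder/post_folder/en_EN/final.md",
            "author_folder/post_folder/hr_HR/final.md",
            "README.md",
        ]},
    ],
}
print(updated_post_folders(payload))
```

Both language files resolve to the same post folder, so it is listed once, and README.md is skipped because it does not sit deep enough in the tree.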

And there we have it – the path of the folder that holds the files that need to be updated. Now all we have to do is fetch the contents, turn the Markdown of those files into HTML, and save it into the database.

Processing Markdown

To process Markdown, we can use the Parsedown package. We’ll install this dependency in the githook folder itself, to make the app as standalone as possible.

composer require erusev/parsedown

Parsedown is the same flavor of Markdown we use at Bitfalls while writing with the Caret editor, so it’s a perfect match.

We made some really simple functions to avoid repetition. We also added a mapping of language folders (locales) to their WPGlobus keys, so that when iterating through all the files in a folder, we know how to delimit them in the post’s body.

Note: we have to update all language versions of a post when doing an update to just one, because WPGlobus doesn’t use an extra field or a different database row to save another language of a post – it saves them all in one field, so the whole value of that field needs to be updated.

We iterate through the folders that got updates (there might be more than one in a single PR), grab the contents of the file and convert it to HTML, then store all this into a WPGlobus-friendly string. Now it’s time to save this into the database.
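
The merging part of this can be sketched as follows. This is a Python illustration of the idea, not the tool's PHP code: it assumes the Markdown has already been converted to HTML, and the hr_HR folder name, the mapping and the to_wpglobus function name are all assumptions made for the example (only en_EN appears in this tutorial).

```python
# Assumed mapping of language folders (locales) to WPGlobus keys.
LOCALE_TO_WPGLOBUS = {"en_EN": "en", "hr_HR": "hr"}


def to_wpglobus(html_by_locale, mapping=LOCALE_TO_WPGLOBUS):
    """Join per-language HTML into a single WPGlobus-delimited string."""
    parts = []
    for locale, key in mapping.items():
        if locale in html_by_locale:
            # Each language is wrapped as {:key}content{:}
            parts.append("{:%s}%s{:}" % (key, html_by_locale[locale]))
    return "".join(parts)


merged = to_wpglobus({
    "en_EN": "<p>English content</p>",
    "hr_HR": "<p>Croatian content</p>",
})
print(merged)
```

Iterating over the mapping rather than over the input keeps the language order stable no matter in which order the files were read.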

Note: we used a nonce at the end of the URL to invalidate a possible cache issue with raw GitHub content.
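
For illustration, a cache-busting URL of that kind could be built like this. It is a Python sketch under assumptions: the nonce query parameter name, the raw_url function and the example-user owner are placeholders, and a timestamp is just one simple choice of nonce.

```python
import time
from urllib.parse import quote


def raw_url(owner, repo, path, branch="master", nonce=None):
    """Build a raw.githubusercontent.com URL with a throwaway query string."""
    if nonce is None:
        # Any value that changes between requests will do; a timestamp is simple.
        nonce = str(int(time.time()))
    return "https://raw.githubusercontent.com/%s/%s/%s/%s?nonce=%s" % (
        owner, repo, branch, quote(path), nonce,
    )


# "example-user" is a placeholder account name.
print(raw_url("example-user", "autopush",
              "author_folder/post_folder/en_EN/final.md"))
```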

Saving Edited Content

We have no idea where to save the new content. We need to add support for meta files.

To save the content, we’ll use the WP-CLI tool. Installing it comes down to downloading it, putting it into the server’s path (so it can be executed from anywhere), and adding “executable” permission to it.

The post update command needs a post ID, and the field to update. WordPress posts are saved into the wp_posts database table, and the field we’re looking to update is the post_content field.

Let’s try this out in the command line to make sure it works as intended. First we’ll add an example post. I gave it an example title of “Example post” in English and “Primjer” in Croatian, with the body This is some English content for a post! for the English content, and Ovo je primjer! for the Croatian content. When saved, the post_content field in the database holds {:en}This is some English content for a post!{:}{:hr}Ovo je primjer!{:}.

In my case, the ID of the post is 428. If your WP installation is fresh, yours will probably be closer to 1.

Now let’s see what happens if we execute the following on the command line:

This looks like it might become problematic when dealing with quotes which would need to be escaped. It’s better if we update from file, and let this tool handle the quotes and such. Let’s give it a try.

Let’s put the content {:en}This is some English 'content' for a post - edited "again"!{:}{:hr}Ovo je 'primjer' - editiran "opet"!{:} into a file called updateme.txt. Then…

wp post update 428 updateme.txt

Yup, all good.

Okay, now let’s add this into our tool.

For now, our meta file will only have the ID of the post, so let’s add one such file to the content repo.
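
To show how the pieces could fit together, here is a sketch that reads the post ID from a meta file, writes the merged content to a temporary file, and builds the same wp post update command we ran by hand. It is written in Python for illustration only. The JSON shape of the meta file ({"id": 428}) and all function names are assumptions, since a meta file that holds just the ID could take many forms.

```python
import json
import subprocess
import tempfile


def read_post_id(meta_text):
    """Extract the post ID from an assumed JSON meta file like {"id": 428}."""
    return int(json.loads(meta_text)["id"])


def build_update_command(post_id, content_file):
    """Same shape as the manual test: wp post update <id> <file>."""
    return ["wp", "post", "update", str(post_id), content_file]


def update_post(meta_text, content, run=False):
    post_id = read_post_id(meta_text)
    # Writing to a file lets WP-CLI deal with quotes and escaping for us.
    with tempfile.NamedTemporaryFile(
        "w", suffix=".txt", delete=False, encoding="utf-8"
    ) as fh:
        fh.write(content)
    command = build_update_command(post_id, fh.name)
    if run:
        # Requires WP-CLI on the path and a WordPress install to run against.
        subprocess.run(command, check=True)
    return command


content = "{:en}This is some English 'content'{:}{:hr}Ovo je 'primjer'{:}"
command = update_post('{"id": 428}', content)
print(command[:4])
```

With run left at False, nothing is executed, which makes it easy to inspect the command before pointing it at a real site.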

It works – deploying this script now is as simple as deploying the WP code of your app itself, and updating the webhook’s URL for the repo in question.

Conclusion

In true WordPress fashion, we hacked together a tool that took us less than an afternoon, but saved us days or weeks in the long run. The tool is now deployed and functioning adequately. There is, of course, room for updates. If you’re feeling inspired, try adding the following:

modify the post updating procedure so that it uses stdin instead of a file, making it compatible with no-writable-filesystem hosts like AWS, Heroku, or Google Cloud.

custom output types: instead of fixed {:en}{:}{:hr}{:}, maybe someone else is using a different multi-language plugin, or doesn’t use one at all. This should be customizable somehow.

auto-insertion of images. Right now it’s manual, but the images are saved in the repo alongside the language versions and could probably be easily imported, auto-optimized, and added into the posts as well.

staging mode – make sure the merged update first goes through to a staging version of the site before going to the main one, so the changes can be verified before being sent to master. Rather than having to activate and deactivate webhooks, why not make this programmable?

a plugin interface: it would be handy to be able to define all this in the WP UI rather than in the code. A WP plugin abstraction around the functionality would, thus, be useful.

With this tutorial, our intention was to show you that optimizing workflow isn’t such a big deal when you take the time to do it, and the return on investment for sacrificing some time on getting automation up and running can be immense when thinking long term.

Bruno is a blockchain developer and code auditor from Croatia with Master’s Degrees in Computer Science and English Language and Literature. He was a web developer for 10 years until JavaScript drove him away. He now runs a cryptocurrency business at Bitfalls.com via which he makes blockchain tech approachable to the masses, and runs Coinvendor, an on-boarding platform for people to easily buy cryptocurrency. He’s also a developer evangelist for Diffbot.com, a San Francisco-based AI-powered machine vision web scraper.