WordPress Core Contributor

Especially among WordPress and JavaScript developers, we’re all rather sick and tired of this bug with LinkedIn endorsements. Ever since they launched a year and a half ago in September 2012, this has been a problem. How long does it take to fix this?

What am I talking about? Take a look at my endorsements:

See that? Not only is the WordPress community so anal about the capitalization of “WordPress” that they’ve built a destructive rewrite into the software itself, but when you browse my skills, you would probably conclude that I’m alright with WordPress and that it’s not one of my primary skills. In fact, it’s my 3rd most endorsed skill if you correctly count the spelling variations together. It’s just ridiculous that these are counted as separate skills.

That’s bad enough on its own, but that’s not the only problem. For the entire year and a half that endorsements have been a part of LinkedIn, I haven’t been able to edit them at all. The skills editor actually does think these are duplicate skills. If you have any duplicate skills added in the editor, it won’t allow you to add or save any of your changes unless you delete one of them:

So I have a choice now. If I want to ever finally delete some of these pointless and maybe even incorrect endorsements (LinkedIn pushes your network to endorse you for related skills even when you didn’t list them), I will absolutely have to delete my legitimate endorsements for one of the lesser endorsed spelling variations of “JavaScript” and “WordPress”.

I’ve held off on this for two reasons. First, I figured LinkedIn would eventually fix this, and combine them like they should be, and let you decide the correct capitalization while they’re at it. I even submitted a formal bug report over six months ago. Second, even if I remove one of each of them now, LinkedIn will still push my network to re-endorse the incorrectly spelled variations all over again, and put it into a broken state yet again just as soon as I’ve deleted them. So I haven’t ever edited my endorsements since they launched back in 2012.

WP-CLI is a powerful tool that can help make common WordPress maintenance tasks easier, especially if you want to automate those tasks using cronjobs in a secure way. Here at Bluehost, we love this tool, and even provide it pre-installed on every hosting account.

There are some gotchas that can confuse even developers that are already familiar with WP-CLI. So I wanted to quickly outline them.

Using the Pre-Installed Command

The official instructions for WP-CLI instruct users to install it as the super short and convenient “wp” command. However, because our shared hosting accounts support multiple versions of PHP including version 5.2, which WP-CLI does not support, we were required to provide a wrapper around this command called “wpcli”. So every time you see instructions for running a WP-CLI command, you must replace “wp” with “wpcli”. This wrapper already knows how to force WP-CLI to always use PHP 5.4 (or later). For example, here’s how to check which themes you have installed:
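A minimal sketch of that check, assuming the “wpcli” wrapper passes its arguments straight through to WP-CLI. “theme list” is WP-CLI’s standard subcommand for listing installed themes; the guard around it is not part of the command and is only there so the snippet fails politely on a machine without the wrapper:

```shell
# "wpcli" is the wrapper name described above; run this from your WordPress directory.
WP_BIN="wpcli"

if command -v "$WP_BIN" >/dev/null 2>&1; then
    # Equivalent to "wp theme list" in the official WP-CLI documentation.
    "$WP_BIN" theme list
else
    echo "$WP_BIN was not found on this account" >&2
fi
```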

We rely on this tool ourselves to help provide you higher quality support, but in the interest of security, we’re required to review updates to any tools we install on our servers, so sometimes our version of WP-CLI might not be the latest version available. If you find yourself needing some newer functionality or fixes to bugs within WP-CLI, you might still want to install the latest version on your hosting account yourself…

Installing WP-CLI on your Hosting Account

Since shared hosting does not provide you with root access to the server, the recommended installation instructions provided by WP-CLI won’t work. Instead, we’re going to install WP-CLI as a “user-specific installation”. Installing WP-CLI without root access is perfectly possible, and you don’t lose any of its functionality.

We are still going to download WP-CLI in the recommended way. However, we are going to download it into your ~/bin directory instead:
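A sketch of that download step, wrapped in a function so nothing is fetched until you call it. The function name is made up for illustration, and the URL is an assumption based on the phar build location given in the WP-CLI project’s install instructions, so check it against the current instructions before relying on it:

```shell
# Hypothetical helper name; the URL is an assumption taken from WP-CLI's install docs.
WP_CLI_PHAR_URL="https://raw.githubusercontent.com/wp-cli/builds/gh-pages/phar/wp-cli.phar"

install_wp_cli_phar() {
    # Create ~/bin if it does not exist yet, then download the phar into it.
    mkdir -p "$HOME/bin" &&
        curl -L -o "$HOME/bin/wp-cli.phar" "$WP_CLI_PHAR_URL"
}
```

Running install_wp_cli_phar once over SSH performs the download.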

And rather than using the “php” command as instructed by WP-CLI, Bluehost accounts are required to explicitly use the “php-cli” command to run PHP in command line mode. So you can check that WP-CLI is working correctly by running this:
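A sketch of that check, assuming the phar was saved as ~/bin/wp-cli.phar. The --info flag is WP-CLI’s own way of reporting its environment; the guards are additions so the snippet prints a clear message when a piece is missing:

```shell
WP_CLI_PHAR="$HOME/bin/wp-cli.phar"

if [ ! -f "$WP_CLI_PHAR" ]; then
    echo "No phar found at $WP_CLI_PHAR; download it first" >&2
elif ! command -v php-cli >/dev/null 2>&1; then
    echo "php-cli is not available in this shell" >&2
else
    # Prints details such as the PHP version and the WP-CLI version in use.
    php-cli "$WP_CLI_PHAR" --info
fi
```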

WP-CLI is actually installed now. However, it’s still not very convenient to use. You don’t want to have to run “php-cli ~/bin/wp-cli.phar” every time you want to run a WP-CLI command, so we’re going to set up an alias for it so you can just run “wp” like you normally would.

To do this, we need to edit our ~/.bashrc file, and add our alias right alongside any other aliases set up for our account. You’ll notice that Bluehost has provided a few basic ones in this file already. We’re just going to add one more. The location in the file isn’t important (but I like to organize all of my aliases in one spot next to each other). Just add the following line to this file:

alias wp='php-cli ~/bin/wp-cli.phar'

Once you save this file, we’re done. You can either log out and SSH back in for this alias to be loaded, or you can run the following command to get it working immediately:
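A sketch of that reload step. Sourcing the file re-reads it in your current session, and “alias wp” then prints the alias if it was picked up. The variable name and the fallback message are illustrative:

```shell
RC_FILE="$HOME/.bashrc"

# Re-read the file in the current shell instead of logging out and back in.
if [ -f "$RC_FILE" ]; then
    . "$RC_FILE"
fi

# Show the alias if it is now defined.
alias wp 2>/dev/null || echo "alias wp is not defined yet"
```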

So back when the idea of Ghost (the hyped blogging platform alternative to WordPress) was first proposed, John O’Nolan suggested that he had no idea what framework would be used for it, but that it didn’t matter at the time.

Now that it has hit Kickstarter, he has announced plans to build it out on Node.js with Express. Of the numerous frameworks he could have chosen for this, he chose one of the few that most (if not all) shared hosting providers don’t support. Node.js web applications require running a node server on its own port on the server, and this is something you can only really do on VPS and Dedicated hosting (it’s no wonder one of the partner backers is SingleHop – who only provides dedicated hosting).

Just like WordPress, Ghost will likely have a cheap hosted solution available from the creators of Ghost (think WordPress.com), however, just like WordPress, if you plan to write up your own extensions/plugins, you’ll likely need your own hosting. Shared hosting accounts for more than probably 90% of all custom WordPress.org installations.

It makes me wonder how many of Ghost’s backers are going to be incredibly disappointed when they realize they won’t be able to host their own Ghost website without paying through the nose for hosting and performing a fairly complex installation. That’s quite a bit to ask of someone only setting up a blog.

It’s a fairly well known fact that Subversion performs poorly when it comes to storage compared with all the latest version control systems. Disk space is cheap though, so I simply reserve plenty of it for this purpose. I generally never thought twice about the size of an SVN checkout or the size of any of my own repositories knowing they would be large. This changed when I started a new project that required syncing the entire WordPress plugins SVN repository, and not just because I knew it was in the running for the largest SVN repositories in regards to number of commits.

When I first started syncing the plugins SVN repo, I had already used Mark Jaquith’s plugin directory slurper that downloads the latest copy of every single plugin in the repository so they are readily available to scan for various statistics about core API usage, and general programming habits among plugin authors. This comes out to about 7GB uncompressed. However, this was just the latest copy of all plugins, it didn’t include the code history. Given the typical nature of SVN, and the 7GB figure I had to go off of, I had originally estimated that the synced mirror might be between 40GB and 60GB. This is pretty big, but certainly not a problem for today’s drives.

So I started the sync, and let it run for a couple days. I had the first 80,000 revisions done, so it was time to revise my estimates for a more accurate number. With the first 80,000 revisions, the total repository size on disk was right around 13GB. So, you figure at 8 times as many revisions (640,000 – close to the total revision count currently), you might have somewhere around 104GB. Well, this is certainly much higher than I had anticipated, but this still doesn’t pose a problem.

I let the sync run for a couple more days. I now had 160,000 revisions, and the total repository size on disk was 46GB. Wait a minute, shouldn’t that have been about 26GB? That’s not anywhere close to my estimates. I would expect some deviation to account for better graphics and some other minor things taking up more space, but not anything that big. At this rate, I knew I was looking at a total size probably over 200GB. This kind of behavior doesn’t happen with any of my other SVN repos. It was now time to do some investigation.
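The linear extrapolation behind these estimates can be written out as a small calculation. The variable names are illustrative; the figures are the ones quoted above:

```shell
gb_at_80k=13          # measured size after the first 80,000 revisions
observed_at_160k=46   # measured size after 160,000 revisions

# Linear model: size grows in proportion to the number of revisions.
expected_at_160k=$(( gb_at_80k * 160000 / 80000 ))
expected_at_640k=$(( gb_at_80k * 640000 / 80000 ))

echo "expected at 160,000 revisions: ${expected_at_160k}GB (observed: ${observed_at_160k}GB)"
echo "expected at 640,000 revisions: ${expected_at_640k}GB"
```

The second 80,000 revisions added 33GB rather than another 13GB, which is what rules out the linear model.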

Below you can see a graph of the total repo size over the course of adding in each new FSFS pack. One single pack represents 1,000 revisions, so pack 80 would put the repository at revision 80,000.

This paints a much better picture of what is going on, but still doesn’t tell us why. The exponential repository size growth is very obvious now, and what’s more interesting is that it’s a very predictable and stable curve. This has nothing to do with any crazy and reckless commits by destructive plugin authors (you do see that happen in pack 80, but it doesn’t even make a dent in the overall repo growth). Taking this growth into account, my new estimate for total repo size at 650,000 revisions was now 450GB, ten times what I had originally expected.

An experienced Subversion administrator should be able to tell you what’s wrong here. This graph clearly shows consistent growth of pack sizes where there shouldn’t be any (or at least very minimal growth). This blue line should never break above maybe 50MB except for that reckless commit in pack 80, and like pack 80, if it ever does, the next pack should not be affected in any way.

The graph also clearly shows that whatever growth is here, it’s contained within just about every single pack in the SVN repository, and at this point, it’s going to be easy to find since it accounts for more than 95% of the contents of every commit if we’re looking at any commit beyond revision 60,000 or so. So one way we can find the cause is by simply picking out a single revision, preferably with just a one line change, and identifying what SVN is storing in that revision that accounts for 95% of the contents and isn’t related to the actual changes made in that revision.

So, I pick out revision 178,012. It’s a single line change to bump the stable tag of the “infinite-scroll” plugin by its maintainer, paul.irish. The raw FSFS revision file contains 43,657 lines of data for a total of 576KB. There are about 10 lines, including one binary delta, identified as the stable tag bump to “/infinite-scroll/trunk/readme.txt”. About 30 lines show the revision properties for the “/infinite-scroll/trunk” node that this file is contained in, mostly identifying the FSFS node IDs for all files and directories contained in that node, including readme.txt. Another 20 lines show the revision properties for the “/infinite-scroll” node, containing 3 revision properties identifying the FSFS node IDs for the branches, tags, and trunk nodes under the plugin node. Finally, the remaining 43,587 lines are revision properties of the root repository node (“/”) containing FSFS node IDs for every directory in the root node, which happens to be a list of every single plugin in the SVN repository, accounting for 99.99% of the contents of the entire revision.

It turns out that Subversion’s storage mechanism requires listing the related node properties of every changed node with every revision, including sibling nodes. Every single commit is going to be related to the root repository node, so every single commit is going to contain the list of all plugins in the repo. As each new plugin is added to the repository, its name and FSFS node ID are added to that list for every new commit from that point forward.
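A toy model makes the effect of this easier to see. Everything in it is an assumption chosen for illustration: the plugin growth rate is invented, the plugin count is treated as constant within a pack, and the 13 bytes per root entry is only a rough figure derived from the 576KB and 43,657 lines quoted above:

```shell
entry_bytes=13        # assumed bytes per root directory entry (roughly 576KB / 43,657 lines)
plugins_per_pack=100  # invented rate: new plugins added per pack of 1,000 revisions

plugins=0
total_bytes=0
for pack in 1 2 3 4; do
    plugins=$(( plugins + plugins_per_pack ))
    # Each of the 1,000 commits in the pack rewrites the full plugin list.
    pack_bytes=$(( 1000 * plugins * entry_bytes ))
    total_bytes=$(( total_bytes + pack_bytes ))
    echo "pack $pack: $plugins plugins, pack adds $pack_bytes bytes, total $total_bytes bytes"
done
```

In this model every pack is larger than the one before it even though the commits themselves are tiny, so the total grows faster than linearly, which matches the shape of the problem described here.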

How do we fix it?

Any solution to this problem is going to involve some painstaking infrastructure changes with the way the plugins repository works. So the short answer is that it’s going to take years to fix. However, if nothing is done in the next two years, the SVN repository will double in size to about 900GB, and its performance will quickly degrade as the server takes longer to read revisions and the filesystem cache can no longer be used (which I suspect is already the case now). We can continue to toss new hardware at the problem. This is expensive though, and there is a point where that can’t even solve the problem anymore anyway.

Thankfully the svnadmin dump utility typically used to make backups does not have this storage problem. A dump of revision 178,012 mentioned above that’s 576KB in the repo is actually only 661 bytes in a dump. A dump of the entire repository should only be about 30GB. So performing backups does not pose a problem with this repository (other than the length of time it takes to perform a dump with an inefficient repository).

Our first step should at least be setting up new plugins in their own repositories assuming WordPress continues to support SVN for plugins in the future. That way only commits to existing plugins continue being wasteful, and the repository would stop growing exponentially.

We could export existing plugins into new repositories. However, this would have to be a decision made by the plugin maintainer since it will require a new URL to the checkout, and might even require rewriting revision numbers (although it is possible for the svndumpfilter utility to leave empty revisions in place to maintain revision numbers). We could probably do this automatically for any plugins without a commit in the last 2 years without any complaints, and with that option, we could even go as far as using svndumpfilter to proactively remove those nodes from the legacy repo. That could easily cut the repository size in half, speeding it back up significantly.
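A rough sketch of what such an export could look like with the stock Subversion tools. The function name and arguments are placeholders, and a real migration of a repository this size would need more care (for example, svndumpfilter can fail on copies that come from paths outside the filter):

```shell
# Hypothetical helper: export_plugin <legacy-repo-path> <plugin-slug> <new-repo-path>
export_plugin() {
    legacy_repo="$1"
    plugin_slug="$2"
    new_repo="$3"

    # Without --drop-empty-revs, svndumpfilter keeps filtered-out revisions
    # as empty ones, so the original revision numbers are preserved.
    svnadmin create "$new_repo" &&
        svnadmin dump --quiet "$legacy_repo" |
        svndumpfilter include "/$plugin_slug" |
        svnadmin load --quiet "$new_repo"
}
```

The same pipeline with “svndumpfilter exclude” in place of “include” is the direction to look in for removing a plugin’s nodes from the legacy repository.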

Just to quickly get this out of the way, I know someone is thinking “what about a migration to git?” Let me just clarify that while I’m all for adding git support to Extend, this is something that cannot be forcefully pushed on everyone that already has plugins in the repo, and we certainly shouldn’t only offer git for new plugins either. You are right, it would help a little bit, but it doesn’t solve the problem. It would also take significantly longer to implement than any other solution here, and we don’t have a lot of time to solve this.

As a core contributor on WordPress for a while now, one of the biggest problems I’ve found is that WordPress has failed to implement a realistic public API versioning model that could be helping plugin and theme authors, core developers, and users alike.

I want to get the discussion going on a new model that helps everyone, and I would like to get your input and support on bringing this major policy change into WordPress core. I’ve briefly discussed this with a very small set of plugin and theme authors already, and have come up with a solution I think everyone can get behind, but it still needs to be polished up, so I’m expanding the discussion to a larger audience for further input.

Where are we at right now?

The current WordPress API policy regarding deprecation and backwards compatibility is simple: “No public API is ever removed.” We do add deprecation notices to old methods that have been replaced with new API, but deprecated methods are never removed.

This results in the following issues I would specifically like to address:

Outdated plugins and themes can never be removed from the WP.org plugin and theme repositories. When do plugins and themes die? The current policy says everything should still work, therefore, nothing should ever be removed. We know this doesn’t happen in practice though, and it’s the main reason we see big, bright warnings on items not updated within the last 2 years. Forget about finding quality plugins, this makes it difficult for users to find working plugins and themes in the repositories. According to plugin compatibility votes, 1 out of every 5 plugins in the WP.org repository is broken in the latest version of WordPress (consistently for the last 3 years).

This policy is incompatible with every 3rd party library that is included with WordPress core. If we were to strictly adhere to this policy with 3rd party libraries, WordPress would be forced to never update any of the following libraries after initially adding them (or only updating minor releases within the same stable branch that was first included): jQuery, jQuery UI, TinyMCE, SimplePie, PHPMailer, POP3 (SquirrelMail), PemFTP, Plupload, Backbone, Underscore, and others. Many authors have seen their plugins and themes break when one of these libraries is updated in core anyway, despite the current backwards compatibility policy. We saw core break recently while updating SimplePie from 1.2 to 1.3 (during 3.5 development).

This is one of the biggest contributors to the project’s technical debt. New features and infrastructure changes to WordPress core are limited to designs that remain compatible with WordPress versions as old as 1.0. This limits changes that desperately need to be done such as replacing the DOING_AJAX constant that makes unit testing WordPress incredibly difficult. We know it needs to happen, but it simply can’t be done because several plugins rely on this constant. It also means that the core WordPress code base is still littered with super rarely used code which, without a change in policy, is required to remain intact and maintained for decades to come (until someone arbitrarily decides on some random WordPress version to remove it in order to make way for a new feature).

This is a roadblock standing in the way of automatic WordPress upgrades. While WordPress is getting smarter about automatically deactivating plugins that generate PHP errors during activation, it is still impossible to tell if any single plugin is going to break after core has been upgraded (particularly with JavaScript-heavy plugins, note the libraries mentioned above). Hosting providers like the one I work for (Bluehost) are very concerned about pushing WordPress upgrades automatically if there’s a significantly high chance it will break the customer’s website. The numbers are hard to calculate since the chances that something will break rise depending on the number of plugins installed. Currently though, it’s roughly 4% to 8% per plugin (4% if upgrading a single WordPress release like 3.3 to 3.4, and about 8% if upgrading from 2.8 to 3.4).
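To get a feel for how the per-plugin figure compounds, here is a small sketch. It assumes each plugin breaks independently of the others, which is a simplification of this sketch and not something the numbers above establish; the function name and the choice of 10 plugins are illustrative:

```shell
# Chance that at least one of n plugins breaks, if each breaks independently
# with probability p: 1 - (1 - p)^n, printed as a percentage.
site_break_risk() {
    awk -v p="$1" -v n="$2" 'BEGIN { printf "%.1f\n", (1 - (1 - p) ^ n) * 100 }'
}

risk_single_release=$(site_break_risk 0.04 10)
risk_several_releases=$(site_break_risk 0.08 10)

echo "10 plugins at 4% each: ${risk_single_release}% chance of at least one breaking"
echo "10 plugins at 8% each: ${risk_several_releases}% chance of at least one breaking"
```

Under that assumption, a site with 10 plugins has roughly a one in three chance of something breaking on a single-release upgrade.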

You might notice that a couple of the issues mentioned are only actually problems because despite the current compatibility policy, in order for WordPress to move forward, it has to break some compatibility. It’s unreasonable to expect WordPress to stick with TinyMCE 2.1 and jQuery 1.1 that were included in WordPress 2.2. It’s also unreasonable to expect that every feature ever added to WordPress core remains in core, such as the Links Manager being removed between 3.5 and 3.6 or the PHP-Gettext library that was removed in 2.9. Smaller backwards incompatible changes can even be found like this.

The truth is, even though WordPress still includes API deprecated all the way back to version 1.5, a quality plugin or theme written that long ago simply will not work with WordPress 3.4 unless it only ever tied into one or two hooks, only called a few core methods, and never touched any JavaScript libraries. Neither will a plugin written for WordPress 2.5.

The Versioned API Approach

When discussing automatic updates, Matt has often referred to the future of WordPress working in a very similar fashion to Google Chrome. Users shouldn’t need to know or care what version of WordPress they have installed. This should also be the case when installing plugins and themes, so when we look at how we can turn the plugin repository into a version-free world like the Chrome Web Store, we have to look at how they handle these issues with Chrome extensions and apps.

Chrome extensions are all required to specify a manifest version that dictates which extension APIs that version of the extension uses. Any deprecated extension APIs are tied to the current manifest version, and are slowly phased out and removed entirely according to a lengthy schedule (a period of about two years) spanning several Chrome releases. Once a manifest version is officially deprecated, nothing changes for a period of about a year. Then the Chrome Web Store starts blocking new extension submissions for a few months, then it starts blocking updates to extensions using the deprecated manifest version. A few months later, old extensions are filtered out of search results, the wall, and category listings (just like how WordPress plugins more than 2 years old are filtered). Developers of those extensions are notified about removal from the Web Store at the same time, which will happen a few months after those notifications. Another few months after that is finally when the next release of Chrome will automatically deactivate any deprecated manifest version extensions and remove all API tied to that version.

If we were to apply the same policy to WordPress plugins and themes, here’s what such a schedule might look like for the WP.org repositories (dates are estimates based on three month release cycles):

January 2013 (before WordPress 3.6)

API deprecated before WordPress 3.5 is marked for manifest version 1 compatibility, anything deprecated after is marked for manifest version 2 compatibility.

Documentation should be updated for “Writing a Plugin / Theme” on new requirement to indicate manifest version, and noting manifest version on all deprecated API docs.

WordPress 3.8 (September 2013)

Extend blocks all new manifest version 1 submissions (plugins and themes). All items with unspecified manifest version are assumed to be version 1. This forces all new items to use manifest version 2.

WordPress 3.9 (December 2013)

Extend blocks updates to existing plugins and themes if manifest version 1.

WordPress 4.0 (March 2014)

Extend no longer lists manifest version 1 plugins and themes on any search or browse page, but leaves their individual pages in place (only accessible by bookmark or external links).

Notice emails are sent to all authors with manifest version 1 items informing of removal from Extend within 3 to 4 months if they are not updated.

WordPress 4.1 (June 2014)

Version 1 plugins and themes are removed from Extend.

WordPress 4.2 (September 2014)

WordPress core will auto-deactivate manifest version 1 plugins and themes (potentially falling back to the default bundled theme). Version 1 items refuse to activate, but files are left installed.

Note that while this schedule outlines WordPress release versions, only the last step in this schedule is required to be tied to an actual WordPress release. Every step before it can be extended to any date if it looks like authors are having trouble getting plugins updated.
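To make the manifest version idea concrete, here is a sketch of how a tool might read it from a plugin file. The “Manifest Version” header is hypothetical (WordPress has no such header today), as are the function name and the sample files; the fallback to version 1 follows the rule in the schedule above that unspecified versions are treated as version 1:

```shell
# Print the manifest version declared in a plugin file's header comment,
# or 1 when no "Manifest Version" header (a hypothetical header) is present.
manifest_version() {
    version=$(sed -n 's/^[ *]*Manifest Version:[ ]*\([0-9][0-9]*\).*$/\1/p' "$1" | head -n 1)
    echo "${version:-1}"
}

# Two throwaway sample plugins: one declares version 2, one declares nothing.
demo_dir=$(mktemp -d)
printf '<?php\n/*\n * Plugin Name: Example New Plugin\n * Manifest Version: 2\n */\n' > "$demo_dir/new-plugin.php"
printf '<?php\n/*\n * Plugin Name: Example Old Plugin\n */\n' > "$demo_dir/old-plugin.php"

manifest_version "$demo_dir/new-plugin.php"
manifest_version "$demo_dir/old-plugin.php"
```

The two calls print 2 and 1, so the plugin without a declaration would fall under the manifest version 1 steps of the schedule.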

Remaining Issues

Chrome only provides one single JavaScript API that can be easily versioned, and it doesn’t include any 3rd party JavaScript libraries. WordPress can’t version any 3rd party libraries in the same way, nor can it provide two different versions during the transition period where two manifest versions are supported. We will still see issues while updating 3rd party libraries. However, we will at least have a better method of timing those 3rd party library updates so they break as few plugins and themes as possible.