Migrating a large blog installation from Movable Type to WordPress: lessons learned

First, although Movable Type is rock solid, and personally I like it a lot, a number of apparently insurmountable issues have come up that negatively affect the client. The once vibrant open source community that supported MT is now gone, leading to a lack of innovation, and the resulting lack of peer review led to an exploit that caused this client a 2 week outage. Among the many changes they have announced, Movable Type has now also moved to a closed source model.

I have just successfully finished managing a fairly large migration of a very busy site from Movable Type to WordPress for a client. I was already aware of a number of complications and risks in the early planning stages. The benchmarks set were:

Maintain all content throughout the migration, either publicly accessible or as easily restorable data if called for

Keep serious user disruption to around 6 hours

Reduce page load times by half

Increase traffic with new subscriptions and more page views per user

Apart from the site seemingly being under constant DoS attack, the challenges and risks were not the physical size of the database, which was 1 GB (I have moved much larger ones many times), nor the fact that it comprised around 50,000 articles. The bigger problem was that the Movable Type installation had itself been migrated from another, unknown platform some 5 years ago, and again a few years before that, so there are articles in php, htm and html file formats that were never normalized. An even bigger task was the nearly 1 million comments on the articles, created over the last decade using an unknown number of different commenting systems. If you have ever looked at the mess this creates, you will know to expect some duplicates. In this case many of the 12,000 commentators are duplicates, with a username, an email address, or both applied to more than one commentator’s account.
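To give an idea of how duplicate commentator accounts like these can be found, here is a minimal sketch in Python. It is not the tooling used on this migration. It assumes the accounts have already been exported as (account_id, username, email) tuples, which is a made-up layout for illustration, and it treats two accounts as the same person when they share a username or an email address, ignoring case and surrounding spaces.

```python
from collections import defaultdict


def find_duplicate_groups(accounts):
    """Group accounts that share a username or an email address.

    accounts: iterable of (account_id, username, email) tuples.
    This tuple layout is an assumption for the sketch, not a real
    Movable Type or WordPress export format.
    Returns a sorted list of sorted account_id lists, one per group
    that contains more than one account.
    """
    parent = {}  # union-find forest: account_id -> parent account_id

    def find(x):
        # Walk up to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the smaller id as the root so results are stable.
            parent[max(root_a, root_b)] = min(root_a, root_b)

    first_seen = {}  # ("user" | "email", normalized value) -> account_id
    for account_id, username, email in accounts:
        parent.setdefault(account_id, account_id)
        keys = (
            ("user", username.strip().lower()),
            ("email", email.strip().lower()),
        )
        for key in keys:
            if not key[1]:
                continue  # an empty field is not evidence of a match
            if key in first_seen:
                union(account_id, first_seen[key])
            else:
                first_seen[key] = account_id

    groups = defaultdict(list)
    for account_id in parent:
        groups[find(account_id)].append(account_id)
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)
```

Matches are chained, so an account that shares a username with one account and an email address with another pulls all three into the same group. Whether a group should really be merged is still a human decision, since a shared username on its own can be a coincidence.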

When this migration was first discussed 5 months ago, there was a lot of pressure on me from various stakeholders to move the whole commenting platform to Disqus. Their platform is very tempting and would have simplified my tasks considerably, but the client was worried that their terms of service seem to allow them to switch off a site’s commenting system without notice, and that ownership of the comments themselves becomes unclear if you ever want to move away. Their general attitude to security also raised concerns, and the question was settled completely when they did in fact leak confidential data last month. I had already voiced my concerns early last year while working on another migration. So for the meantime I have settled on the standard default WordPress commenting system, in full agreement with many others that there just isn’t a perfect platform out there yet. At least I will be in a good position if it falls to me to migrate to one when it does turn up.

Quite a few redirect problems still exist in the system. I am monitoring the many 404s and fixing them by hand in the meantime, and I have a Linux systems engineer with the right experience fishing them all out over the next few days, ideally removing as much as possible of the junk in .htaccess that slows the system down and handling the redirects in httpd instead. Another complication, and a risk I was aware of, was that I migrated the system within the same web server. This caused no end of issues, with multiple .htaccess files affecting either or both installations during and after the migration. Next time, for a migration of similar scale, I will insist on migrating to a separate server.
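For the 404 monitoring, a small script can rank the missing URLs by how often they are requested, so the most damaging broken redirects get fixed first. The following is a rough sketch rather than what was actually run here. It assumes the web server writes its access log in the usual Apache common or combined format, and the function name is made up for illustration.

```python
import re
from collections import Counter

# Matches the request and status fields of an Apache common/combined
# log line, e.g.:  "GET /some/path HTTP/1.1" 404 209
LOG_LINE = re.compile(
    r'"(?:GET|HEAD|POST) (?P<path>\S+) [^"]*" (?P<status>\d{3}) '
)


def top_404_paths(lines, limit=20):
    """Return the most requested paths that answered 404.

    lines: iterable of access log lines (for example an open file).
    Query strings are dropped so /a.php?x=1 and /a.php count together.
    Returns a list of (path, hits) tuples, most frequent first.
    """
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match and match.group("status") == "404":
            path = match.group("path").split("?", 1)[0]
            counts[path] += 1
    return counts.most_common(limit)
```

Run against the live log, the top of the list shows which old php, htm or html addresses still need a redirect, and the long tail is usually bots and probes that can be ignored.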

Again, getting WP Super Cache configured right was key to handling the 5,000 visitors an hour, and the hammering Google and Microsoft’s spiders give a system with this much content.

Despite the false start a month ago, when the client realized they weren’t prepared for the migration, it has gone well and recovery was good. Many will say “bounce rate” is really more of a webmaster’s vanity tool, but I think the graph from Google Analytics below illustrates that on the 20th, the day of the migration, many curious regular users were “poking around” the new platform. This subsided after a day, and the bounce rate is now dropping at a steady rate (the lower the better) as they make use of the new features. I had set this as a benchmark and consider it proof of success.

Keeping everyone happy throughout the migration, and with the new platform we are now forcing them to use, is impossible. Warning them what to expect would have been desirable and would have relieved the current pressure on the helpdesk, but we intentionally didn’t inform browsers and commentators of the upcoming move. That was unavoidable, as there are those who would have taken advantage of the migration (escalating the DoS attacks?).

Lessons learned: while redirects to the new RSS feed worked fine, I was unaware that other sites were “fed” by a third party web feed management provider that failed to pick up content from the redirected XML. Once the issue was identified it was easily fixed by building and distributing a new feed with Google’s FeedBurner.
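One cheap check that might have caught this earlier is to look at what a feed consumer that does not follow redirects actually receives at the old feed address, and confirm whether it is a usable RSS document. The sketch below covers only the second half, testing a response body that has already been fetched. It assumes an RSS 2.0 feed, and the function name is made up for illustration.

```python
import xml.etree.ElementTree as ET


def count_rss_items(body):
    """Return the number of <item> entries in an RSS 2.0 document.

    body: the response body as str or bytes.
    Returns 0 when the body is not well-formed XML or its root element
    is not <rss>, which is what a consumer sees if it is handed an
    HTML redirect page instead of the feed itself.
    """
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        return 0
    if root.tag != "rss":
        return 0
    return len(root.findall("./channel/item"))
```

A result of 0 for the body served at the old address, next to a healthy count at the new one, points to consumers that will go quiet unless they follow the redirect.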