I recently joined Automattic as a Code Wrangler (more on that soon), and I decided now was the time to de-spam, combine and resuscitate my old WordPress blogs.

Along the way I noticed a few annoying encoding issues crept in. The Wordpress XML exporter/importer, for example, double-escaped all my quotes so that <a href=”this”> became <a href=\”this\”>, which as you can imagine broke… everything.

Once I ran these few queries, I couldn’t see any more formatting errors. Others online have advised disabling the line “define(‘DB_CHARSET’, ‘utf8’)” in the wp-config.php. This seems inadvisable to me as it may allow your database to become even more inconsistent, by not enforcing a single clear and future-proof encoding on all content.

Now that everything’s been cleaned up my old content looks great and I can move on to customising the theme of my site.