Importing a WordPress database: How to fix the Unknown collation: ‘utf8mb4_unicode_ci’ error

If you do a lot of exporting and importing to different database servers, you’ll be familiar with the frustration of encountering MySQL import errors. Every so often when importing a WordPress dump file into a client’s database, I will encounter an Unknown collation error like the following:

Unknown collation: 'utf8mb4_unicode_ci'

Sometimes it will come up as:

Unknown collation: 'utf8mb4_unicode_520_ci'

This is caused by a difference in encoding types between the source and destination databases. It usually happens when you export from a newer MySQL database (MySQL 5.5.3 and above) which uses utf8mb4, then attempt to import into an older version using utf8. If you are importing from a dump file generated from a MySQL 5.6 database, you may get the utf8mb4_unicode_520_ci message. The 520 refers to MySQL’s use of Unicode Collation Algorithm 5.2.0. Unknown collation errors may also happen if you are trying to import a MariaDB database into MySQL. I tend to get unknown collation errors with my Rackspace Cloud accounts after Rackspace started offering MariaDB as a database option.

Ideally one would upgrade the older destination database but this isn’t always a realistic option. There are a number of discussion threads on the WordPress forum about what to do. Fortunately, many web hosting accounts have a phpMyAdmin interface which provides an easy work-around for the problem.

Format-specific options during a phpMyAdmin database export

Log in to your database server using phpMyAdmin

Make sure you select your database and go to the “Export” tab

Select the “Custom” radio button

Go the section “Format-specific options” and in the setting for “Database system or older MySQL server to maximize output compatibility with:” select MYSQL40.

Since many of my WordPress database migrations are under my migration service, I don’t always have control over the client’s platform. The phpMyAdmin export format method is often the simplest solution.

Side-effects of a character encoding downgrade

You might be wondering about the purpose of encoding types and if there will be any side-effects of downgrading. Character encoding allows support for a set of characters, such as the Western alphabet, Asian scripts and non-alphanumeric symbols. Older utf8 databases support a smaller set of characters whereas utf8mb4 includes emojis, musical notation and Chinese Han characters. If you’ve ever exported a website from one CMS to another and found random characters scattered throughout the copy, it’s because of an incompatible character encoding.

Solving the unknown collation error as described here could mean you’ll end up with unsupported characters after your site migration. However, as with many of my Drupal to WordPress migration clients, in all likelihood you’ll be migrating from an olderutf8 Drupal database to a newerutf8mb4-supported WordPress database. In this case, your old content will not have characters that will cause a problem after an encoding downgrade.