Switch WordPress from utf8 to utf8mb4 retrospectively

The WordPress 4.2 upgrade changed the database character set from utf8 to utf8mb4 (for example necessary for full emoji support) if a few requirements were met. This article shows you how to make the switch manually if you did not meet the requirements at the time of the WordPress 4.2 upgrade.

Requirements for the utf8mb4 update

Gary Pendergast lists the following requirements in his post »The utf8mb4 Upgrade« (quoted from his post):

  • You’re currently using the utf8 character set.
  • Your MySQL server is version 5.5.3 or higher (including all 10.x versions of MariaDB).
  • Your MySQL client libraries are version 5.5.3 or higher. If you’re using mysqlnd, 5.0.9 or higher.

Because my hosting provider Uberspace currently uses CentOS 6, which cannot reach the needed MySQL version, I did not get the update automatically with 4.2. After failing with saving an Emoji today, and getting the hint with utf8mb4 from Bernhard Kau, I searched for a solution. I found it in the post »utf8mb4 auf uberspace« on crusy.net — the solution for me is switching to MariaDB.

If you now simply have a MySQL version which is high enough, you can likely change the character set via phpMyAdmin or similar tools (but I did not test that, because that would currently be no solution for me because of the above-named reason).

In any case, you need to change the tables to utf8mb4 after that — changing the database character set is not sufficient.

Changing the tables to utf8mb4

I found the solution for this problem on wordpress.stackexchange.com. There the update routine from 4.2 (respectively 4.3, since the core team of WordPress pushed the routine to 4.3, to get more sites updated) is put into a plugin or functions.php of your theme and called via the URL parameter ?update-utf8bm4=1. With that, all tables are switched to utf8mb4, where possible.

I inserted that into a small plugin, which you can find as a Gist on GitHub. On a multisite, you have to call the script on each site of the multisite. If you are done with the change, you should remove the plugin!

7 reactions on »Switch WordPress from utf8 to utf8mb4 retrospectively«

  1. But... I think the table has a setting which your plugin changes to "utf8mb4" and then each text column has a setting too, which needs to be changed too shouldn't it?

  2. "where possible" is doing a lot of work… I think I ran this correctly but nothing seems to have changed.

    This annoys me when I look at URL stats and see all these miscoded URLs.

    Seems like the WP dev team might have offered a way to make this conversion when they changed the database parameters. I am running 5.7.2, so there has been enough time to catch up. Who knows the design better than them? Lots of HOWTOs but I don't understand any of them well enough to apply them.

    Changing the encoding in a SQL session kicks back errors as well. Wondering if an export/import to a new database is my best option.

    1. Hi Paul,

      with creating a new WP installation on your host you could at least check if the hoster meets all requirements for the utf8mb4 encoding. If it works, you could export/import your data to the installation.

      Best,
      Florian

  3. Thanks a lot for this Florian! It pointed me towards the right direction.

    Somehow after a server upgrade I ended up not being able to save posts that had emojis in the editor... Somehow, my wp tables had a utf8mb3 encoding. So I used your plugin as a starting point and modified the maybe_convert_table_to_utf8mb4( ) to make it work with utf8mb3 encoding. I ended up with smth like this in case anyone has the same issue:

    add_action( 'wp_loaded', 'update_db_to_utf8mb4' );
    function update_db_to_utf8mb4() {
    if ( ! isset( $_GET['update-utf8bm4'] ) ) {
    return;
    }

    require_once( ABSPATH . 'wp-admin/includes/upgrade.php' );
    /** WordPress Administration API */
    require_once( ABSPATH . 'wp-admin/includes/admin.php' );
    /** WordPress Schema API */
    require_once( ABSPATH . 'wp-admin/includes/schema.php' );
    global $wpdb;
    if ( is_multisite() ) {
    $tables = $wpdb->tables( 'blog' );
    } else {
    $tables = $wpdb->tables( 'all' );
    if ( ! wp_should_upgrade_global_tables() ) {
    $global_tables = $wpdb->tables( 'global' );
    $tables = array_diff_assoc( $tables, $global_tables );
    }
    }

    foreach ( $tables as $table ) {
    global $wpdb;

    $results = $wpdb->get_results( "SHOW FULL COLUMNS FROM `$table`" );
    if ( ! $results ) {
    break;
    }

    foreach ( $results as $column ) {
    if ( $column->Collation ) {
    list( $charset ) = explode( '_', $column->Collation );
    $charset = strtolower( $charset );
    if ( 'utf8' !== $charset && 'utf8mb3' !== $charset && 'utf8mb4' !== $charset ) {
    // Don't upgrade tables that have non-utf8 columns.
    break;
    }
    }
    }

    $table_details = $wpdb->get_row( "SHOW TABLE STATUS LIKE '$table'" );
    if ( ! $table_details ) {
    break;
    }

    $table_charset = strtolower( $table_details->Collation );
    if ( 'utf8mb4_unicode_520_ci' === $table_charset ) {
    continue;
    }

    $wpdb->query( "ALTER TABLE $table CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_520_ci" );
    }
    }

    Thanks again

    P.S. still using your very nice lazy load plugin 🙂

Leave a Reply

Your email address will not be published. Required fields are marked *