Currently, trying to add emoji characters inside posts and private messages triggers an error. It would be nice to support emoji characters, particularly to give mobile users a better experience. Emojis could also replace the existing smilies system, providing a wider range of higher-resolution images.

Caveats

If we replace smilies with emojis, existing forums will have to convert the smilies in the posts and private messages in their databases to emojis.

Based on the implementation details described in the ticket and further down in this post, emojis will require MySQL 5.5.3 or later, so forums running an older version may have to upgrade.

Implementation Details

We have currently built an extension to emulate emoji support for phpBB forums, implemented with the following details:

We have converted the character sets for the post_text and message_text columns to utf8mb4 to support emoji Unicode characters.
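To illustrate why the column change matters: MySQL's utf8 character set stores at most three bytes per character, while emoji outside the Basic Multilingual Plane need four bytes in UTF-8. A minimal Python sketch of that check follows; the function names are made up for this example and are not part of phpBB or the extension.

```python
def utf8_byte_length(ch: str) -> int:
    """Return the number of bytes needed to encode one character in UTF-8."""
    return len(ch.encode("utf-8"))


def needs_utf8mb4(text: str) -> bool:
    """Return True if any character lies outside the Basic Multilingual Plane.

    Such characters take four bytes in UTF-8, which is more than MySQL's
    three-byte utf8 character set can hold.
    """
    return any(ord(ch) > 0xFFFF for ch in text)


if __name__ == "__main__":
    for sample in ["e", "\u00e9", "\u20ac", "\U0001F600"]:
        print(repr(sample), utf8_byte_length(sample), needs_utf8mb4(sample))
```

Any text for which needs_utf8mb4 returns True is the kind of text that cannot be stored as-is in a plain utf8 column.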

Emojis can be entered either as Unicode characters or as "shortnames": codes representing emojis, similar to the smiley codes, that are translated using the EmojiOne library (https://github.com/Ranks/emojione). Emojis are stored in the database as-is: if they were entered as Unicode characters they are stored as characters, and if they were entered as shortnames they are stored as shortnames.
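As a rough idea of what shortname translation involves, here is a small Python sketch. It does not use the EmojiOne library or its API; the three-entry table, the regular expression, and the function name are assumptions made purely for illustration.

```python
import re

# Illustrative table only; a real shortname table is far larger.
SHORTNAMES = {
    "smile": "\U0001F604",
    "thumbsup": "\U0001F44D",
    "fire": "\U0001F525",
}

# Assumed shortname syntax: lowercase letters, digits, "_", "+" or "-"
# wrapped in colons, e.g. ":thumbsup:".
SHORTNAME_RE = re.compile(r":([a-z0-9_+\-]+):")


def expand_shortnames(text: str) -> str:
    """Replace known :shortname: codes with their Unicode emoji.

    Unknown codes are left untouched so ordinary text is not mangled.
    """
    def replace(match: re.Match) -> str:
        return SHORTNAMES.get(match.group(1), match.group(0))

    return SHORTNAME_RE.sub(replace, text)


if __name__ == "__main__":
    print(expand_shortnames("Nice work :thumbsup: but :unknown: stays"))
```

Because the stored text keeps whatever the user typed, a translation step like this would run at display time rather than at save time.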

Because emoji rendering is inconsistent across browsers and lacking in some of them (e.g. emojis are all black and white in Mozilla Firefox), we use the Twitter Twemoji library (https://github.com/twitter/twemoji) to convert emoji characters to standardized images.

We would like to propose that phpBB handle emojis natively at some point in the future. Even if emoji support stays in an extension for now, there are some implementation changes that we would like to see in phpBB:

Character sets for post/message tables should be in utf8mb4 by default instead of utf8.

The mysqli class should perform its transactions using the utf8mb4 character set instead of the utf8 character set.

The message parser should not trigger an error if it finds emoji characters or other characters outside the Basic Multilingual Plane, or, alternatively, the 'posting_modify_submission_errors' event should also be fired when a private message is being composed.

Last edited by rfdy on Thu Apr 23, 2015 1:52 pm, edited 2 times in total.

phpBB 3.2 uses my library s9e\TextFormatter which makes it easy to add support for Emoji via an extension.

The biggest problem with emoji is, as you pointed out, that MySQL's utf8 character set doesn't support Unicode characters outside of the Basic Multilingual Plane, and that phpBB goes as far as preventing any such character from being used. I thought about it and concluded that the issue was better solved by encoding the offending characters than by changing the database schema. s9e\TextFormatter stores formatted text as XML, so it's easy to encode those characters as numeric entities. I published an update that does just that. Once phpBB 3.2 updates its bundled version, support for SMP characters will be possible.
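For readers unfamiliar with the idea, here is a minimal Python sketch of encoding characters outside the BMP as numeric entities and turning them back. This is not the s9e\TextFormatter code; the function names and the choice of decimal entities are assumptions for the example.

```python
import re

# Matches any single code point above U+FFFF (outside the BMP).
ASTRAL_RE = re.compile("[\U00010000-\U0010FFFF]")

# Matches decimal numeric entities such as "&#128512;".
ENTITY_RE = re.compile(r"&#([0-9]+);")


def encode_astral(text: str) -> str:
    """Replace each character above U+FFFF with a decimal numeric entity."""
    return ASTRAL_RE.sub(lambda m: "&#{};".format(ord(m.group(0))), text)


def decode_numeric_entities(text: str) -> str:
    """Turn decimal numeric entities back into characters."""
    return ENTITY_RE.sub(lambda m: chr(int(m.group(1))), text)


if __name__ == "__main__":
    stored = encode_astral("hello \U0001F600")
    print(stored)
    print(decode_numeric_entities(stored) == "hello \U0001F600")
```

The encoded form contains only ASCII in place of the emoji, so it fits in a three-byte utf8 column without any schema change.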

If you want to make an extension for phpBB 3.2 to enable Emoji, message me and I'll help you configure the library.

rfdy wrote: We have currently built an extension to emulate emoji support for phpBB forums

The extension also makes use of emoji codes, which let desktop users enter emoji without having to rely on browser support. Saving the text as XML-encoded characters doesn't seem like the best solution either. Enabling utf8mb4 support in phpBB still seems like the most flexible approach to storing emoji.

The text is already stored as XML, so it makes perfect sense to use XML entities for those characters. And it works without touching the database schema, which I consider a huge plus. No database change means you only have to update a couple of PHP files, and the upgrade takes only a minute.

XML is used as a storage format, and the amount of resources spent dealing with XML is negligible. The amount of resources spent dealing with Emoji is relatively insignificant too: they add less than a millisecond during posting and a few microseconds when displaying.

In the screenshot above, I used the library's Emoji plugin which handles both Emoji and their shortcode/short name, no configuration required. It works with Emoji One, using the images from their CDN. I haven't looked into Twemoji.

One possibility is to create an extension that uses something like EmojiOne or Twemoji to replace the SMP emoji with their ASCII short names for storage, and uses the same library to replace them with images.