The other day I was experimenting with a kind of order-preserving encryption (OPE) in the context of approximate string matching, and came up with unicode-ranges, a multipurpose PHP library that provides object-oriented functionality for UTF-8 operations.

Random UTF Chars

More specifically, I needed to create a strange fake alphabet by generating random UTF-8 characters from any given Unicode subset.
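To make the underlying idea concrete, here is a minimal sketch in plain PHP, assuming only the built-in `random_int()` and `mb_chr()` functions (PHP 7.2+ with mbstring). The helper name `randomCharInRange` is hypothetical and is not part of unicode-ranges; the Ugaritic bounds come from the Unicode standard.

```php
<?php
// Hypothetical helper, not part of unicode-ranges: returns one random
// character whose code point lies in [$first, $last].
function randomCharInRange(int $first, int $last): string
{
    // random_int() is inclusive on both bounds.
    $codePoint = random_int($first, $last);

    // mb_chr() encodes the code point as a UTF-8 string.
    return mb_chr($codePoint, 'UTF-8');
}

// Ugaritic block: U+10380..U+1039F.
// Note: a block may contain unassigned code points, which fonts
// will typically render as a blank or a placeholder glyph.
$char = randomCharInRange(0x10380, 0x1039F);
echo $char, PHP_EOL;
```

The library wraps this kind of arithmetic behind named range classes, so you do not have to remember the bounds yourself.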

The API of Unicode Ranges is easy to use and, among other things, lets you group object-oriented Unicode blocks by the human-readable names of the ranges rather than by their computer-centered hexadecimal (or decimal) bounds.

Let’s look at some examples.

The following snippet prints information about the AlchemicalSymbols range:
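For a sense of what such information amounts to, here is a rough stand-in built only on PHP's mbstring functions. It does not use the library's classes: the block name and bounds come from the Unicode standard, and the array layout is purely illustrative.

```php
<?php
// Block bounds taken from the Unicode standard; the array keys are
// illustrative and do not mirror the unicode-ranges API.
$block = [
    'name'  => 'Alchemical Symbols',
    'first' => 0x1F700,
    'last'  => 0x1F77F,
];

// Number of code points in the block, both bounds included.
$total = $block['last'] - $block['first'] + 1;

printf("Name: %s\n", $block['name']);
printf("Range: U+%X..U+%X\n", $block['first'], $block['last']);
printf("Code points: %d\n", $total);

// Build a short sample from the first eight code points of the block.
$sample = '';
for ($cp = $block['first']; $cp < $block['first'] + 8; $cp++) {
    $sample .= mb_chr($cp, 'UTF-8');
}
echo 'Sample: ', $sample, PHP_EOL;
```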

Returning to OPE, let me show you how to mimic the English alphabet with the help of the object-oriented Unicode ranges.

use FuzzyMatching\Alphabet\EnglishAlphabet;
use FuzzyMatching\Alphabet\MimickedAlphabet;
use UnicodeRanges\Range\AlchemicalSymbols;
use UnicodeRanges\Range\HangulJamo;
use UnicodeRanges\Range\Ugaritic;

$mimickedAlphabet = new MimickedAlphabet(new EnglishAlphabet, [
    new AlchemicalSymbols,
    new HangulJamo,
    new Ugaritic,
]);

print_r($mimickedAlphabet->getLetterFreq());

As you can see, this faked version of the English alphabet mixes random AlchemicalSymbols, HangulJamo and Ugaritic characters. For further information on my order-preserving encryption experiment, look at Fuzzy Matching OPE Encryption.
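The idea of a mimicked alphabet can also be sketched in plain PHP. This is only a guess at the general approach, not how `MimickedAlphabet` actually works: it pools the three blocks, draws 26 distinct code points, and sorts them on the assumption that the fake letters should keep the same relative order as a to z. The variable names are hypothetical.

```php
<?php
// Hypothetical sketch; MimickedAlphabet's real internals may differ.
// Blocks pooled together, as [first, last] code point pairs.
$ranges = [
    [0x1F700, 0x1F77F], // Alchemical Symbols
    [0x1100, 0x11FF],   // Hangul Jamo
    [0x10380, 0x1039F], // Ugaritic
];

// Flatten the blocks into one pool of candidate code points.
$pool = [];
foreach ($ranges as [$first, $last]) {
    $pool = array_merge($pool, range($first, $last));
}

// Draw 26 distinct positions from the pool. array_rand() is not
// cryptographically secure, which is fine for an illustration only.
$keys = array_rand($pool, 26);
$picked = [];
foreach ($keys as $key) {
    $picked[] = $pool[$key];
}

// Sort by code point so the fake letters keep the order of a..z.
sort($picked);

// Map each English letter to its stand-in character.
$mimicked = [];
foreach (range('a', 'z') as $i => $letter) {
    $mimicked[$letter] = mb_chr($picked[$i], 'UTF-8');
}

print_r($mimicked);
```

Because UTF-8 preserves code point order under byte-wise comparison, sorting the code points means a plain `strcmp()` on the stand-in characters agrees with the alphabetical order of the original letters.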

And that’s how to generate random UTF-8 characters with PHP. I hope you enjoyed today’s post. See you next time!