Perl Unicode Cookbook: String Length in Graphemes

℞ 33: String length in graphemes

If you learn nothing else about Unicode, remember this: characters are not
bytes are not graphemes are not codepoints. A user-visible symbol (a
grapheme) may be composed of multiple codepoints. Different sequences
of codepoints may produce the same user-visible grapheme.
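
To make that last point concrete, here is a minimal sketch. It assumes the
core Unicode::Normalize module is available; the variable names are
illustrative only. It builds the same grapheme, é, from two different
codepoint sequences and shows that they compare equal only after
normalization:

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFC NFD);

# Two codepoint sequences that display as the same grapheme, "é".
my $composed   = "\x{E9}";     # U+00E9 LATIN SMALL LETTER E WITH ACUTE
my $decomposed = "e\x{301}";   # U+0065 followed by U+0301 COMBINING ACUTE ACCENT

# A plain string comparison looks at codepoints, so the two differ.
my $equal_raw = $composed eq $decomposed ? 1 : 0;

# Normalizing both sides to the same form makes them comparable.
my $equal_nfc = NFC($composed) eq NFC($decomposed) ? 1 : 0;
my $equal_nfd = NFD($composed) eq NFD($decomposed) ? 1 : 0;

printf "equal as written: %d\n", $equal_raw;    # 0
printf "equal after NFC:  %d\n", $equal_nfc;    # 1
printf "equal after NFD:  %d\n", $equal_nfd;    # 1
printf "codepoints: %d vs %d\n",
    length($composed), length($decomposed);     # 1 vs 2
```

Escapes such as \x{E9} are used instead of literal characters so that the
sketch does not depend on the encoding of the source file.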

To keep all of these entities clear in your mind, be careful and specific
about what you're trying to do at which level.

As a concrete example, the string brûlée has six graphemes but
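up to eight codepoints. Now suppose you want to get its length. What does
length mean? On a decoded character string, Perl's length() counts
codepoints. If your string has been normalized to a form with one codepoint
per grapheme, the codepoint count and the grapheme count are one and the
same, but consider: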