See also the ICU page for detailed documentation on how Unicode is supposed to work in running software, including discussions of what can go wrong.

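There are three Unicode encoding forms: UTF-8, UTF-16, and UTF-32. There are also the older names UCS-2 and UCS-4: UCS-2 is a fixed-width 16-bit form that can only represent code points up to U+FFFF, and UCS-4 is, in practice, the same as UTF-32.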
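
As a rough illustration of how the three forms differ, the Python sketch below encodes the same short string in each one and reports the byte counts. The helper name `encoded_sizes` and the sample string are made up for this example; the big-endian codec variants are used only so that no byte order mark is added.

```python
def encoded_sizes(text: str) -> dict[str, int]:
    """Return the number of bytes `text` occupies in each encoding form."""
    # The "-be" codecs are used so that no byte order mark inflates the counts.
    return {codec: len(text.encode(codec))
            for codec in ("utf-8", "utf-16-be", "utf-32-be")}


# 'A' (U+0041), 'é' (U+00E9), and U+1F600, which lies outside the range UCS-2 can represent.
sample = "A\u00e9\U0001F600"

for codec, size in encoded_sizes(sample).items():
    print(f"{codec:10} {size:2} bytes")
# utf-8       7 bytes   (1 + 2 + 4)
# utf-16-be   8 bytes   (2 + 2 + 4, the last code point is a surrogate pair)
# utf-32-be  12 bytes   (4 + 4 + 4)
```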

JSON must be Unicode.

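The default encoding form of JSON is UTF-8, which effectively means that every implementation must support it, but JSON data can also be delivered in the other two forms, UTF-16 and UTF-32. Note that the more recent RFC 8259 is stricter: JSON text exchanged between systems that are not part of a closed ecosystem must be encoded as UTF-8.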
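
To see what accepting all three forms looks like in practice, here is a small sketch using Python's standard `json` module, whose `json.loads` accepts `bytes` input and detects whether it is UTF-8, UTF-16, or UTF-32. The helper name `round_trips` and the sample document are invented for this example, and other JSON libraries may well behave differently.

```python
import json


def round_trips(document: dict, codec: str) -> bool:
    """Serialise `document`, encode it with `codec`, and parse the raw bytes back."""
    text = json.dumps(document, ensure_ascii=False)  # keep 'é' as a literal character
    raw = text.encode(codec)
    # json.loads detects the encoding form of bytes input on its own.
    return json.loads(raw) == document


sample = {"name": "caf\u00e9"}
for codec in ("utf-8", "utf-16-le", "utf-32-be"):
    print(codec, round_trips(sample, codec))
```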

SPARQL syntax is UTF-8 Unicode: "The encoding is always UTF-8 [RFC3629]. Unicode code points may also be expressed using an \uXXXX (U+0 to U+FFFF) or \UXXXXXXXX syntax (for U+10000 onwards) where X is a hexadecimal digit [0-9A-F]". In other words, a SPARQL parser must detect and reject input that is not UTF-8. But it isn't clear whether a conformant SPARQL parser must accept Unicode expressed with escapes. (The escapes are not UTF-7, which is a separate encoding, but they serve a similar purpose: they let arbitrary Unicode be written using only ASCII characters.)
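
The two behaviours discussed above, rejecting input that is not UTF-8 and expanding the \uXXXX and \UXXXXXXXX escapes, might be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not how any particular SPARQL engine works: the function name `decode_sparql_query` is hypothetical, only uppercase hex digits are matched (following the quoted text), and no attempt is made to handle interactions with other backslash escapes inside string literals.

```python
import re

# \uXXXX or \UXXXXXXXX, with uppercase hex digits as in the quoted text (an assumption).
_CODEPOINT_ESCAPE = re.compile(r"\\u([0-9A-F]{4})|\\U([0-9A-F]{8})")


def decode_sparql_query(raw: bytes) -> str:
    """Decode query bytes as strict UTF-8, then expand code point escapes."""
    # errors="strict" raises UnicodeDecodeError for any byte sequence that is not valid UTF-8.
    text = raw.decode("utf-8", errors="strict")
    # chr() raises ValueError if an 8-digit escape names a value above U+10FFFF.
    return _CODEPOINT_ESCAPE.sub(
        lambda m: chr(int(m.group(1) or m.group(2), 16)), text
    )


print(decode_sparql_query(b'SELECT * WHERE { ?s ?p "caf\\u00E9" }'))

try:
    decode_sparql_query(b"SELECT \xff")  # 0xFF can never appear in UTF-8
except UnicodeDecodeError as err:
    print("rejected:", err.reason)
```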