The difference is in how they treat U+0000 (NULL): UTR 26 does not treat it specially (i.e. it is encoded as "\x00"), whereas the Java definition encodes it as "\xC0\x80". The IANA registration refers to the Unicode definition (see https://www.iana.org/assignments/charset-reg/CESU-8). UTR 26 explains that "CESU-8 is useful in 8-bit processing environments where binary collation with UTF-16 is required." For this to work, U+0000 has to be encoded as "\x00": the byte 0x00 sorts before every other byte, just as the code unit 0x0000 sorts before every other UTF-16 code unit, whereas "\xC0\x80" would sort after all ASCII characters.
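
To make the collation point concrete, here is a minimal Python sketch. The function name `encode_cesu8` and its `java_modified` flag are made up for this illustration and are not part of any library; the encoder simply writes each UTF-16 code unit in UTF-8 form, and optionally applies the Java-style "\xC0\x80" substitution for U+0000.

```python
import struct


def encode_cesu8(text: str, java_modified: bool = False) -> bytes:
    """Encode text by writing each UTF-16 code unit in UTF-8 form.

    Hypothetical helper for illustration only. With java_modified=True,
    U+0000 is written as C0 80 (the Java convention) instead of 00.
    """
    out = bytearray()
    # Split the string into 16-bit code units (surrogate pairs included).
    utf16 = text.encode("utf-16-be", "surrogatepass")
    for (unit,) in struct.iter_unpack(">H", utf16):
        if unit == 0 and java_modified:
            out += b"\xc0\x80"
        else:
            # "surrogatepass" lets lone surrogates through as 3-byte sequences.
            out += chr(unit).encode("utf-8", "surrogatepass")
    return bytes(out)


samples = ["A", "\x00", "\uffff", "\U00010000"]

# Big-endian UTF-16 bytes compare the same way as UTF-16 code units.
by_utf16 = sorted(samples, key=lambda s: s.encode("utf-16-be"))
by_cesu8 = sorted(samples, key=encode_cesu8)
by_java = sorted(samples, key=lambda s: encode_cesu8(s, java_modified=True))

print(by_utf16 == by_cesu8)  # binary order matches UTF-16
print(by_utf16 == by_java)   # U+0000 now sorts after "A", so it does not
```

With U+0000 encoded as "\x00" the byte-wise order agrees with the UTF-16 code unit order; with the "\xC0\x80" form, the NULL character moves after "A" and the two orders diverge.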